Quickstart¶
Build a dataset¶
interlace works with any pandas DataFrame. For this example, we generate synthetic exam scores where students are nested within schools — a classic crossed random effects structure.
import numpy as np
import pandas as pd
rng = np.random.default_rng(42)
n = 300
# Crossed structure: 30 students × 10 schools
student_ids = [f"s{i}" for i in rng.integers(1, 31, n)]
school_ids = [f"sch{i}" for i in rng.integers(1, 11, n)]
df = pd.DataFrame({
"score": 60 + rng.normal(0, 8, n),
"hours_studied": rng.uniform(0, 10, n),
"prior_gpa": rng.uniform(2.0, 4.0, n),
"student_id": student_ids,
"school_id": school_ids,
})
Fit the model¶
Pass a standard fixed-effects formula and the grouping column names:
from interlace import fit
result = fit(
formula="score ~ hours_studied + prior_gpa",
data=df,
groups=["student_id", "school_id"], # crossed random intercepts
)
groups accepts a single string (one random intercept) or a list (crossed intercepts).
The first entry is the primary grouping factor.
Inspect results¶
# Fixed-effect coefficients
print(result.fe_params)
# Intercept 55.12
# hours_studied 0.81
# prior_gpa 1.23
# Standard errors and p-values
print(result.fe_bse)
print(result.fe_pvalues)
# Variance components per grouping factor
print(result.variance_components)
# {'student_id': 12.4, 'school_id': 5.8}
# Residual variance σ²
print(result.scale)
# Model fit
print(result.aic, result.bic)
Access random effects¶
# BLUPs (Best Linear Unbiased Predictors) for each factor
student_blups = result.random_effects["student_id"]
school_blups = result.random_effects["school_id"]
print(student_blups.head())
# s1 1.23
# s2 -0.87
# ...
Predict on new data¶
df_new = pd.DataFrame({
"hours_studied": [5.0, 8.0],
"prior_gpa": [3.2, 3.8],
"student_id": ["s1", "s_new"], # s_new is unseen → shrinks to 0
"school_id": ["sch1", "sch2"],
})
preds = result.predict(newdata=df_new)
print(preds)
Unseen group levels automatically shrink to the population mean (BLUP = 0).
groups vs random: choosing the right parameter¶
groups is shorthand for random intercepts only — the common case. Pass a string
or list of column names and interlace adds one random intercept per factor:
# Equivalent to lme4: rt ~ x + (1|subject) + (1|item)
result = fit("score ~ hours_studied", data=df, groups=["student_id", "school_id"])
random accepts lme4-style Wilkinson notation and is required when you need
random slopes (each group gets its own slope for a predictor) or when you want
to mix intercept-only and slope terms across factors:
# Equivalent to lme4: score ~ x + (1+x|student_id) + (1|school_id)
result = fit(
"score ~ hours_studied",
data=df,
random=["(1 + hours_studied | student_id)", "(1 | school_id)"],
)
Quick rule: start with groups=. Switch to random= if you have
subject-by-predictor interactions, or if lme4 model comparison suggests the
slope variance is non-negligible. See the Random Slopes Guide
for a full walkthrough.
Random slopes¶
result = fit(
formula="score ~ hours_studied + prior_gpa",
data=df,
random=["(1 + hours_studied | student_id)", "(1 | school_id)"],
)
# BLUPs are now a DataFrame, one column per term
print(result.random_effects["student_id"])
# hours_studied
# s1 0.42
# s2 -0.31
# ...
Nested designs¶
Use the lme4 / nesting shorthand in the random parameter.
(1|batch/cask) expands to (1|batch) + (1|batch:cask), matching lme4 exactly:
result = interlace.fit(
"strength ~ 1",
data=df,
random=["(1|batch/cask)"],
)
# random_effects has one entry per expanded term
batch_blups = result.random_effects["batch"]
batch_cask_blups = result.random_effects["batch:cask"]
# variance_components likewise
print(result.variance_components["batch"])
print(result.variance_components["batch:cask"])
Depth-3 nesting ((1|a/b/c)) is also supported and expands to three terms:
(1|a), (1|a:b), (1|a:b:c).
Bootstrap standard error¶
CrossedLMEResult.bootstrap_se() computes a cluster-bootstrap SE for the
median of the response — useful for EU pay-gap reporting:
se = result.bootstrap_se(statistic="median", n_bootstrap=1000, seed=42)
print(f"Median SE (cluster bootstrap): {se:.4f}")
Non-normal outcomes¶
For binary, proportion, or count outcomes, use interlace.glmer() instead of
fit(). See the GLMM quickstart for a full walkthrough.
For ordinal outcomes, use interlace.clmm() — see the clmm reference.
For survival data with frailty, use interlace.coxme() — see the coxme reference.
Next steps¶
See GLMM quickstart for binomial, Poisson, and other GLMMs
See clmm for ordinal regression with random effects
See coxme for Cox frailty models
See Examples for a full walkthrough of diagnostics and plots
See Random Slopes Guide for a detailed guide on random slopes syntax and interpretation
See Model Comparison for comparing models with likelihood ratio tests
See Correlation structures for AR(1) and compound symmetry correlation structures
See the fit reference for all
fit()parametersSee CrossedLMEResult for the complete list of
CrossedLMEResultattributesSee Installation for optional extras (CHOLMOD, BOBYQA)