Augment
Combine residuals, predictions, and influence diagnostics into a single tidy DataFrame
with one row per observation in the original data. This is useful for plotting and
downstream analysis. Works with both CrossedLMEResult and statsmodels.MixedLMResults.
- hlm_augment(model, level=1, include_influence=True)
Combine residuals and (optionally) influence diagnostics into one DataFrame.
- Parameters:
  - model (Any) – A CrossedLMEResult or statsmodels MixedLMResults object.
  - level (int) – Reserved for future multi-level support; currently only 1 is used.
  - include_influence (bool) – If True (default), append Cook’s D, MDFFITS, COVTRACE, COVRATIO, and RVC columns. Set to False to skip the expensive refit loop.
- Return type: Any
- Returns: Native DataFrame in the same type as the model’s input data.
Examples
>>> aug = interlace.hlm_augment(result)
>>> aug.columns  # original data + .resid + .fitted + cooksd + ...
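To give a sense of why a refit loop is costly, here is a minimal sketch of brute-force case-deletion Cook's distance for ordinary least squares in plain Python. It is an analogue only, not interlace's implementation: the helper names `fit_line` and `cooks_distance_by_refit` and the toy data are hypothetical, and a mixed model replaces each cheap line fit with a full model refit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cooks_distance_by_refit(xs, ys):
    """Cook's distance via an explicit leave-one-out loop (n + 1 fits in total)."""
    n, p = len(xs), 2  # p = number of coefficients (intercept and slope)
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    # Residual variance estimate from the full fit.
    s2 = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit without observation i, then predict for all n observations.
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        shift = sum((f - (a_i + b_i * x)) ** 2 for f, x in zip(fitted, xs))
        distances.append(shift / (p * s2))
    return distances


# Hypothetical toy data; the last point is a deliberate outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 12.0]
cooks = cooks_distance_by_refit(xs, ys)
```

The loop runs one refit per observation, so the cost grows with both the number of rows and the cost of a single fit. That is the trade-off behind setting `include_influence=False` when only residuals and fitted values are needed.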
Returned columns
| Column | Description |
|---|---|
| .fitted | Conditional fitted values (fixed + random effects) |
| .resid | Conditional residuals (observed − fitted) |
| .leverage | Total hat-matrix diagonal |
| .cooksd | Cook’s distance (case-deletion influence on fixed effects) |
| .mdffits | MDFFITS (scale-free Cook’s D) |
All original columns from the input DataFrame are preserved.
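As a rough illustration of how the first two columns relate to the data, the sketch below builds `.fitted` and `.resid` by hand for three rows. All numbers are hypothetical (they do not come from a fitted model), and the `effects` list with its `fixed` and `random` keys is an assumption made for this example only.

```python
# Hypothetical original data: one dict per observation.
data = [
    {"subject": "s1", "rt": 520.0},
    {"subject": "s1", "rt": 498.0},
    {"subject": "s2", "rt": 471.0},
]

# Hypothetical per-row model components: the fixed-effect prediction and the
# random-effect contribution of the row's group.
effects = [
    {"fixed": 500.0, "random": 15.0},
    {"fixed": 500.0, "random": 15.0},
    {"fixed": 500.0, "random": -20.0},
]

augmented = []
for row, eff in zip(data, effects):
    fitted = eff["fixed"] + eff["random"]  # conditional fitted value
    resid = row["rt"] - fitted             # observed minus fitted
    # Keep every original column and append the diagnostic columns.
    augmented.append({**row, ".fitted": fitted, ".resid": resid})
```

Each output row keeps its original keys and gains the two diagnostic columns, mirroring the one-row-per-observation layout described above.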
Example
import interlace
import matplotlib.pyplot as plt
# df is assumed to be a pandas DataFrame with columns subject, item, condition, rt
result = interlace.fit("rt ~ condition", data=df, groups=["subject", "item"])
aug = interlace.hlm_augment(result)
print(aug.columns.tolist())
# ['subject', 'item', 'condition', 'rt',
# '.fitted', '.resid', '.leverage', '.cooksd', '.mdffits']
# Residual vs fitted plot
aug.plot.scatter(x=".fitted", y=".resid", alpha=0.4)
plt.axhline(0, linestyle="--", color="grey")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.title("Residuals vs Fitted")
plt.show()
# Flag influential observations (4 / n is a common rule-of-thumb cutoff for Cook's distance)
influential = aug[aug[".cooksd"] > 4 / len(aug)]
print(f"{len(influential)} influential observations")
print(influential[["subject", "item", ".cooksd"]].sort_values(".cooksd", ascending=False))
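The filter above relies on pandas indexing. Since the returned frame matches the input type, the same 4 / n rule can also be sketched on a plain sequence of Cook's distances. The function name `flag_influential` and the values below are hypothetical, and the cutoff is a rule of thumb rather than a formal test.

```python
def flag_influential(cooks_d, cutoff=None):
    """Return (index, value) pairs above the cutoff, largest first.

    When no cutoff is given, use 4 / n, where n is the number of observations.
    """
    n = len(cooks_d)
    if cutoff is None:
        cutoff = 4 / n
    flagged = [(i, d) for i, d in enumerate(cooks_d) if d > cutoff]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical Cook's distances for eight observations (cutoff = 4 / 8 = 0.5).
cooks_d = [0.02, 0.61, 0.10, 0.05, 1.30, 0.01, 0.49, 0.03]
flagged = flag_influential(cooks_d)
```

Passing an explicit `cutoff` makes it easy to compare the default against a stricter or looser threshold on the same values.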
See also
Residuals — compute residuals only
Leverage — compute leverage only
Influence diagnostics — full influence diagnostics with additional metrics