# Cox frailty quickstart This page walks through fitting Cox proportional hazards models with Gaussian frailty (shared frailty models) using `interlace.coxme()`. These models are for **time-to-event data** where observations are clustered within groups — patients within hospitals, recurrent events within subjects, survival times within treatment centres. For continuous outcomes, see the [LMM quickstart](quickstart.md). For binary or count outcomes, see the [GLMM quickstart](glmm-quickstart.md). For ordinal outcomes, see the [CLMM quickstart](clmm-quickstart.md). --- ## When to use a Cox frailty model Use `coxme()` when: - Your outcome is **time-to-event** (survival time, duration, time-to-failure) with possible **right censoring** - Observations are **clustered** within groups (centres, subjects, families) and you want to account for unobserved group-level heterogeneity - You would otherwise fit a standard Cox PH model (`coxph`) but need random effects for the grouping structure The "frailty" is the survival analysis term for a random effect — it captures unobserved group-level factors that make some groups more or less prone to the event. --- ## The model A Cox frailty model extends the standard Cox PH model with a group-level random effect: ``` h(t | x_i, b_j) = h_0(t) * exp(x_i' beta + b_j) ``` - **h_0(t)** is the unspecified baseline hazard (no distributional assumption) - **beta** are log hazard ratios for the fixed-effect covariates - **b_j ~ N(0, sigma^2)** are the group-level frailties Groups with b_j > 0 have higher hazard (faster event occurrence); groups with b_j < 0 have lower hazard (slower event occurrence). --- ## Basic example: multi-centre clinical trial Patients are treated at different hospitals. We want to estimate the treatment effect while accounting for hospital-level variability in patient outcomes. ```python import interlace import pandas as pd result = interlace.coxme( formula="Surv(time, event) ~ treatment + age", data=df, groups="hospital", ) ``` The `Surv(time, event)` syntax mirrors R — `time` is the follow-up duration and `event` is an indicator (1 = event occurred, 0 = censored). ### Inspect results ```python # Log hazard ratios print(result.fe_params) # treatment -0.42 # age 0.03 # Standard errors and p-values print(result.fe_bse) print(result.fe_pvalues) # 95% confidence intervals print(result.fe_conf_int) # 2.5 % 97.5 % # treatment -0.71 -0.13 # age 0.01 0.05 # Frailty variance (between-hospital heterogeneity) print(result.variance_components) # {'hospital': 0.18} # Concordance (C-statistic) print(f"C-index: {result.concordance:.3f}") # Model fit print(f"AIC: {result.aic:.1f} BIC: {result.bic:.1f}") ``` ### Interpreting fixed effects Coefficients are **log hazard ratios**. To get hazard ratios: ```python import numpy as np print("Hazard ratios:") for name in result.fe_params.index: beta = result.fe_params[name] ci = result.fe_conf_int.loc[name] print(f" {name}: HR = {np.exp(beta):.3f} " f"[{np.exp(ci['2.5 %']):.3f}, {np.exp(ci['97.5 %']):.3f}]") # treatment: HR = 0.657 [0.492, 0.878] — treatment reduces hazard by 34% # age: HR = 1.030 [1.010, 1.051] — each year increases hazard by 3% ``` An HR < 1 means the covariate is protective (reduces hazard); HR > 1 means it increases hazard. ### Interpreting frailty The frailty variance (`result.variance_components`) measures how much hospitals differ in their baseline event rates after adjusting for covariates: - **Large sigma^2** — substantial unexplained hospital-level variation; patients at some hospitals do systematically better or worse - **Small sigma^2 (near zero)** — hospitals are similar after adjusting for patient characteristics; a standard Cox PH model may suffice Individual hospital frailties (BLUPs) are available in `result.random_effects`: ```python print(result.random_effects["hospital"]) # hosp_01 0.32 — higher-than-average hazard # hosp_02 -0.18 — lower-than-average hazard # ... ``` --- ## Prediction `predict()` supports two output types: ### Linear predictor ```python # x'beta + z'b (in-sample) lp = result.predict() # On new data lp_new = result.predict(newdata=df_new) # Fixed effects only (no frailty — for new hospitals) lp_marginal = result.predict(newdata=df_new, include_re=False) ``` ### Hazard ratios ```python # exp(x'beta + z'b) hr = result.predict(type="risk") hr_new = result.predict(newdata=df_new, type="risk") ``` An HR of 2.0 means that observation has twice the instantaneous hazard of the baseline. ### Unseen groups Unseen group levels at prediction time receive a frailty of zero (population-average prediction): ```python new_data = pd.DataFrame({ "time": [12.0, 24.0], "event": [1, 0], "treatment": [1, 0], "age": [55, 62], "hospital": ["new_hosp_A", "new_hosp_B"], }) hr = result.predict(newdata=new_data, type="risk") ``` --- ## Baseline hazard The Breslow estimate of the cumulative baseline hazard is available as a DataFrame: ```python print(result.baseline_hazard.head()) # time hazard # 0 0.5 0.012 # 1 1.0 0.031 # 2 1.5 0.055 # ... ``` Combined with the linear predictor, this gives the full survival curve for any observation. --- ## Crossed frailty Like `fit()` and `glmer()`, `coxme()` supports crossed random effects. If patients are treated at hospitals by different surgeons, both hospital and surgeon are independent sources of frailty: ```python result = interlace.coxme( "Surv(time, event) ~ treatment + age", data=df, groups=["hospital", "surgeon"], ) print(result.variance_components) # {'hospital': 0.18, 'surgeon': 0.07} print(result.random_effects["hospital"]) print(result.random_effects["surgeon"]) ``` --- ## Concordance (C-statistic) The concordance index (C-statistic, Harrell's C) measures the model's discriminative ability — the probability that for a random pair of observations, the one with the higher predicted risk experiences the event first: - **C = 0.5** — no better than chance - **C = 0.7-0.8** — acceptable discrimination - **C > 0.8** — strong discrimination ```python print(f"C-index: {result.concordance:.3f}") ``` --- ## Summary `summary()` prints a human-readable overview: ```python print(result.summary()) ``` --- ## Comparison with R | interlace | R (coxme) | |-----------|-----------| | `interlace.coxme("Surv(t,e) ~ x", data, groups="g")` | `coxme(Surv(t,e) ~ x + (1\|g), data)` | | `result.fe_params` | `fixef(fit)` | | `result.fe_bse` | `sqrt(diag(vcov(fit)))` | | `result.random_effects["g"]` | `ranef(fit)$g` | | `result.variance_components` | `VarCorr(fit)` | | `result.concordance` | `concordance(fit)$concordance` | | `result.predict(type="risk")` | `predict(fit, type="risk")` | | `result.baseline_hazard` | `basehaz(fit)` | **Parity:** fixed effects match R's `coxme::coxme()` within 1e-3 (absolute), variance components within 10% (relative). Standard errors use the exact Breslow Hessian (not diagonal approximation), matching R to < 1%. --- ## When to use coxme vs other models | Your data | Use | |---|---| | Time-to-event with grouped observations | `coxme()` | | Time-to-event without grouping | Standard Cox PH (e.g. `lifelines.CoxPHFitter`) | | Ordinal outcomes (ratings, severity) | `clmm()` | | Binary outcomes (success/failure) | `glmer(family="binomial")` | | Continuous outcomes | `fit()` | --- ## Next steps - See {doc}`api/coxme` for the full API reference and all parameters - See {doc}`clmm-quickstart` for ordinal regression with random effects - See {doc}`glmm-quickstart` for binomial, Poisson, and other GLMMs - See {doc}`quickstart` for linear mixed models - See {doc}`comparison` for a feature comparison across libraries