Cox frailty quickstart

This page walks through fitting Cox proportional hazards models with Gaussian frailty (shared frailty models) using interlace.coxme(). These models are for time-to-event data where observations are clustered within groups — patients within hospitals, recurrent events within subjects, survival times within treatment centres.

For continuous outcomes, see the LMM quickstart. For binary or count outcomes, see the GLMM quickstart. For ordinal outcomes, see the CLMM quickstart.


When to use a Cox frailty model

Use coxme() when:

  • Your outcome is time-to-event (survival time, duration, time-to-failure) with possible right censoring

  • Observations are clustered within groups (centres, subjects, families) and you want to account for unobserved group-level heterogeneity

  • You would otherwise fit a standard Cox PH model (coxph) but need random effects for the grouping structure

The “frailty” is the survival analysis term for a random effect — it captures unobserved group-level factors that make some groups more or less prone to the event.


The model

A Cox frailty model extends the standard Cox PH model with a group-level random effect:

h(t | x_i, b_j) = h_0(t) * exp(x_i' beta + b_j)
  • h_0(t) is the unspecified baseline hazard (no distributional assumption)

  • beta are log hazard ratios for the fixed-effect covariates

  • b_j ~ N(0, sigma^2) are the group-level frailties

Groups with b_j > 0 have higher hazard (faster event occurrence); groups with b_j < 0 have lower hazard (slower event occurrence).


Basic example: multi-centre clinical trial

Patients are treated at different hospitals. We want to estimate the treatment effect while accounting for hospital-level variability in patient outcomes.

import interlace
import pandas as pd

result = interlace.coxme(
    formula="Surv(time, event) ~ treatment + age",
    data=df,
    groups="hospital",
)

The Surv(time, event) syntax mirrors R — time is the follow-up duration and event is an indicator (1 = event occurred, 0 = censored).

Inspect results

# Log hazard ratios
print(result.fe_params)
# treatment    -0.42
# age           0.03

# Standard errors and p-values
print(result.fe_bse)
print(result.fe_pvalues)

# 95% confidence intervals
print(result.fe_conf_int)
#              2.5 %  97.5 %
# treatment   -0.71   -0.13
# age          0.01    0.05

# Frailty variance (between-hospital heterogeneity)
print(result.variance_components)
# {'hospital': 0.18}

# Concordance (C-statistic)
print(f"C-index: {result.concordance:.3f}")

# Model fit
print(f"AIC: {result.aic:.1f}  BIC: {result.bic:.1f}")

Interpreting fixed effects

Coefficients are log hazard ratios. To get hazard ratios:

import numpy as np

print("Hazard ratios:")
for name in result.fe_params.index:
    beta = result.fe_params[name]
    ci = result.fe_conf_int.loc[name]
    print(f"  {name}: HR = {np.exp(beta):.3f} "
          f"[{np.exp(ci['2.5 %']):.3f}, {np.exp(ci['97.5 %']):.3f}]")
# treatment: HR = 0.657 [0.492, 0.878]  — treatment reduces hazard by 34%
# age:       HR = 1.030 [1.010, 1.051]  — each year increases hazard by 3%

An HR < 1 means the covariate is protective (reduces hazard); HR > 1 means it increases hazard.

Interpreting frailty

The frailty variance (result.variance_components) measures how much hospitals differ in their baseline event rates after adjusting for covariates:

  • Large sigma^2 — substantial unexplained hospital-level variation; patients at some hospitals do systematically better or worse

  • Small sigma^2 (near zero) — hospitals are similar after adjusting for patient characteristics; a standard Cox PH model may suffice

Individual hospital frailties (BLUPs) are available in result.random_effects:

print(result.random_effects["hospital"])
# hosp_01    0.32    — higher-than-average hazard
# hosp_02   -0.18    — lower-than-average hazard
# ...

Prediction

predict() supports two output types:

Linear predictor

# x'beta + z'b (in-sample)
lp = result.predict()

# On new data
lp_new = result.predict(newdata=df_new)

# Fixed effects only (no frailty — for new hospitals)
lp_marginal = result.predict(newdata=df_new, include_re=False)

Hazard ratios

# exp(x'beta + z'b)
hr = result.predict(type="risk")
hr_new = result.predict(newdata=df_new, type="risk")

An HR of 2.0 means that observation has twice the instantaneous hazard of the baseline.

Unseen groups

Unseen group levels at prediction time receive a frailty of zero (population-average prediction):

new_data = pd.DataFrame({
    "time": [12.0, 24.0],
    "event": [1, 0],
    "treatment": [1, 0],
    "age": [55, 62],
    "hospital": ["new_hosp_A", "new_hosp_B"],
})
hr = result.predict(newdata=new_data, type="risk")

Baseline hazard

The Breslow estimate of the cumulative baseline hazard is available as a DataFrame:

print(result.baseline_hazard.head())
#    time    hazard
# 0  0.5     0.012
# 1  1.0     0.031
# 2  1.5     0.055
# ...

Combined with the linear predictor, this gives the full survival curve for any observation.


Crossed frailty

Like fit() and glmer(), coxme() supports crossed random effects. If patients are treated at hospitals by different surgeons, both hospital and surgeon are independent sources of frailty:

result = interlace.coxme(
    "Surv(time, event) ~ treatment + age",
    data=df,
    groups=["hospital", "surgeon"],
)

print(result.variance_components)
# {'hospital': 0.18, 'surgeon': 0.07}

print(result.random_effects["hospital"])
print(result.random_effects["surgeon"])

Concordance (C-statistic)

The concordance index (C-statistic, Harrell’s C) measures the model’s discriminative ability — the probability that for a random pair of observations, the one with the higher predicted risk experiences the event first:

  • C = 0.5 — no better than chance

  • C = 0.7-0.8 — acceptable discrimination

  • C > 0.8 — strong discrimination

print(f"C-index: {result.concordance:.3f}")

Summary

summary() prints a human-readable overview:

print(result.summary())

Comparison with R

interlace

R (coxme)

interlace.coxme("Surv(t,e) ~ x", data, groups="g")

coxme(Surv(t,e) ~ x + (1|g), data)

result.fe_params

fixef(fit)

result.fe_bse

sqrt(diag(vcov(fit)))

result.random_effects["g"]

ranef(fit)$g

result.variance_components

VarCorr(fit)

result.concordance

concordance(fit)$concordance

result.predict(type="risk")

predict(fit, type="risk")

result.baseline_hazard

basehaz(fit)

Parity: fixed effects match R’s coxme::coxme() within 1e-3 (absolute), variance components within 10% (relative). Standard errors use the exact Breslow Hessian (not diagonal approximation), matching R to < 1%.


When to use coxme vs other models

Your data

Use

Time-to-event with grouped observations

coxme()

Time-to-event without grouping

Standard Cox PH (e.g. lifelines.CoxPHFitter)

Ordinal outcomes (ratings, severity)

clmm()

Binary outcomes (success/failure)

glmer(family="binomial")

Continuous outcomes

fit()


Next steps