quantreg

Quantile regression fitting and kernel standard errors. quantreg() solves the standard LP formulation via the HiGHS interior-point solver. quantreg_ker_se() computes kernel-based SEs matching R’s summary.rq(se="ker"). Accepts any narwhals-compatible DataFrame (pandas, polars, …).

quantreg

quantreg(formula, data, tau=0.5)

Fit a quantile regression model using a formulaic formula string.

Solves the standard LP:

min  tau * 1'u + (1-tau) * 1'v
s.t. X beta + u - v = y,  u >= 0, v >= 0

using scipy.optimize.linprog with the HiGHS interior-point solver.
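The LP above can be sketched directly with scipy.optimize.linprog. This is a minimal illustration under stated assumptions, not the library's actual implementation: the helper name quantreg_lp and the stacking of the decision vector as [beta, u, v] are hypothetical choices made for this example.

```python
import numpy as np
from scipy.optimize import linprog


def quantreg_lp(X, y, tau=0.5):
    """Solve the quantile regression LP (hypothetical helper, for illustration)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Decision vector z = [beta (free), u (>= 0), v (>= 0)].
    # Objective: tau * 1'u + (1 - tau) * 1'v; beta carries zero cost.
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    # Equality constraint: X beta + u - v = y.
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs-ipm")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


# Synthetic data: y = 1 + 2 x + noise, fitted at the median.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)
X = np.column_stack([np.ones_like(x), x])
beta_hat = quantreg_lp(X, y, tau=0.5)
```

Here u and v hold the positive and negative parts of the residual y - X beta, so the objective equals the usual check-function loss summed over observations. The dense identity blocks make this sketch practical only for small n.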

Parameters:
  • formula (str) – Formula string, e.g. "y ~ x1 + x2".

  • data (Any) – DataFrame containing all variables. Any narwhals-compatible frame (pandas, polars, …) is accepted.

  • tau (float) – Quantile level in (0, 1). Default 0.5 (median regression).

Return type:

QuantRegResult

Returns:

QuantRegResult – Contains .params (named coefficients), .resid, .fittedvalues, and .tau. Call .ker_se() for kernel standard errors.

Examples

>>> qr = interlace.quantreg("y ~ x1 + x2", df, tau=0.75)
>>> qr.params

QuantRegResult attributes

Attribute      Type              Description
params         pd.Series         Named coefficient estimates
resid          np.ndarray (n,)   Residuals y − X β̂
fittedvalues   np.ndarray (n,)   Fitted values X β̂
tau            float             Quantile level used for fitting

QuantRegResult methods

ker_se(hs=True) — Kernel standard errors for the coefficients. Delegates to quantreg_ker_se() using the stored residuals and design matrix. hs=True (default) uses the Hall-Sheather bandwidth; hs=False uses Bofinger. Returns np.ndarray of shape (p,).

predict(data) — Predict on new data by re-evaluating the RHS formula. Accepts any narwhals-compatible frame. Returns np.ndarray of shape (n_new,).

Examples

import interlace
import pandas as pd

# df: any narwhals-compatible frame with columns score, age and education

# Median regression (tau=0.5)
result = interlace.quantreg("score ~ age + education", data=df)
print(result.params)
print(result.tau)   # 0.5

# 90th percentile regression
result_90 = interlace.quantreg("score ~ age + education", data=df, tau=0.9)

# Kernel standard errors
se = result.ker_se()              # Hall-Sheather bandwidth
se_bof = result.ker_se(hs=False)  # Bofinger bandwidth

# Prediction on new data
new_df = pd.DataFrame({"age": [30, 40], "education": [16, 18]})
preds = result.predict(new_df)

quantreg_ker_se / ols_dfbetas_qr

Low-level utilities used internally by the influence diagnostics pipeline. Exposed publicly for users who need to compute kernel-based standard errors or QR-based DFBETAS on their own quantile regression fits.

quantreg_ker_se(residuals, X, tau=0.5, hs=True)

Quantile regression kernel SE matching R’s summary.rq(se="ker").

Port of R quantreg’s Hendricks-Koenker sandwich kernel estimator: uses a Gaussian kernel density evaluated at each residual, with bandwidth derived from the Hall-Sheather (or Bofinger) formula scaled to data units.

Parameters:
  • residuals (array-like, shape (n,)) – QR residuals y - X @ beta_hat.

  • X (array-like, shape (n, p)) – Design matrix (including intercept column if present).

  • tau (float) – Quantile level. Default 0.5.

  • hs (bool) – If True (default), use the Hall-Sheather bandwidth; if False, use Bofinger.

Return type:

ndarray

Returns:

se (ndarray, shape (p,)) – Standard errors for each coefficient, matching R’s kernel SE.

Raises:

ValueError – If the bandwidth is too large for the given sample size and tau.

Examples

>>> qr = interlace.quantreg("y ~ x", df, tau=0.5)
>>> se = interlace.quantreg_ker_se(qr.resid, qr._X, tau=0.5)

ols_dfbetas_qr(model)

Compute DFBETAS for an OLS model via QR decomposition (no Python loops).

Implements the exact closed-form formula using the Sherman-Morrison-Woodbury identity and thin QR decomposition, matching R’s influence.measures() convention (LOO sigma in the denominator).

For a design matrix X = QR (thin QR) with residuals e and MSE s²:

  • Hat diagonal: hᵢ = ‖Qᵢ‖²

  • LOO sigma²: s²ᵢ = (s²(n−p) − eᵢ²/(1−hᵢ)) / (n−p−1)

  • C = R⁻¹Qᵀ (p×n), which equals the “influence matrix” (XᵀX)⁻¹Xᵀ

  • se_coef[j] = ‖row j of R⁻¹‖ = √([(XᵀX)⁻¹]ⱼⱼ)

  • DFBETAS[i,j] = C[j,i] · eᵢ / ((1−hᵢ) · sᵢ · se_coef[j])
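The formulas above translate almost line for line into NumPy. The following is a minimal sketch that works on a raw design matrix and response rather than on a fitted statsmodels result; the function name dfbetas_qr and its (X, y) signature are assumptions made for illustration and are not the library's API.

```python
import numpy as np


def dfbetas_qr(X, y):
    """DFBETAS via thin QR, following the closed-form formulas (illustrative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Q, R = np.linalg.qr(X)                 # thin QR: Q is (n, p), R is (p, p)
    beta = np.linalg.solve(R, Q.T @ y)     # OLS coefficients
    e = y - X @ beta                       # residuals
    h = np.sum(Q**2, axis=1)               # hat diagonal h_i = ||Q_i||^2
    s2 = e @ e / (n - p)                   # MSE
    s2_loo = (s2 * (n - p) - e**2 / (1 - h)) / (n - p - 1)  # LOO sigma^2
    R_inv = np.linalg.inv(R)
    C = R_inv @ Q.T                        # (p, n), equals (X'X)^-1 X'
    se_coef = np.linalg.norm(R_inv, axis=1)  # sqrt of diag of (X'X)^-1
    # DFBETAS[i, j] = C[j, i] * e_i / ((1 - h_i) * s_i * se_coef[j])
    numer = C.T * (e / (1 - h))[:, None]
    denom = np.sqrt(s2_loo)[:, None] * se_coef[None, :]
    return numer / denom


# Small synthetic example with an intercept and two regressors.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=30)
dfb = dfbetas_qr(X, y)   # shape (30, 3)
```

Each row of the result can be checked against a brute-force refit with that observation removed; the closed form avoids the n separate refits.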

Parameters:

model (Any) – A fitted statsmodels RegressionResultsWrapper (OLS).

Return type:

ndarray

Returns:

np.ndarray of shape (n, p) – DFBETAS matrix, one row per observation, one column per parameter.

References

Belsley, Kuh & Welsch (1980). Regression Diagnostics. Wiley.

R’s stats::dfbetas.lm / stats::influence.measures.

Examples

>>> dfb = interlace.ols_dfbetas_qr(ols_result)

quantreg_ker_se replicates R’s quantreg::summary.rq(se="ker") using the Hendricks-Koenker Gaussian kernel sandwich estimator:

  1. Compute data-scale bandwidth h_data = (Φ⁻¹(τ+h) − Φ⁻¹(τ−h)) × min(σ̂, IQR/1.34), where h is the Hall-Sheather or Bofinger quantile bandwidth.

  2. Estimate per-observation density fᵢ = φ(rᵢ / h_data) / h_data.

  3. Form sandwich covariance Cov(β̂) = τ(1−τ) × (X'diag(f)X)⁻¹ X'X (X'diag(f)X)⁻¹.
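The three steps can be sketched in NumPy as follows. This is an illustrative sketch, not the library's code. The names rq_bandwidth and kernel_se are hypothetical. The Hall-Sheather and Bofinger formulas are not given in the text above and are written here as recalled from R's bandwidth.rq (with alpha = 0.05), so treat them as assumptions. Taking σ̂ and IQR from the residuals, and raising ValueError when τ ± h leaves (0, 1), are likewise assumptions.

```python
from statistics import NormalDist

import numpy as np

_N = NormalDist()  # standard normal: pdf and inverse cdf from the stdlib


def rq_bandwidth(tau, n, hs=True, alpha=0.05):
    """Quantile-scale bandwidth h (assumed to follow R's bandwidth.rq)."""
    x0 = _N.inv_cdf(tau)
    f0 = _N.pdf(x0)
    if hs:  # Hall-Sheather
        z = _N.inv_cdf(1 - alpha / 2)
        return n ** (-1 / 3) * z ** (2 / 3) * ((1.5 * f0**2) / (2 * x0**2 + 1)) ** (1 / 3)
    # Bofinger
    return n ** (-0.2) * ((4.5 * f0**4) / (2 * x0**2 + 1) ** 2) ** 0.2


def kernel_se(residuals, X, tau=0.5, hs=True):
    """Kernel sandwich standard errors following steps 1 to 3 (illustrative)."""
    r = np.asarray(residuals, dtype=float)
    X = np.asarray(X, dtype=float)
    h = rq_bandwidth(tau, X.shape[0], hs=hs)
    if tau - h <= 0 or tau + h >= 1:
        raise ValueError("bandwidth too large for this sample size and tau")
    # Step 1: data-scale bandwidth.
    q75, q25 = np.percentile(r, [75, 25])
    scale = min(np.std(r, ddof=1), (q75 - q25) / 1.34)
    h_data = (_N.inv_cdf(tau + h) - _N.inv_cdf(tau - h)) * scale
    # Step 2: Gaussian kernel density evaluated at each residual.
    f = np.exp(-0.5 * (r / h_data) ** 2) / (np.sqrt(2 * np.pi) * h_data)
    # Step 3: sandwich covariance.
    D_inv = np.linalg.inv(X.T @ (f[:, None] * X))
    cov = tau * (1 - tau) * D_inv @ (X.T @ X) @ D_inv
    return np.sqrt(np.diag(cov))


# Synthetic residuals and design matrix, for demonstration only.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
r = rng.normal(size=500)
se = kernel_se(r, X, tau=0.5)                 # Hall-Sheather
se_bof = kernel_se(r, X, tau=0.5, hs=False)   # Bofinger
```

Inverting X'diag(f)X explicitly keeps the sketch close to the formula in step 3; a QR or Cholesky factorization of √f · X would be the more numerically careful choice.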

ols_dfbetas_qr replicates the DFBETAS diagnostic from R’s stats::dfbetas(). Both functions are used internally by hlm_influence() and lmer_influence_measures().

See also

  • ols — OLS fitting with HC3 robust standard errors

  • Influence diagnostics — high-level influence diagnostics

  • Augment — combined augmented DataFrame with .cooksd and .mdffits