quantreg¶
Quantile regression fitting and kernel standard errors.
quantreg() solves the standard LP formulation via the HiGHS interior-point solver.
quantreg_ker_se() computes kernel-based SEs matching R’s summary.rq(se="ker").
Accepts any narwhals-compatible DataFrame (pandas, polars, …).
quantreg¶
- quantreg(formula, data, tau=0.5)[source]¶
Fit a quantile regression model using a formulaic formula string.
Solves the standard LP:
min tau * 1'u + (1-tau) * 1'v s.t. X beta + u - v = y, u >= 0, v >= 0
using scipy.optimize.linprog with the HiGHS interior-point solver.
- Parameters:
formula (str) – Formula string, e.g. "y ~ x1 + x2".
data (Any) – DataFrame containing all variables. Any narwhals-compatible frame (pandas, polars, …) is accepted.
tau (float) – Quantile level in (0, 1). Default 0.5 (median regression).
- Return type:
QuantRegResult
- Returns:
QuantRegResult – Contains .params (named coefficients), .resid, .fittedvalues, and .tau. Call .ker_se() for kernel standard errors.
Examples
>>> qr = interlace.quantreg("y ~ x1 + x2", df, tau=0.75)
>>> qr.params
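The LP stated above can be sketched directly with scipy.optimize.linprog. This is a minimal illustration, not interlace's actual implementation: the helper name quantreg_lp is hypothetical, it takes an explicit design matrix instead of a formula, and method="highs-ipm" is assumed to be the solver option corresponding to "HiGHS interior-point".

```python
import numpy as np
from scipy.optimize import linprog


def quantreg_lp(X, y, tau=0.5):
    """Hypothetical helper: solve the quantile-regression LP for design matrix X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Decision vector is [beta (p entries, free), u (n, >= 0), v (n, >= 0)].
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    # Equality constraint: X beta + u - v = y.
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs-ipm")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


# Intercept-only median regression: the fitted constant is the sample median.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
X = np.ones((5, 1))
beta = quantreg_lp(X, y, tau=0.5)
```

With an odd number of distinct observations the median is the unique minimiser, so beta[0] should land at the middle value of y up to solver tolerance. The dense identity blocks in A_eq are fine for small n; a sparse matrix would be the natural choice for larger problems.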
QuantRegResult attributes¶
| Attribute | Type | Description |
|---|---|---|
| params | | Named coefficient estimates |
| resid | | Residuals |
| fittedvalues | | Fitted values |
| tau | | Quantile level used for fitting |
QuantRegResult methods¶
ker_se(hs=True) — Kernel standard errors for the coefficients.
Delegates to quantreg_ker_se() using the stored residuals and design matrix.
hs=True (default) uses the Hall-Sheather bandwidth; hs=False uses Bofinger.
Returns np.ndarray of shape (p,).
predict(data) — Predict on new data by re-evaluating the RHS formula.
Accepts any narwhals-compatible frame.
Returns np.ndarray of shape (n_new,).
Examples¶
import interlace
# Median regression (tau=0.5)
result = interlace.quantreg("score ~ age + education", data=df)
print(result.params)
print(result.tau) # 0.5
# 90th percentile regression
result_90 = interlace.quantreg("score ~ age + education", data=df, tau=0.9)
# Kernel standard errors
se = result.ker_se() # Hall-Sheather bandwidth
se_bof = result.ker_se(hs=False) # Bofinger bandwidth
# Prediction
import pandas as pd
new_df = pd.DataFrame({"age": [30, 40], "education": [16, 18]})
preds = result.predict(new_df)
quantreg_ker_se / ols_dfbetas_qr¶
Low-level utilities used internally by the influence diagnostics pipeline. Exposed publicly for users who need to compute kernel-based standard errors or QR-based DFBETAS on their own quantile regression fits.
- quantreg_ker_se(residuals, X, tau=0.5, hs=True)[source]¶
Quantile regression kernel SE matching R’s summary.rq(se="ker").
Port of R quantreg’s Hendricks-Koenker sandwich kernel estimator: uses a Gaussian kernel density evaluated at each residual, with bandwidth derived from the Hall-Sheather (or Bofinger) formula scaled to data units.
- Parameters:
residuals (array-like, shape (n,)) – QR residuals y − X @ beta_hat.
X (array-like, shape (n, p)) – Design matrix (including intercept column if present).
tau (float) – Quantile level (default 0.5).
hs (bool) – Use the Hall-Sheather bandwidth (True, default) or Bofinger (False).
- Return type:
ndarray
- Returns:
se (ndarray, shape (p,)) – Standard errors for each coefficient, matching R’s kernel SE.
- Raises:
ValueError – If the bandwidth is too large for the given sample size and tau.
Examples
>>> qr = interlace.quantreg("y ~ x", df, tau=0.5)
>>> se = interlace.quantreg_ker_se(qr.resid, qr._X, tau=0.5)
- ols_dfbetas_qr(model)[source]¶
Compute DFBETAS for an OLS model via QR decomposition (no Python loops).
Implements the exact closed-form formula using the Sherman-Morrison-Woodbury identity and thin QR decomposition, matching R’s influence.measures() convention (LOO sigma in the denominator).
For a design matrix X = QR (thin QR) with residuals e and MSE s²:
Hat diagonal: hᵢ = ‖Qᵢ‖²
LOO sigma²: s²ᵢ = (s²(n−p) − eᵢ²/(1−hᵢ)) / (n−p−1)
C = R⁻¹Qᵀ (p×n), the “influence matrix” (X’X)⁻¹Xᵀ
se_coef[j] = ‖row j of R⁻¹‖ = √(diag[(X’X)⁻¹]ⱼ)
DFBETAS[i,j] = C[j,i] · eᵢ / ((1−hᵢ) · sᵢ · se_coef[j])
- Parameters:
model (Any) – A fitted statsmodels RegressionResultsWrapper (OLS).
- Return type:
ndarray
- Returns:
np.ndarray of shape (n, p) – DFBETAS matrix, one row per observation, one column per parameter.
References
Belsley, Kuh & Welsch (1980). Regression Diagnostics. Wiley.
R’s stats::dfbetas.lm / stats::influence.measures.
Examples
>>> dfb = interlace.ols_dfbetas_qr(ols_result)
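The closed-form expressions listed above translate almost line for line into NumPy. The sketch below is an illustration under assumptions, not the library's code: dfbetas_qr is a hypothetical name, and it takes a raw design matrix and response instead of a fitted statsmodels result.

```python
import numpy as np


def dfbetas_qr(X, y):
    """Hypothetical helper: DFBETAS from a thin QR decomposition, no Python loops."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Q, R = np.linalg.qr(X)                  # thin QR: Q is (n, p), R is (p, p)
    beta = np.linalg.solve(R, Q.T @ y)      # OLS coefficients
    e = y - X @ beta                        # residuals
    h = np.sum(Q**2, axis=1)                # hat diagonal, h_i = ||Q_i||^2
    s2 = e @ e / (n - p)                    # MSE
    s2_loo = (s2 * (n - p) - e**2 / (1 - h)) / (n - p - 1)  # leave-one-out sigma^2
    R_inv = np.linalg.inv(R)
    C = R_inv @ Q.T                         # (p, n), equals (X'X)^-1 X'
    se_coef = np.linalg.norm(R_inv, axis=1) # sqrt(diag((X'X)^-1))
    scaled_resid = e / ((1 - h) * np.sqrt(s2_loo))
    return (C.T * scaled_resid[:, None]) / se_coef[None, :]


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=30)
dfb = dfbetas_qr(X, y)  # shape (30, 3)
```

Each row of dfb is the change in the coefficient vector from deleting that observation, scaled by the leave-one-out sigma times se_coef, so it should agree with refitting the model n times and differencing the coefficients.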
quantreg_ker_se replicates R’s quantreg::summary.rq(se="ker") using the Hendricks-Koenker Gaussian kernel sandwich estimator:
1. Compute data-scale bandwidth h_data = (Φ⁻¹(τ+h) − Φ⁻¹(τ−h)) × min(σ̂, IQR/1.34), where h is the Hall-Sheather or Bofinger quantile bandwidth.
2. Estimate per-observation density fᵢ = φ(rᵢ / h_data) / h_data.
3. Form sandwich covariance Cov(β̂) = τ(1−τ) × (X'diag(f)X)⁻¹ X'X (X'diag(f)X)⁻¹.
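The three steps can be sketched as follows. This is a hedged illustration, not the library's implementation: the names rq_bandwidth and kernel_se are hypothetical, the bandwidth formulas are assumed to follow R's quantreg::bandwidth.rq with alpha = 0.05, and the residuals in the usage lines come from a least-squares fit purely as a stand-in for quantile regression residuals.

```python
import numpy as np
from scipy.stats import norm


def rq_bandwidth(tau, n, hs=True, alpha=0.05):
    """Quantile-scale bandwidth h (assumed to mirror R's bandwidth.rq)."""
    x0 = norm.ppf(tau)
    f0 = norm.pdf(x0)
    if hs:  # Hall-Sheather
        return (n ** (-1 / 3) * norm.ppf(1 - alpha / 2) ** (2 / 3)
                * ((1.5 * f0**2) / (2 * x0**2 + 1)) ** (1 / 3))
    # Bofinger
    return n ** (-1 / 5) * ((4.5 * f0**4) / (2 * x0**2 + 1) ** 2) ** (1 / 5)


def kernel_se(residuals, X, tau=0.5, hs=True):
    """Hypothetical helper: kernel sandwich standard errors, shape (p,)."""
    r = np.asarray(residuals, dtype=float)
    X = np.asarray(X, dtype=float)
    h = rq_bandwidth(tau, X.shape[0], hs=hs)
    if tau + h > 1 or tau - h < 0:
        raise ValueError("bandwidth too large for this sample size and tau")
    # Step 1: rescale the bandwidth to data units.
    q75, q25 = np.percentile(r, [75, 25])
    scale = min(np.std(r, ddof=1), (q75 - q25) / 1.34)
    h_data = (norm.ppf(tau + h) - norm.ppf(tau - h)) * scale
    # Step 2: Gaussian kernel density evaluated at each residual.
    f = norm.pdf(r / h_data) / h_data
    # Step 3: sandwich covariance tau(1 - tau) * B X'X B, with B = (X' diag(f) X)^-1.
    bread = np.linalg.inv(X.T @ (f[:, None] * X))
    cov = tau * (1 - tau) * bread @ (X.T @ X) @ bread
    return np.sqrt(np.diag(cov))


# Synthetic data; least-squares residuals used only as a stand-in.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = X @ np.array([0.5, 1.5]) + rng.normal(size=200)
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
se = kernel_se(resid, X, tau=0.5)
```

The explicit check on tau ± h corresponds to the ValueError documented above: for very small samples or extreme tau the quantile bandwidth pushes τ ± h outside (0, 1), where Φ⁻¹ is undefined.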
ols_dfbetas_qr replicates the DFBETAS diagnostic from car::dfbetas().
Both are used internally by hlm_influence() and lmer_influence_measures().
See also¶
ols – OLS fitting with HC3 robust standard errors
Influence diagnostics – high-level influence diagnostics
Augment – combined augmented DataFrame with .cooksd and .mdffits