Quantile regression generalizes the concept of a univariate quantile to a conditional quantile given one or more covariates.
Recall that a student’s score on a test is at the $\tau$ quantile if his or her score is better than that of $100\tau\%$ of the students who took the test. The score is also said to be at the $100\tau$th percentile.
For a random variable $Y$ with probability distribution function
$$F(y) = \mathrm{Prob}(Y \le y)$$
the $\tau$ quantile of $Y$ is defined as the inverse function
$$Q(\tau) = \inf\{y : F(y) \ge \tau\}$$
where $0 < \tau < 1$. In particular, the median is $Q(1/2)$.
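The inverse-function definition of a quantile, $Q(\tau) = \inf\{y : F(y) \ge \tau\}$, can be illustrated with the empirical distribution function of a small sample. The following is a minimal sketch in Python, not part of the QUANTREG procedure; the function name `empirical_quantile` and the example scores are hypothetical.

```python
def empirical_quantile(sample, tau):
    """Return inf{y : F_n(y) >= tau}, where F_n is the empirical CDF of `sample`."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    # At least k of the n observations are <= ordered[k - 1], so the first k
    # with k / n >= tau identifies the smallest y whose CDF value reaches tau.
    for k, y in enumerate(ordered, start=1):
        if k / n >= tau:
            return y


# Hypothetical test scores, used only for illustration.
scores = [70, 55, 91, 62, 78]
median_score = empirical_quantile(scores, 0.5)
top_decile_score = empirical_quantile(scores, 0.9)
```

With five scores, the empirical distribution function jumps by 1/5 at each observation, so the median is the third-smallest score and the 0.9 quantile is the largest.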
For a random sample $\{y_1, \ldots, y_n\}$ of $Y$, it is well known that the sample median minimizes the sum of absolute deviations
$$\mathrm{median} = \arg\min_{\xi \in \mathbb{R}} \sum_{i=1}^{n} |y_i - \xi|$$
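Likewise, the general $\tau$ sample quantile $\xi(\tau)$, which is the analog of $Q(\tau)$, is formulated as the minimizer
$$\xi(\tau) = \arg\min_{\xi \in \mathbb{R}} \sum_{i=1}^{n} \rho_\tau(y_i - \xi)$$
where $\rho_\tau(z) = z\,(\tau - I(z < 0))$, $0 < \tau < 1$, and where $I(\cdot)$ denotes the indicator function. The loss function $\rho_\tau$ assigns a weight of $\tau$ to positive residuals $y_i - \xi$ and a weight of $1 - \tau$ to negative residuals.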
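The asymmetric weighting can be checked numerically. The sketch below, in Python, implements the check loss $\rho_\tau(z) = z\,(\tau - I(z < 0))$ and finds a sample quantile by minimizing the summed loss. It assumes that searching only over the observed values is enough, which holds because the objective is piecewise linear and convex with kinks at the data points. The names `check_loss` and `sample_quantile` are hypothetical.

```python
def check_loss(z, tau):
    """rho_tau(z) = z * (tau - I(z < 0)): weight tau if z >= 0, weight 1 - tau if z < 0."""
    return z * (tau - (1.0 if z < 0 else 0.0))


def sample_quantile(sample, tau):
    """Return a data point xi that minimizes sum_i rho_tau(y_i - xi)."""
    def total_loss(xi):
        return sum(check_loss(y - xi, tau) for y in sample)

    # Assumption: a minimizer exists among the observations, so no
    # continuous optimizer is needed for this illustration.
    return min(sorted(sample), key=total_loss)


# Hypothetical data, used only for illustration.
scores = [70, 55, 91, 62, 78]
median_score = sample_quantile(scores, 0.5)
upper_score = sample_quantile(scores, 0.9)
```

For $\tau = 0.5$ both weights equal 1/2, so the minimizer is the sample median. For $\tau = 0.9$, positive residuals cost nine times as much as negative ones, which pushes the minimizer toward the upper end of the sample.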
Using this loss function, the linear conditional quantile function extends the $\tau$ sample quantile $\xi(\tau)$ to the regression setting in the same way that the linear conditional mean function extends the sample mean. Recall that OLS regression estimates the linear conditional mean function $E(Y \mid X = x) = x'\beta$ by solving for
$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} (y_i - x_i'\beta)^2$$
The estimated parameter $\hat{\beta}$ minimizes the sum of squared residuals in the same way that the sample mean $\hat{\mu}$ minimizes the sum of squares:
$$\hat{\mu} = \arg\min_{\mu \in \mathbb{R}} \sum_{i=1}^{n} (y_i - \mu)^2$$
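Likewise, quantile regression estimates the linear conditional quantile function, $Q_Y(\tau \mid X = x) = x'\beta(\tau)$, by solving
$$\hat{\beta}(\tau) = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho_\tau(y_i - x_i'\beta)$$
for $\tau \in (0, 1)$. The quantity $\hat{\beta}(\tau)$ is called the $\tau$ regression quantile. The case $\tau = 0.5$, which minimizes the sum of absolute residuals, corresponds to median regression, which is also known as $L_1$ regression.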
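To make the minimization concrete, the following Python sketch fits a regression quantile for a single covariate plus an intercept. It is an illustration only and does not reproduce the algorithms used by the QUANTREG procedure. It relies on the linear-programming property that, with two parameters, some optimal line passes through at least two observations, so it enumerates every pair of points with distinct covariate values. This brute-force search is practical only for very small samples; the function names and data are hypothetical.

```python
from itertools import combinations


def check_loss(z, tau):
    """rho_tau(z) = z * (tau - I(z < 0))."""
    return z * (tau - (1.0 if z < 0 else 0.0))


def simple_quantile_regression(xs, ys, tau):
    """Return (intercept, slope) minimizing sum_i rho_tau(y_i - a - b * x_i).

    Brute force over candidate lines through pairs of observations.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(
            check_loss(y - (intercept + slope * x), tau)
            for x, y in zip(xs, ys)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two observations with distinct x values")
    return best[1], best[2]


# Hypothetical data: four points on the line y = 2x and one large outlier.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 30]
intercept, slope = simple_quantile_regression(xs, ys, 0.5)
```

In this example the median regression line passes through the four collinear points and is not pulled toward the outlier, because absolute residuals penalize a single large deviation less severely than squared residuals do.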
The set of regression quantiles
$$\{\beta(\tau) : \tau \in (0, 1)\}$$
is referred to as the quantile process.
The QUANTREG procedure computes the quantile function $Q_Y(\tau \mid X = x)$ and conducts statistical inference on the estimated parameters $\hat{\beta}(\tau)$.