Let Y be the variable of interest in a complex survey. Denote $F(t) = \Pr(Y \le t)$ as the cumulative distribution function of Y. For $0 < p < 1$, the pth quantile of the population cumulative distribution function is
\[
Y_p = \inf \{\, y : F(y) \ge p \,\}
\]
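Let $\{y_{hij}, w_{hij}\}$ be the observed values for variable Y that are associated with sampling weights, where $(h, i, j)$ are the stratum index, cluster index, and member index, respectively, as shown in the section Definitions and Notation. Let $y_{(1)} < y_{(2)} < \cdots < y_{(d)}$ denote the sample order statistics for variable Y.
An estimate of quantile $Y_p$ is
\[
\hat{Y}_p =
\begin{cases}
y_{(1)} & \text{if } p < \hat{F}(y_{(1)}) \\[4pt]
y_{(k)} + \dfrac{p - \hat{F}(y_{(k)})}{\hat{F}(y_{(k+1)}) - \hat{F}(y_{(k)})} \left( y_{(k+1)} - y_{(k)} \right) & \text{if } \hat{F}(y_{(k)}) \le p < \hat{F}(y_{(k+1)}) \\[4pt]
y_{(d)} & \text{if } p = 1
\end{cases}
\]
where $\hat{F}(t)$ is the estimated cumulative distribution for Y,
\[
\hat{F}(t) = \frac{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I(y_{hij} \le t)}{\sum_{h} \sum_{i} \sum_{j} w_{hij}}
\]
and $I(\cdot)$ is the indicator function.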
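The estimator described above (a weighted empirical distribution function, then linear interpolation between adjacent order statistics) can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure's implementation: the function names `weighted_cdf_points` and `estimate_quantile` are hypothetical, and tied values are assumed to be collapsed into one order statistic whose weight is the sum of the tied weights.

```python
from itertools import groupby


def weighted_cdf_points(values, weights):
    """Return the distinct sorted values and the weighted CDF at each of them."""
    total = sum(weights)
    pairs = sorted(zip(values, weights))
    ys, cdf, running = [], [], 0.0
    for y, group in groupby(pairs, key=lambda pair: pair[0]):
        running += sum(w for _, w in group)  # pool the weights of tied values
        ys.append(y)
        cdf.append(running / total)
    return ys, cdf


def estimate_quantile(values, weights, p):
    """Interpolated quantile estimate for 0 < p <= 1."""
    ys, cdf = weighted_cdf_points(values, weights)
    if p < cdf[0]:
        return ys[0]          # below the first jump of the estimated CDF
    if p >= 1.0:
        return ys[-1]         # p = 1 maps to the largest order statistic
    for k in range(len(ys) - 1):
        if cdf[k] <= p < cdf[k + 1]:
            frac = (p - cdf[k]) / (cdf[k + 1] - cdf[k])
            return ys[k] + frac * (ys[k + 1] - ys[k])
    return ys[-1]             # guards against rounding in the last CDF value
```

With four equally weighted observations 1, 2, 3, 4, the estimated distribution function takes the values 0.25, 0.5, 0.75, 1 at the order statistics, so `estimate_quantile([1, 2, 3, 4], [1, 1, 1, 1], 0.625)` interpolates halfway between 2 and 3.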
When you specify VARMETHOD=TAYLOR, or by default if you do not specify the VARMETHOD= option, PROC SURVEYMEANS uses Woodruff’s method (Dorfman and Valliant, 1993; Särndal, Swensson, and Wretman, 1992; Francisco and Fuller, 1991) to estimate the variances of quantiles. This method first constructs a confidence interval on a quantile. Then it uses the width of the confidence interval to estimate the standard error of a quantile.
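In order to estimate the variance of $\hat{Y}_p$, PROC SURVEYMEANS first estimates the variance of the estimated distribution function $\hat{F}(\hat{Y}_p)$ by
\[
\hat{V}\bigl(\hat{F}(\hat{Y}_p)\bigr) = \sum_{h=1}^{H} \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( d_{hi\cdot} - \bar{d}_{h\cdot\cdot} \right)^2
\]
where
\[
d_{hi\cdot} = \frac{1}{w_{\cdot\cdot\cdot}} \sum_{j=1}^{m_{hi}} w_{hij} \left( I(y_{hij} \le \hat{Y}_p) - \hat{F}(\hat{Y}_p) \right), \qquad
\bar{d}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} d_{hi\cdot}, \qquad
w_{\cdot\cdot\cdot} = \sum_{h} \sum_{i} \sum_{j} w_{hij}
\]
Then $100(1-\alpha)\%$ confidence limits for $\hat{F}(\hat{Y}_p)$ can be constructed by
\[
(\hat{p}_L,\ \hat{p}_U) = \left( \hat{F}(\hat{Y}_p) - t_{df,\,\alpha/2} \sqrt{\hat{V}\bigl(\hat{F}(\hat{Y}_p)\bigr)},\ \ \hat{F}(\hat{Y}_p) + t_{df,\,\alpha/2} \sqrt{\hat{V}\bigl(\hat{F}(\hat{Y}_p)\bigr)} \right)
\]
where $t_{df,\,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t distribution with df degrees of freedom, described in the section Degrees of Freedom.
When $(\hat{p}_L, \hat{p}_U)$ is out of the range of [0,1], the procedure does not compute the standard error of $\hat{Y}_p$.
The $\hat{p}_L$th quantile is defined as
\[
\hat{Y}_{\hat{p}_L} =
\begin{cases}
y_{(1)} & \text{if } \hat{p}_L < \hat{F}(y_{(1)}) \\[4pt]
y_{(k_L)} + \dfrac{\hat{p}_L - \hat{F}(y_{(k_L)})}{\hat{F}(y_{(k_L+1)}) - \hat{F}(y_{(k_L)})} \left( y_{(k_L+1)} - y_{(k_L)} \right) & \text{if } \hat{F}(y_{(k_L)}) \le \hat{p}_L < \hat{F}(y_{(k_L+1)}) \\[4pt]
y_{(d)} & \text{if } \hat{p}_L = 1
\end{cases}
\]
and the $\hat{p}_U$th quantile is defined as
\[
\hat{Y}_{\hat{p}_U} =
\begin{cases}
y_{(1)} & \text{if } \hat{p}_U < \hat{F}(y_{(1)}) \\[4pt]
y_{(k_U)} + \dfrac{\hat{p}_U - \hat{F}(y_{(k_U)})}{\hat{F}(y_{(k_U+1)}) - \hat{F}(y_{(k_U)})} \left( y_{(k_U+1)} - y_{(k_U)} \right) & \text{if } \hat{F}(y_{(k_U)}) \le \hat{p}_U < \hat{F}(y_{(k_U+1)}) \\[4pt]
y_{(d)} & \text{if } \hat{p}_U = 1
\end{cases}
\]
The standard error of $\hat{Y}_p$ is then estimated by
\[
\operatorname{se}(\hat{Y}_p) = \frac{\hat{Y}_{\hat{p}_U} - \hat{Y}_{\hat{p}_L}}{2\, t_{df,\,\alpha/2}}
\]
where $t_{df,\,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t distribution with df degrees of freedom.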
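As a rough illustration of the first Woodruff step, the sketch below computes a Taylor-series variance for the estimated distribution function at a given point. It sums, within each stratum, the squared deviations of the cluster-level linearized totals, and then forms the limits on the probability scale. The record layout `(stratum, cluster, y, weight)`, the function names, the optional `sampling_rates` mapping, and the choice to skip single-cluster strata are all assumptions made for this example. The t percentile `t_crit` is passed in by the caller rather than computed.

```python
from collections import defaultdict
from math import sqrt


def cdf_variance(records, y_p, sampling_rates=None):
    """Return (F_hat(y_p), estimated variance of F_hat(y_p)).

    records: iterable of (stratum, cluster, y, weight) tuples.
    sampling_rates: optional dict mapping stratum -> f_h (assumed 0 if absent).
    """
    records = list(records)
    rates = sampling_rates or {}
    w_total = sum(w for _, _, _, w in records)
    f_hat = sum(w for _, _, y, w in records if y <= y_p) / w_total

    # Linearized cluster totals: sum over members of w * (I(y <= y_p) - F_hat) / w_total.
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for h, i, y, w in records:
        indicator = 1.0 if y <= y_p else 0.0
        cluster_totals[h][i] += w * (indicator - f_hat) / w_total

    variance = 0.0
    for h, clusters in cluster_totals.items():
        n_h = len(clusters)
        if n_h < 2:
            continue  # assumption: a single-cluster stratum contributes nothing here
        mean_d = sum(clusters.values()) / n_h
        ss = sum((d - mean_d) ** 2 for d in clusters.values())
        variance += n_h * (1.0 - rates.get(h, 0.0)) / (n_h - 1) * ss
    return f_hat, variance


def probability_limits(f_hat, variance, t_crit):
    """Limits (p_L, p_U) on the probability scale for a given t percentile."""
    half_width = t_crit * sqrt(variance)
    return f_hat - half_width, f_hat + half_width
```

For a single stratum of four one-member clusters with equal weights and values 1, 2, 3, 4, evaluating at 2 gives an estimated distribution function of 0.5 and a variance of 0.25/3, which matches the familiar p(1 − p)/(n − 1) for a simple random sample without a finite population correction.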
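The second Woodruff step maps the probability limits back to the scale of Y and converts the width of that interval into a standard error. A minimal sketch follows, assuming the caller supplies a quantile function `quantile_fn` (any callable that returns the estimated quantile for a given probability), the two probability limits, and the t percentile `t_crit`. The function name and the dictionary it returns are hypothetical.

```python
def woodruff_standard_error(quantile_fn, p_lower, p_upper, t_crit):
    """Return the quantiles at the probability limits and the implied standard error.

    Returns None when the probability limits fall outside [0, 1],
    in which case no standard error is computed.
    """
    if p_lower < 0.0 or p_upper > 1.0:
        return None
    y_lower = quantile_fn(p_lower)
    y_upper = quantile_fn(p_upper)
    # Half the interval width on the Y scale, divided by the t percentile.
    se = (y_upper - y_lower) / (2.0 * t_crit)
    return {"y_lower": y_lower, "y_upper": y_upper, "se": se}
```

For example, with a toy quantile function `lambda p: 10 * p * p`, probability limits 0.4 and 0.6, and `t_crit = 2.0`, the interval on the Y scale is roughly (1.6, 3.6) and the standard error is roughly 0.5.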
When you use the replication method, PROC SURVEYMEANS uses the usual variance estimates for a quantile as described in the section Replication Methods for Variance Estimation. However, you should proceed cautiously, because this variance estimator can have poor properties (Dorfman and Valliant, 1993).
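Symmetric $100(1-\alpha)\%$ confidence limits are computed as
\[
\left( \hat{Y}_p - \operatorname{se}(\hat{Y}_p)\, t_{df,\,\alpha/2},\ \ \hat{Y}_p + \operatorname{se}(\hat{Y}_p)\, t_{df,\,\alpha/2} \right)
\]
If you specify the NONSYMCL option in the PROC SURVEYMEANS statement when you use the VARMETHOD=TAYLOR option, the procedure computes $100(1-\alpha)\%$ nonsymmetric confidence limits:
\[
\left( \hat{Y}_{\hat{p}_L},\ \hat{Y}_{\hat{p}_U} \right)
\]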
When you specify a POSTSTRATA statement, the quantile estimation and its variance estimation incorporate poststratification. For more information about poststratification, see the section Poststratification.
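For a selected sample, let $r = 1, 2, \ldots, R$ be the poststratum index; let $Z_1, Z_2, \ldots, Z_R$ be the population totals for each corresponding poststratum, and let $I_r$ be the indicator variable for the poststratum r that is defined by
\[
I_r(h, i, j) =
\begin{cases}
1 & \text{if observation } (h, i, j) \text{ belongs to the } r\text{th poststratum} \\
0 & \text{otherwise}
\end{cases}
\]
Denote the total sum of original weights in the sample for each poststratum as
\[
\hat{\psi}_r = \sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_r(h, i, j)
\]
Assume that the observation (h, i, j) belongs to the rth poststratum. Then the poststratification weight for the observation (h, i, j) is
\[
\tilde{w}_{hij} = w_{hij}\, \frac{Z_r}{\hat{\psi}_r}
\]
Then the estimated cumulative distribution function of Y, $\hat{F}(t)$, and the estimated pth quantile, $\hat{Y}_p$, can be computed as in the section Estimate of Quantile by replacing the original weights, $w_{hij}$, with the poststratification weights, $\tilde{w}_{hij}$.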
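The weight adjustment described above rescales each original weight by the ratio of the known poststratum total to the sum of sample weights in that poststratum. It can be illustrated as follows. The function name `poststratification_weights` and the flat-list inputs are assumptions made for this sketch. Every poststratum that appears in the sample is assumed to have a known population total.

```python
from collections import defaultdict


def poststratification_weights(weights, poststrata, population_totals):
    """Rescale weights so that they sum to the known total in each poststratum.

    weights: original sampling weights, one per observation.
    poststrata: poststratum label for each observation.
    population_totals: dict mapping poststratum label -> known population total.
    """
    # Sum of original weights observed in each poststratum.
    weight_sums = defaultdict(float)
    for w, r in zip(weights, poststrata):
        weight_sums[r] += w
    return [w * population_totals[r] / weight_sums[r]
            for w, r in zip(weights, poststrata)]
```

After the adjustment, the weights in each poststratum sum to that poststratum's population total. The adjusted weights can then be used in place of the original weights when the distribution function and the quantile are estimated.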
When you specify VARMETHOD=TAYLOR (or by default), the variance of $\hat{Y}_p$ is estimated as in the section Standard Error, except that the variance of the estimated distribution function $\hat{F}(\hat{Y}_p)$ is computed as follows.
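For each poststratum $r = 1, 2, \ldots, R$, define
\[
\hat{F}_r(t) = \frac{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_r(h, i, j)\, I(y_{hij} \le t)}{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_r(h, i, j)}
\]
where $I(\cdot)$ is the indicator function.
Assume that the observation (h, i, j) belongs to the rth poststratum. Let
\[
e_{hij} = \tilde{w}_{hij} \left( I(y_{hij} \le \hat{Y}_p) - \hat{F}_r(\hat{Y}_p) \right)
\]
PROC SURVEYMEANS estimates the variance of the estimated distribution function with poststratification by
\[
\hat{V}\bigl(\hat{F}(\hat{Y}_p)\bigr) = \sum_{h=1}^{H} \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( d_{hi\cdot} - \bar{d}_{h\cdot\cdot} \right)^2
\]
where
\[
d_{hi\cdot} = \frac{1}{\tilde{w}_{\cdot\cdot\cdot}} \sum_{j=1}^{m_{hi}} e_{hij}, \qquad
\bar{d}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} d_{hi\cdot}, \qquad
\tilde{w}_{\cdot\cdot\cdot} = \sum_{h} \sum_{i} \sum_{j} \tilde{w}_{hij}
\]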
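Let Y be the variable of interest in a complex survey, and let a subpopulation of interest be domain D. Denote $F^D(t) = \Pr(Y \le t \mid \text{domain } D)$ as the cumulative distribution function of Y in domain D. For $0 < p < 1$, the pth quantile of the population cumulative distribution function is
\[
Y^D_p = \inf \{\, y : F^D(y) \ge p \,\}
\]
Let $I_D$ be the corresponding indicator variable:
\[
I_D(h, i, j) =
\begin{cases}
1 & \text{if observation } (h, i, j) \text{ belongs to domain } D \\
0 & \text{otherwise}
\end{cases}
\]
Assume that there are a total of d observations among the n observations in the entire sample that belong to domain D. Let $y^D_{(1)} < y^D_{(2)} < \cdots < y^D_{(d)}$ denote the order statistics of variable Y for these d observations that fall in domain D.
The cumulative distribution function of Y in domain D is estimated by
\[
\hat{F}^D(t) = \frac{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I(y_{hij} \le t)\, I_D(h, i, j)}{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_D(h, i, j)}
\]
and $I(\cdot)$ is the indicator function. Then the estimated quantile in domain D is
\[
\hat{Y}^D_p =
\begin{cases}
y^D_{(1)} & \text{if } p < \hat{F}^D(y^D_{(1)}) \\[4pt]
y^D_{(k)} + \dfrac{p - \hat{F}^D(y^D_{(k)})}{\hat{F}^D(y^D_{(k+1)}) - \hat{F}^D(y^D_{(k)})} \left( y^D_{(k+1)} - y^D_{(k)} \right) & \text{if } \hat{F}^D(y^D_{(k)}) \le p < \hat{F}^D(y^D_{(k+1)}) \\[4pt]
y^D_{(d)} & \text{if } p = 1
\end{cases}
\]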
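The domain version of the estimated distribution function restricts both the numerator and the denominator to observations in the domain. A small sketch, with the hypothetical name `domain_cdf` and a Boolean membership list standing in for the domain indicator variable:

```python
def domain_cdf(values, weights, in_domain, t):
    """Weighted share of domain members whose value is at most t.

    in_domain: one Boolean per observation, True when it belongs to the domain.
    """
    numerator = sum(w for y, w, member in zip(values, weights, in_domain)
                    if member and y <= t)
    denominator = sum(w for w, member in zip(weights, in_domain) if member)
    return numerator / denominator
```

The domain quantile then follows from the same interpolation between adjacent order statistics, applied to the domain members only.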
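In order to estimate the variance for $\hat{Y}^D_p$, PROC SURVEYMEANS first estimates the variance of the estimated distribution function $\hat{F}^D(\hat{Y}^D_p)$ in domain D. When you specify VARMETHOD=TAYLOR (or by default), the variance of $\hat{F}^D(\hat{Y}^D_p)$ is estimated by
\[
\hat{V}\bigl(\hat{F}^D(\hat{Y}^D_p)\bigr) = \sum_{h=1}^{H} \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( d^D_{hi\cdot} - \bar{d}^D_{h\cdot\cdot} \right)^2
\]
where
\[
d^D_{hi\cdot} = \frac{1}{v_{\cdot\cdot\cdot}} \sum_{j=1}^{m_{hi}} w_{hij} \left( I(y_{hij} \le \hat{Y}^D_p) - \hat{F}^D(\hat{Y}^D_p) \right) I_D(h, i, j), \qquad
\bar{d}^D_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} d^D_{hi\cdot}, \qquad
v_{\cdot\cdot\cdot} = \sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_D(h, i, j)
\]
Then $100(1-\alpha)\%$ confidence limits for $\hat{F}^D(\hat{Y}^D_p)$ can be constructed by $(\hat{p}^D_L, \hat{p}^D_U)$, where
\[
\hat{p}^D_L = \hat{F}^D(\hat{Y}^D_p) - t_{df,\,\alpha/2} \sqrt{\hat{V}\bigl(\hat{F}^D(\hat{Y}^D_p)\bigr)}, \qquad
\hat{p}^D_U = \hat{F}^D(\hat{Y}^D_p) + t_{df,\,\alpha/2} \sqrt{\hat{V}\bigl(\hat{F}^D(\hat{Y}^D_p)\bigr)}
\]
and $t_{df,\,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t distribution with df degrees of freedom, described in the section Degrees of Freedom. When $(\hat{p}^D_L, \hat{p}^D_U)$ is out of the range of [0,1], PROC SURVEYMEANS does not compute the standard error of $\hat{Y}^D_p$.
The $\hat{p}^D_L$th quantile is then estimated as
\[
\hat{Y}^D_{\hat{p}^D_L} =
\begin{cases}
y^D_{(1)} & \text{if } \hat{p}^D_L < \hat{F}^D(y^D_{(1)}) \\[4pt]
y^D_{(k_L)} + \dfrac{\hat{p}^D_L - \hat{F}^D(y^D_{(k_L)})}{\hat{F}^D(y^D_{(k_L+1)}) - \hat{F}^D(y^D_{(k_L)})} \left( y^D_{(k_L+1)} - y^D_{(k_L)} \right) & \text{if } \hat{F}^D(y^D_{(k_L)}) \le \hat{p}^D_L < \hat{F}^D(y^D_{(k_L+1)}) \\[4pt]
y^D_{(d)} & \text{if } \hat{p}^D_L = 1
\end{cases}
\]
The $\hat{p}^D_U$th quantile is then estimated as
\[
\hat{Y}^D_{\hat{p}^D_U} =
\begin{cases}
y^D_{(1)} & \text{if } \hat{p}^D_U < \hat{F}^D(y^D_{(1)}) \\[4pt]
y^D_{(k_U)} + \dfrac{\hat{p}^D_U - \hat{F}^D(y^D_{(k_U)})}{\hat{F}^D(y^D_{(k_U+1)}) - \hat{F}^D(y^D_{(k_U)})} \left( y^D_{(k_U+1)} - y^D_{(k_U)} \right) & \text{if } \hat{F}^D(y^D_{(k_U)}) \le \hat{p}^D_U < \hat{F}^D(y^D_{(k_U+1)}) \\[4pt]
y^D_{(d)} & \text{if } \hat{p}^D_U = 1
\end{cases}
\]
The standard error of $\hat{Y}^D_p$ is then estimated by
\[
\operatorname{se}(\hat{Y}^D_p) = \frac{\hat{Y}^D_{\hat{p}^D_U} - \hat{Y}^D_{\hat{p}^D_L}}{2\, t_{df,\,\alpha/2}}
\]
where $t_{df,\,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t distribution with df degrees of freedom.
Symmetric $100(1-\alpha)\%$ confidence limits for $\hat{Y}^D_p$ are computed as
\[
\left( \hat{Y}^D_p - \operatorname{se}(\hat{Y}^D_p)\, t_{df,\,\alpha/2},\ \ \hat{Y}^D_p + \operatorname{se}(\hat{Y}^D_p)\, t_{df,\,\alpha/2} \right)
\]
If you specify the NONSYMCL option in the PROC SURVEYMEANS statement, the procedure displays $100(1-\alpha)\%$ nonsymmetric confidence limits as
\[
\left( \hat{Y}^D_{\hat{p}^D_L},\ \hat{Y}^D_{\hat{p}^D_U} \right)
\]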
When you specify both a POSTSTRATA statement and a DOMAIN statement, the domain quantile estimation and its variance estimation incorporate poststratification. For more information about poststratification, see the section Poststratification.
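For a selected sample, let $r = 1, 2, \ldots, R$ be the poststratum index, let $Z_1, Z_2, \ldots, Z_R$ be the population totals for each corresponding poststratum, and let $I_r$ be the indicator variable for the poststratum r:
\[
I_r(h, i, j) =
\begin{cases}
1 & \text{if observation } (h, i, j) \text{ belongs to the } r\text{th poststratum} \\
0 & \text{otherwise}
\end{cases}
\]
The poststratification weights, $\tilde{w}_{hij}$, are defined as in the section Quantile Estimation with Poststratification.
For domain D, let $I_D$ be the corresponding indicator variable:
\[
I_D(h, i, j) =
\begin{cases}
1 & \text{if observation } (h, i, j) \text{ belongs to domain } D \\
0 & \text{otherwise}
\end{cases}
\]
With poststratification, for variable Y, the estimated cumulative distribution in domain D, $\hat{F}^D(t)$, and its pth quantile estimate, $\hat{Y}^D_p$, can be computed as in the section Domain Quantile by replacing the original weights, $w_{hij}$, with the poststratification weights, $\tilde{w}_{hij}$. However, the variance of $\hat{F}^D(\hat{Y}^D_p)$, which is described in the section Domain Quantile, is computed as follows when you specify the VARMETHOD=TAYLOR option (or by default).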
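Define
\[
\tilde{v}_{\cdot\cdot\cdot} = \sum_{h} \sum_{i} \sum_{j} \tilde{w}_{hij}\, I_D(h, i, j), \qquad
u_{hij} = \frac{1}{\tilde{v}_{\cdot\cdot\cdot}} \left( I(y_{hij} \le \hat{Y}^D_p) - \hat{F}^D(\hat{Y}^D_p) \right) I_D(h, i, j), \qquad
\bar{u}_r = \frac{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_r(h, i, j)\, u_{hij}}{\sum_{h} \sum_{i} \sum_{j} w_{hij}\, I_r(h, i, j)}
\]
Assume that the observation (h, i, j) belongs to the rth poststratum. Then the variance of $\hat{F}^D(\hat{Y}^D_p)$ is estimated by
\[
\hat{V}\bigl(\hat{F}^D(\hat{Y}^D_p)\bigr) = \sum_{h=1}^{H} \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( e_{hi\cdot} - \bar{e}_{h\cdot\cdot} \right)^2, \qquad
e_{hi\cdot} = \sum_{j=1}^{m_{hi}} \tilde{w}_{hij} \left( u_{hij} - \bar{u}_r \right), \qquad
\bar{e}_{h\cdot\cdot} = \frac{1}{n_h} \sum_{i=1}^{n_h} e_{hi\cdot}
\]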