Buja, Hastie, and Tibshirani (1989) showed that each smoothing function estimate from the backfitting algorithm is the result of a linear mapping applied to the working response, if the backfitting algorithm converges.
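The smoothing function estimate can be expressed as
$$\hat{s}_j(\mathbf{x}_j) = \mathbf{H}_j \mathbf{z}$$
where $\mathbf{x}_j$ is the jth covariate and $\mathbf{z}$ is the adjusted dependent variable that is formed in the local scoring algorithm. If the errors are independent and identically distributed, then
$$\mathrm{Var}\left(\hat{s}_j(\mathbf{x}_j)\right) = \sigma^2 \mathbf{H}_j \mathbf{H}_j^{T}$$
where $\sigma^2 = \mathrm{Var}(z_i)$.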
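The variance of a linear smoother can be illustrated numerically. The following is a minimal sketch, assuming the variance takes the form $\sigma^2 \mathbf{H}_j \mathbf{H}_j^{T}$; the three-point running-mean matrix, the function names, and the value of `sigma2` are hypothetical stand-ins and are not part of the GAM procedure.

```python
# Hypothetical illustration: a 3-point running-mean smoother stands in for H_j.

def running_mean_matrix(n):
    """Build an n x n matrix whose row i averages observation i and its neighbours."""
    rows = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        width = hi - lo + 1
        rows.append([1.0 / width if lo <= k <= hi else 0.0 for k in range(n)])
    return rows

def smoother_variances(H, sigma2):
    """Return the diagonal of sigma2 * H H^T, one variance per fitted value.

    The ith diagonal element of H H^T is the sum of squares of row i of H.
    """
    return [sigma2 * sum(h * h for h in row) for row in H]

H = running_mean_matrix(5)
variances = smoother_variances(H, sigma2=2.0)  # sigma2 is an assumed error variance
print(variances)
```

Only the diagonal is computed because the confidence limits need one variance per observation; the full $n \times n$ product is never formed.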
However, direct computation of $\mathbf{H}_j$ is formidable within the backfitting framework. Hastie and Tibshirani (1990) proposed using each individual smoothing matrix $\mathbf{S}_j$ as a substitute for the linear operator $\mathbf{H}_j$ when computing confidence intervals. In the GAM procedure, curvewise confidence intervals for smoothing splines and pointwise confidence intervals for loess are provided in the output data set.
Viewing the spline model as a Bayesian model, Wahba (1983) proposes Bayesian confidence intervals for smoothing spline estimates as
$$\hat{s}_j(x_{ij}) \pm z_{\alpha/2}\sqrt{\sigma^2_{jii}}$$
where $\sigma^2_{jii}$ is the ith diagonal element of the Bayesian posterior covariance matrix and $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution. The confidence intervals are interpreted as intervals “across the function” as opposed to pointwise intervals.
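The interval construction can be sketched in a few lines. This example assumes the limits have the form estimate plus or minus a standard normal quantile times the square root of the corresponding diagonal variance; the fitted values and variances are made-up numbers, and `confidence_limits` is a hypothetical helper, not a GAM procedure function.

```python
from math import sqrt
from statistics import NormalDist

def confidence_limits(fit, variances, alpha=0.05):
    """Return (lower, upper) limits: fit_i -/+ z * sqrt(variance_i).

    z is the (1 - alpha/2) quantile of the standard normal distribution.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = [f - z * sqrt(v) for f, v in zip(fit, variances)]
    upper = [f + z * sqrt(v) for f, v in zip(fit, variances)]
    return lower, upper

# Illustrative inputs only: smoother estimates and their diagonal variances.
fit = [1.0, 1.5, 2.0]
post_var = [0.04, 0.01, 0.09]
lower, upper = confidence_limits(fit, post_var)
for f, lo, hi in zip(fit, lower, upper):
    print(f"{f:.2f}  [{lo:.3f}, {hi:.3f}]")
```

The same helper applies whether the variances come from a Bayesian posterior covariance matrix or from the covariance matrix of a loess smoother; only the source of the diagonal elements changes.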
Suppose that you fit a spline estimate to experimental data that consist of a true function f and a random error term $\epsilon_i$. In repeated experiments, it is likely that about $100(1-\alpha)\%$ of the confidence intervals cover the corresponding true values, although some values are covered every time and other values are not covered by the confidence intervals most of the time. This effect is more pronounced when the true response curve or surface has small regions of particularly rapid change.
In the GAM procedure, let the smoothing matrix for the nonlinear part of the jth spline term be $\mathbf{A}_j$ after the linear part is separated out from $\mathbf{S}_j$. The Bayesian posterior variance for the nonlinear part is computed as
$$\hat{\phi}\, \mathbf{A}_j \mathbf{W}^{-1}$$
where $\hat{\phi}$ is the dispersion parameter estimate and $\mathbf{W}$ is the weight matrix from the final local scoring iteration. If you specify the UCLM, LCLM, ADIAG, and STD options in the OUTPUT statement, the statistics are derived based on $\mathbf{A}_j$.
When you request both the ADDITIVE and CLM suboptions in the PLOTS=COMPONENTS option, each of the smoothing component plots displays a confidence band for the total contribution of each smoothing spline smoother. The confidence band is derived from the total variance that is contributed by both the linear and nonlinear parts of the jth term:
|
As shown in Cleveland, Devlin, and Grosse (1988), the smoothing matrix for a loess smoother is asymmetric. The confidence intervals are computed as follows:
$$\hat{s}_j(x_{ij}) \pm z_{\alpha/2}\sqrt{\sigma^2_{jii}}$$
where $\sigma^2_{jii}$ is the ith diagonal element of the covariance matrix and $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.
In the GAM procedure, let the smoothing matrix for the nonlinear part of the jth loess term be $\mathbf{A}_j$ after the linear part is separated out from $\mathbf{S}_j$. The covariance matrix for the nonlinear part is then
$$\hat{\phi}\, \mathbf{A}_j \mathbf{W}^{-1} \mathbf{A}_j^{T}$$
where $\hat{\phi}$ is the dispersion parameter estimate and $\mathbf{W}$ is the weight matrix from the final local scoring iteration. If you specify the UCLM, LCLM, and STD options in the OUTPUT statement, the statistics are derived based on $\mathbf{A}_j$.
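Because a loess smoothing matrix is asymmetric, the covariance of the fitted values involves the matrix and its transpose. The sketch below assumes the covariance has the form $\hat{\phi}\, \mathbf{A}_j \mathbf{W}^{-1} \mathbf{A}_j^{T}$ with a diagonal weight matrix; the matrix `A`, the weights, and the dispersion value are invented for illustration, and `loess_variances` is a hypothetical helper rather than anything the GAM procedure exposes.

```python
def loess_variances(A, weights, phi):
    """Return the diagonal of phi * A W^{-1} A^T for a diagonal weight matrix W.

    Element i equals phi * sum_k A[i][k]**2 / weights[k].
    """
    return [phi * sum(a * a / w for a, w in zip(row, weights)) for row in A]

# Hypothetical 3 x 3 asymmetric smoothing matrix (A[0][1] differs from A[1][0]).
A = [
    [0.60, 0.40, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.40, 0.60],
]
weights = [1.0, 2.0, 1.0]  # assumed diagonal of W from the final iteration
variances = loess_variances(A, weights, phi=1.5)  # phi is an assumed dispersion
print(variances)
```

The square roots of these diagonal elements play the role of the standard errors in the interval formula; observations with larger weights contribute less to each variance.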
When you request both the ADDITIVE and CLM suboptions in the PLOTS=COMPONENTS option, each of the smoothing component plots displays confidence intervals for the total prediction of each loess smoother. The confidence intervals are derived from the total variance that is contributed by both the linear and nonlinear parts of the jth term:
|