The LOGISTIC Procedure

Confidence Intervals for Parameters

There are two methods of computing confidence intervals for the regression parameters. One is based on the profile-likelihood function, and the other is based on the asymptotic normality of the parameter estimators. The latter is less time-consuming because it does not involve an iterative scheme, but it is generally considered less accurate, especially for small sample sizes. You use the CLPARM= option in the MODEL statement to request confidence intervals for the parameters.

Likelihood Ratio-Based Confidence Intervals

The likelihood ratio-based confidence interval is also known as the profile-likelihood confidence interval. The construction of this interval is derived from the asymptotic $\chi ^2$ distribution of the generalized likelihood ratio test (Venzon and Moolgavkar, 1988). Suppose that the parameter vector is $\bbeta = (\beta _{0},\beta _{1},\ldots ,\beta _{s})’$ and you want to compute a confidence interval for $\beta _{j}$. The profile-likelihood function for $\beta _{j}=\gamma $ is defined as

\[  l_ j^*(\gamma ) = \max _{\bbeta \in {\mc{B}}_ j(\gamma )} l(\bbeta )  \]

where ${\mc{B}}_ j(\gamma )$ is the set of all $\bbeta $ with the jth element fixed at $\gamma $, and $l(\bbeta )$ is the log-likelihood function for $\bbeta $. If $l_{\max } = l({\widehat{\bbeta }})$ is the log likelihood evaluated at the maximum likelihood estimate ${\widehat{\bbeta }}$, then $ 2( l_{\max } - l_ j^{*}(\beta _{j} )) $ has a limiting chi-square distribution with one degree of freedom if $\beta _{j}$ is the true parameter value. Let $l_0=l_{\max } - 0.5\chi ^{2}_{1}(1-\alpha )$, where $\chi ^{2}_{1}(1-\alpha )$ is the $100(1-\alpha )$ percentile of the chi-square distribution with one degree of freedom. A $100(1-\alpha )$% confidence interval for $\beta _{j}$ is

\[  \{ \gamma : l_ j^*(\gamma ) \geq l_{0} \}   \]
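As a concrete illustration, the set $\{ \gamma : l_ j^*(\gamma ) \geq l_{0} \}$ can also be computed by profiling the log likelihood directly and locating the two crossing points with a root finder, rather than by the iterative scheme described next. The following Python sketch does this for the slope of a two-parameter logistic model; the toy data, the variable names, and the use of SciPy optimizers are illustrative assumptions, not part of PROC LOGISTIC.

```python
# Hypothetical sketch: profile-likelihood CI for the slope of a
# two-parameter logistic model, by direct profiling and root finding.
import numpy as np
from scipy.optimize import minimize, minimize_scalar, brentq
from scipy.stats import chi2

# Toy data (illustrative only)
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0])

def loglik(b0, b1):
    eta = b0 + b1 * x
    # log L = sum y*eta - log(1 + exp(eta)), computed stably
    return np.sum(y * eta - np.logaddexp(0.0, eta))

# l_j*(gamma): maximize over the intercept with the slope fixed at gamma
def profile(gamma):
    return -minimize_scalar(lambda b0: -loglik(b0, gamma)).fun

# Unrestricted maximum l_max = l(beta_hat)
fit = minimize(lambda b: -loglik(b[0], b[1]), x0=np.zeros(2))
l_max, b1_hat = -fit.fun, fit.x[1]

# l0 = l_max - 0.5 * chi2 quantile; the drop is about 1.92 at alpha = 0.05
alpha = 0.05
l0 = l_max - 0.5 * chi2.ppf(1.0 - alpha, df=1)

# The endpoints solve l_j*(gamma) = l0 on either side of the MLE
lower = brentq(lambda g: profile(g) - l0, b1_hat - 30.0, b1_hat)
upper = brentq(lambda g: profile(g) - l0, b1_hat, b1_hat + 30.0)
```

Because the profile log likelihood is typically asymmetric around the maximum likelihood estimate, the resulting interval need not be symmetric about $\widehat{\beta }_ j$, in contrast to the Wald interval.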

The endpoints of the confidence interval are found by solving numerically for values of $\beta _{j}$ that satisfy equality in the preceding relation. To obtain an iterative algorithm for computing the confidence limits, the log-likelihood function in a neighborhood of $\bbeta $ is approximated by the quadratic function

\[  \tilde{l}(\bbeta + \bdelta ) = l(\bbeta ) + \bdelta ’\mb{g} + \frac{1}{2}\bdelta ’ \bV \bdelta  \]

where $\mb{g}=\mb{g}(\bbeta )$ is the gradient vector and $\bV =\bV (\bbeta )$ is the Hessian matrix. The increment $\bdelta $ for the next iteration is obtained by solving the likelihood equations

\[  \frac{d}{d\bdelta }\{  \tilde{l}(\bbeta + \bdelta ) + \lambda ( \mb{e}_ j’\bdelta - \gamma )\}  = \bm {0}  \]

where $\lambda $ is the Lagrange multiplier, $\mb{e}_ j$ is the jth unit vector, and $\gamma $ is an unknown constant. The solution is

\[  \bdelta = -\bV ^{-1}(\mb{g} + \lambda \mb{e}_ j)  \]

By substituting this $\bdelta $ into the equation $\tilde{l}(\bbeta + \bdelta ) = l_0$, you can estimate $\lambda $ as

\[  \lambda = \pm \biggl (\frac{2(l_0 - l(\bbeta ) + \frac{1}{2}\mb{g}'\bV ^{-1}\mb{g})}{\mb{e}_ j'\bV ^{-1}\mb{e}_ j}\biggr )^{ \frac{1}{2}}  \]

The upper confidence limit for $\beta _ j$ is computed by starting at the maximum likelihood estimate of $\bbeta $ and iterating with positive values of $\lambda $ until convergence is attained. The process is repeated for the lower confidence limit by using negative values of $\lambda $.

Convergence is controlled by the value $\epsilon $ specified with the PLCONV= option in the MODEL statement (the default value of $\epsilon $ is 1E-4). Convergence is declared on the current iteration if the following two conditions are satisfied:

\[  |l(\bbeta )-l_{0}| \leq \epsilon  \]

and

\[  ({\mb{g}} + \lambda {\mb{e}_ j})’{\bV }^{-1}({\mb{g}} + \lambda {\mb{e}_ j}) \leq \epsilon  \]
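The iterative scheme above can be sketched in Python as follows. This is an illustrative reading of the algorithm, not SAS code: the toy data, the step damping, the guard on the square root, and the use of the magnitude of the quadratic form in the second stopping rule are assumptions added for numerical robustness.

```python
# Hypothetical sketch of the profile-likelihood endpoint iteration
# described above; data and safeguards are illustrative assumptions.
import numpy as np
from scipy.stats import chi2

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])    # design: intercept, slope

def loglik(beta):
    eta = X @ beta
    return np.sum(y * eta - np.logaddexp(0.0, eta))

def grad_hess(beta):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    g = X.T @ (y - p)                        # gradient g of log L
    V = -(X.T * (p * (1.0 - p))) @ X         # Hessian V of log L
    return g, V

# Maximum likelihood estimate by Newton-Raphson
beta_hat = np.zeros(2)
for _ in range(25):
    g, V = grad_hess(beta_hat)
    beta_hat -= np.linalg.solve(V, g)
l0 = loglik(beta_hat) - 0.5 * chi2.ppf(0.95, df=1)

def pl_limit(sign, j=1, eps=1e-4, max_iter=100):
    """One endpoint of the interval for beta[j]: sign=+1 iterates with
    positive lambda (upper limit), sign=-1 with negative (lower limit)."""
    b = beta_hat.copy()
    e = np.zeros_like(b); e[j] = 1.0
    for _ in range(max_iter):
        g, V = grad_hess(b)
        Vinv = np.linalg.inv(V)
        # lambda from the closed form; near the MLE both the numerator
        # and e'V^{-1}e are negative, so the ratio is positive.  The
        # max(..., 0) guard against transient overshoot is an assumption.
        num = 2.0 * (l0 - loglik(b) + 0.5 * g @ Vinv @ g)
        lam = sign * np.sqrt(max(num / (e @ Vinv @ e), 0.0))
        # stopping rules: |l(beta) - l0| and the quadratic form
        # (taken here in magnitude) both below eps
        if (abs(loglik(b) - l0) <= eps and
                abs((g + lam * e) @ Vinv @ (g + lam * e)) <= eps):
            break
        delta = -Vinv @ (g + lam * e)
        step = np.linalg.norm(delta)
        if step > 1.0:                       # damping: assumption, not in the text
            delta /= step
        b = b + delta
    return b[j]

lower, upper = pl_limit(-1.0), pl_limit(+1.0)
```

Starting from the maximum likelihood estimate, the gradient is essentially zero, so the first step reduces to $-\lambda \bV ^{-1}\mb{e}_ j$, which moves $\beta _ j$ upward for positive $\lambda $ and downward for negative $\lambda $, matching the description above.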

Wald Confidence Intervals

Wald confidence intervals are sometimes called normal confidence intervals because they are based on the asymptotic normality of the parameter estimators. The $100(1-\alpha )$% Wald confidence interval for $\beta _ j$ is given by

\[  {\widehat{\beta }}_ j \pm z_{1-\alpha /2}\widehat{\sigma }_ j  \]

where $z_{p}$ is the 100p percentile of the standard normal distribution, ${\widehat{\beta }}_ j$ is the maximum likelihood estimate of $\beta _ j$, and $\widehat{\sigma }_ j$ is the standard error estimate of ${\widehat{\beta }}_ j$.
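The Wald interval requires only the maximum likelihood estimate and a standard error, here taken from the inverse of the negative Hessian (the observed information). A minimal Python sketch, with illustrative toy data and variable names that are assumptions rather than PROC LOGISTIC internals:

```python
# Hypothetical sketch: Wald confidence interval for the slope of a
# two-parameter logistic model; data and names are illustrative.
import numpy as np
from scipy.stats import norm

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])    # design: intercept, slope

# Newton-Raphson for the maximum likelihood estimate
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    g = X.T @ (y - p)                        # gradient of log L
    V = -(X.T * (p * (1.0 - p))) @ X         # Hessian of log L
    beta -= np.linalg.solve(V, g)

# Recompute the Hessian at the converged estimate, then take
# sigma_hat_j from the inverse of the observed information -V
p = 1.0 / (1.0 + np.exp(-(X @ beta)))
V = -(X.T * (p * (1.0 - p))) @ X
se = np.sqrt(np.diag(np.linalg.inv(-V)))

alpha = 0.05
z = norm.ppf(1.0 - alpha / 2.0)              # z_{1-alpha/2}, about 1.96
lower, upper = beta[1] - z * se[1], beta[1] + z * se[1]
```

Unlike the profile-likelihood interval, this interval is symmetric about $\widehat{\beta }_ j$ by construction, which is one reason it can be less accurate for small samples.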