PROC SEVERITY computes and reports various statistics of fit to indicate how well the estimated model fits the data. The statistics belong to two categories: likelihood-based statistics and EDF-based statistics. Statistics Neg2LogLike, AIC, AICC, and BIC are likelihood-based statistics, and statistics KS, AD, and CvM are EDF-based statistics. The following subsections provide definitions of each.
Let $y_i$, $i = 1, \dots, N$, denote the response variable values. Let $L$ be the likelihood as defined in the section Likelihood Function. Let $p$ denote the number of model parameters estimated. Note that $p = p_d + (k - k_r)$, where $p_d$ is the number of distribution parameters, $k$ is the number of regressors, if any, specified in the SCALEMODEL statement, and $k_r$ is the number of regressors found to be linearly dependent (redundant) on other regressors. Given this notation, the likelihood-based statistics are defined as follows:
The log likelihood is reported as
$$\text{Neg2LogLike} = -2 \log(L)$$
The multiplying factor $-2$ makes it easy to compare it to the other likelihood-based statistics. A model with a smaller value of Neg2LogLike is deemed better.
Akaike’s information criterion (AIC) is defined as
$$\text{AIC} = -2 \log(L) + 2p$$
A model with a smaller value of AIC is deemed better.
The corrected Akaike’s information criterion (AICC) is defined as
$$\text{AICC} = -2 \log(L) + \frac{2Np}{N - p - 1}$$
A model with a smaller value of AICC is deemed better. It corrects the finite-sample bias that AIC has when $N$ is small compared to $p$. AICC is related to AIC as
$$\text{AICC} = \text{AIC} + \frac{2p(p + 1)}{N - p - 1}$$
As $N$ becomes large compared to $p$, AICC converges to AIC. AICC is usually recommended over AIC as a model selection criterion.
The Schwarz Bayesian information criterion (BIC) is defined as
$$\text{BIC} = -2 \log(L) + p \log(N)$$
A model with a smaller value of BIC is deemed better.
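The four likelihood-based statistics can be sketched in a few lines of Python. This is only an illustration, not PROC SEVERITY code: it assumes the usual definitions (Neg2LogLike $= -2\log L$, AIC adds $2p$, AICC adds $2Np/(N-p-1)$, BIC adds $p \log N$), and the function name `likelihood_fit_stats` and its arguments are hypothetical.

```python
import math

def likelihood_fit_stats(log_likelihood, n_obs, n_params):
    """Return Neg2LogLike, AIC, AICC, and BIC for one fitted model.

    log_likelihood : log(L) at the parameter estimates
    n_obs          : N, the number of observations
    n_params       : p, the number of estimated parameters
    """
    if n_obs - n_params - 1 <= 0:
        # The AICC penalty divides by N - p - 1, so it needs N > p + 1.
        raise ValueError("AICC requires n_obs > n_params + 1")
    neg2ll = -2.0 * log_likelihood
    aic = neg2ll + 2.0 * n_params
    aicc = neg2ll + 2.0 * n_obs * n_params / (n_obs - n_params - 1)
    bic = neg2ll + n_params * math.log(n_obs)
    return {"Neg2LogLike": neg2ll, "AIC": aic, "AICC": aicc, "BIC": bic}

# Illustrative values only: log(L) = -120.5, N = 50, p = 2.
stats = likelihood_fit_stats(log_likelihood=-120.5, n_obs=50, n_params=2)
```

For every statistic in the returned dictionary, the model with the smaller value is preferred.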
This class of statistics is based on the difference between the estimate of the cumulative distribution function (CDF) and the estimate of the empirical distribution function (EDF). Let $y_i$, $i = 1, \dots, N$, denote the sample of $N$ values of the response variable. Let $r_i = \sum_{j=1}^{N} I[y_j \le y_i]$ denote the number of observations with a value less than or equal to $y_i$, where $I$ is an indicator function. Let $F_n(y_i)$ denote the EDF estimate that is computed by using the method specified in the EMPIRICALCDF= option. Let $Z_i = \hat{F}(y_i)$ denote the estimate of the CDF. Let $F_n(Z_i)$ denote the EDF estimate of the $Z_i$ values, computed using the same method that is used to compute the EDF of the $y_i$ values. By the probability integral transformation, if $F(y)$ is the true distribution of the random variable $Y$, then the random variable $Z = F(Y)$ is uniformly distributed between 0 and 1 (D’Agostino and Stephens 1986, Ch. 4). Thus, comparing $F_n(y_i)$ with $\hat{F}(y_i)$ is equivalent to comparing $F_n(Z_i)$ with $\hat{F}(Z_i) = Z_i$ (uniform distribution).
Note the following two points regarding which CDF estimates are used for computing the test statistics:
If regressor variables are specified, then the CDF estimates used for computing the EDF test statistics are from a mixture distribution. See the section CDF and PDF Estimates with Regression Effects for more information.
If the EDF estimates are conditional because of the truncation information, then each unconditional estimate is converted to a conditional estimate using the method described in the section Truncation and Conditional CDF Estimates.
In the following, it is assumed that $Z_i$ denotes an appropriate estimate of the CDF if truncation or regression effects are specified. Given this, the EDF-based statistics of fit are defined as follows:
The Kolmogorov-Smirnov (KS) statistic computes the largest vertical distance between the CDF and the EDF. It is formally defined as follows:
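$$\text{KS} = \sup_y \left| F_n(y) - F(y) \right|$$
If the STANDARD method is used to compute the EDF, then the following formula is used:
$$
\begin{aligned}
D^{+} &= \max_i \left( \frac{r_i}{N} - Z_i \right) \\
D^{-} &= \max_i \left( Z_i - \frac{r_{i-1}}{N} \right) \\
\text{KS} &= \sqrt{N} \, \max(D^{+}, D^{-}) + \frac{0.19}{\sqrt{N}}
\end{aligned}
$$
Note that $r_0$ is assumed to be 0.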
If the method used to compute the EDF is any method other than the STANDARD method, then the following formula is used:
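$$
\begin{aligned}
D^{+} &= \max_i \left( F_n(Z_i) - Z_i \right), \quad \text{if } F_n(Z_i) \ge Z_i \\
D^{-} &= \max_i \left( Z_i - F_n(Z_i) \right), \quad \text{if } F_n(Z_i) < Z_i \\
\text{KS} &= \sqrt{N} \, \max(D^{+}, D^{-}) + \frac{0.19}{\sqrt{N}}
\end{aligned}
$$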
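As a rough illustration of the STANDARD-method computation, the following Python sketch evaluates the two one-sided distances from the fitted CDF values $Z_i$. It makes several assumptions that are not taken from PROC SEVERITY itself: there are no tied observations (so the rank of the $i$th sorted value is $i$), the reported statistic is scaled as $\sqrt{N}\max(D^+, D^-) + 0.19/\sqrt{N}$, and the function name `ks_standard` is hypothetical.

```python
import math

def ks_standard(z):
    """Return (unscaled distance, scaled KS) for fitted CDF values z.

    z holds Z_i = F_hat(y_i), one value per observation. Ties are not
    handled: the rank r_i of the i-th sorted value is taken to be i.
    """
    z = sorted(z)
    n = len(z)
    # EDF just after each point minus the CDF: r_i / N - Z_i
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    # CDF minus the EDF just before each point: Z_i - r_{i-1} / N, r_0 = 0
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    d = max(d_plus, d_minus)
    # Scaling assumed for this sketch: sqrt(N) * D + 0.19 / sqrt(N)
    return d, math.sqrt(n) * d + 0.19 / math.sqrt(n)

d, ks = ks_standard([0.9, 0.1, 0.7, 0.4])
```

For these four values the largest gap occurs just before the third sorted point, where the CDF estimate is 0.7 and the EDF is still 0.5.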
The Anderson-Darling (AD) statistic is a quadratic EDF statistic that is proportional to the expected value of the weighted squared difference between the EDF and CDF. It is formally defined as follows:
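$$\text{AD} = N \int_{-\infty}^{\infty} \frac{\left( F_n(y) - F(y) \right)^2}{F(y) \left( 1 - F(y) \right)} \, dF(y)$$
If the STANDARD method is used to compute the EDF, then the following formula is used:
$$\text{AD} = -N - \frac{1}{N} \sum_{i=1}^{N} \left[ (2 r_i - 1) \log(Z_i) + (2N + 1 - 2 r_i) \log(1 - Z_i) \right]$$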
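The STANDARD-method AD sum can be sketched as follows, assuming the formula $\text{AD} = -N - \frac{1}{N}\sum_i [(2r_i - 1)\log Z_i + (2N + 1 - 2r_i)\log(1 - Z_i)]$ and no tied observations, so that $r_i$ is simply the position of $Z_i$ in sorted order. The function name `ad_standard` is hypothetical.

```python
import math

def ad_standard(z):
    """Anderson-Darling statistic from fitted CDF values z (STANDARD EDF).

    Every value must lie strictly between 0 and 1, otherwise the
    logarithms are undefined. Ties are not handled.
    """
    z = sorted(z)
    n = len(z)
    total = 0.0
    for r, zi in enumerate(z, start=1):  # r plays the role of the rank r_i
        total += (2 * r - 1) * math.log(zi)
        total += (2 * n + 1 - 2 * r) * math.log(1.0 - zi)
    return -n - total / n

ad = ad_standard([0.1, 0.4, 0.7, 0.9])
```

The weights $(2r_i - 1)$ and $(2N + 1 - 2r_i)$ put the most emphasis on the logarithms nearest the two tails, which is why AD is more sensitive to tail misfit than CvM.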
If the method used to compute the EDF is any method other than the STANDARD method, then the statistic can be computed by using the following two pieces of information:
If the EDF estimates are computed using the KAPLANMEIER or MODIFIEDKM methods, then the EDF is a step function such that the estimate $F_n(z)$ is a constant equal to $F_n(Z_{i-1})$ in the interval $[Z_{i-1}, Z_i]$. If the EDF estimates are computed using the TURNBULL method, then there are two types of intervals: one in which the EDF curve is constant and the other in which the EDF curve is theoretically undefined. For computational purposes, it is assumed that the EDF curve is linear for the latter type of interval. For each method, the EDF estimate $F_n(z)$ at $z$ can be written as
$$F_n(z) = F_n(Z_{i-1}) + S_i (z - Z_{i-1}), \quad \text{for } z \in [Z_{i-1}, Z_i]$$
where $S_i$ is the slope of the line defined as
$$S_i = \frac{F_n(Z_i) - F_n(Z_{i-1})}{Z_i - Z_{i-1}}$$
For the KAPLANMEIER or MODIFIEDKM method, $S_i = 0$ in each interval.
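Using the probability integral transform $z = F(y)$, the formula simplifies to
$$\text{AD} = N \int_{0}^{1} \frac{\left( F_n(z) - z \right)^2}{z (1 - z)} \, dz$$
The computation formula can then be derived from the following approximation:
$$
\begin{aligned}
\text{AD} &= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \frac{\left( F_n(z) - z \right)^2}{z (1 - z)} \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \frac{\left( F_n(Z_{i-1}) + S_i (z - Z_{i-1}) - z \right)^2}{z (1 - z)} \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \frac{\left( P_i - Q_i z \right)^2}{z (1 - z)} \, dz
\end{aligned}
$$
where $P_i = F_n(Z_{i-1}) - S_i Z_{i-1}$, $Q_i = 1 - S_i$, and $K$ is the number of points at which the EDF estimates are computed. For the TURNBULL method, $K = 2k$ for some $k$.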
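Assuming $Z_0 = 0$, $Z_{K+1} = 1$, $F_n(0) = 0$, and $F_n(Z_K) = 1$ yields the following computation formula:
$$\text{AD} = -N \left( Z_1 + \log(1 - Z_1) + \log(Z_K) + (1 - Z_K) \right) + N \sum_{i=2}^{K} \left[ P_i^2 A_i - (Q_i - P_i)^2 B_i - Q_i^2 C_i \right]$$
where $A_i = \log(Z_i) - \log(Z_{i-1})$, $B_i = \log(1 - Z_i) - \log(1 - Z_{i-1})$, and $C_i = Z_i - Z_{i-1}$.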
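If EDF estimates are computed using the KAPLANMEIER or MODIFIEDKM method, then $P_i = F_n(Z_{i-1})$ and $Q_i = 1$, which simplifies the formula as
$$\text{AD} = -N \left( 1 + \log(1 - Z_1) + \log(Z_K) \right) + N \sum_{i=2}^{K} \left[ F_n(Z_{i-1})^2 A_i - \left( 1 - F_n(Z_{i-1}) \right)^2 B_i \right]$$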
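The step-function case can be illustrated with a short Python sketch. It assumes the simplified form $\text{AD} = -N(1 + \log(1 - Z_1) + \log Z_K) + N \sum_{i=2}^{K} [F_n(Z_{i-1})^2 A_i - (1 - F_n(Z_{i-1}))^2 B_i]$ with $A_i = \log Z_i - \log Z_{i-1}$ and $B_i = \log(1 - Z_i) - \log(1 - Z_{i-1})$. The function name `ad_step_edf` and its argument layout are hypothetical, and the EDF values are supplied by the caller rather than estimated here.

```python
import math

def ad_step_edf(z, fn, n_obs):
    """AD statistic for a step-function EDF (KAPLANMEIER-style sketch).

    z     : increasing CDF estimates Z_1 < ... < Z_K, each in (0, 1)
    fn    : EDF estimates F_n(Z_1), ..., F_n(Z_K), with fn[-1] == 1
    n_obs : N, the number of observations
    """
    k = len(z)
    # Contribution of the two boundary intervals [0, Z_1] and [Z_K, 1],
    # merged with the -Q_i^2 * C_i terms, which telescope when Q_i = 1.
    total = -(1.0 + math.log(1.0 - z[0]) + math.log(z[-1]))
    for i in range(1, k):  # i = 2, ..., K in the 1-based notation
        a = math.log(z[i]) - math.log(z[i - 1])
        b = math.log(1.0 - z[i]) - math.log(1.0 - z[i - 1])
        f_prev = fn[i - 1]  # EDF is constant at F_n(Z_{i-1}) on the interval
        total += f_prev ** 2 * a - (1.0 - f_prev) ** 2 * b
    return n_obs * total

ad = ad_step_edf([0.1, 0.4, 0.7, 0.9], [0.25, 0.5, 0.75, 1.0], n_obs=4)
```

When the supplied EDF values are the untied empirical steps $i/N$, as in this call, the result agrees with the STANDARD-method sum, which is a convenient sanity check on the interval bookkeeping.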
The Cramér-von Mises (CvM) statistic is a quadratic EDF statistic that is proportional to the expected value of the squared difference between the EDF and CDF. It is formally defined as follows:
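$$\text{CvM} = N \int_{-\infty}^{\infty} \left( F_n(y) - F(y) \right)^2 \, dF(y)$$
If the STANDARD method is used to compute the EDF, then the following formula is used:
$$\text{CvM} = \frac{1}{12N} + \sum_{i=1}^{N} \left( Z_i - \frac{2 r_i - 1}{2N} \right)^2$$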
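A minimal Python sketch of the STANDARD-method CvM sum follows, assuming the formula $\text{CvM} = \frac{1}{12N} + \sum_i \left(Z_i - \frac{2r_i - 1}{2N}\right)^2$ and no tied observations. The function name `cvm_standard` is hypothetical.

```python
def cvm_standard(z):
    """Cramer-von Mises statistic from fitted CDF values z (STANDARD EDF).

    Ties are not handled: the rank r_i of the i-th sorted value is i.
    """
    z = sorted(z)
    n = len(z)
    # Each Z_i is compared with the midpoint (2 r_i - 1) / (2N) of its EDF step.
    squared = sum((zi - (2 * r - 1) / (2 * n)) ** 2
                  for r, zi in enumerate(z, start=1))
    return 1.0 / (12 * n) + squared

cvm = cvm_standard([0.1, 0.4, 0.7, 0.9])
```

The statistic reaches its minimum of $1/(12N)$ when every $Z_i$ falls exactly on the midpoint of its EDF step.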
If the method used to compute the EDF is any method other than the STANDARD method, then the statistic can be computed by using the following two pieces of information:
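As described previously for the AD statistic, the EDF estimates are assumed to be piecewise linear such that the estimate $F_n(z)$ at $z$ is
$$F_n(z) = F_n(Z_{i-1}) + S_i (z - Z_{i-1}), \quad \text{for } z \in [Z_{i-1}, Z_i]$$
where $S_i$ is the slope of the line defined as
$$S_i = \frac{F_n(Z_i) - F_n(Z_{i-1})}{Z_i - Z_{i-1}}$$
For the KAPLANMEIER or MODIFIEDKM method, $S_i = 0$ in each interval.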
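Using the probability integral transform $z = F(y)$, the formula simplifies to
$$\text{CvM} = N \int_{0}^{1} \left( F_n(z) - z \right)^2 \, dz$$
The computation formula can then be derived from the following approximation:
$$
\begin{aligned}
\text{CvM} &= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \left( F_n(z) - z \right)^2 \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \left( F_n(Z_{i-1}) + S_i (z - Z_{i-1}) - z \right)^2 \, dz \\
&= N \sum_{i=1}^{K+1} \int_{Z_{i-1}}^{Z_i} \left( P_i - Q_i z \right)^2 \, dz
\end{aligned}
$$
where $P_i = F_n(Z_{i-1}) - S_i Z_{i-1}$, $Q_i = 1 - S_i$, and $K$ is the number of points at which the EDF estimates are computed. For the TURNBULL method, $K = 2k$ for some $k$.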
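Assuming $Z_0 = 0$, $Z_{K+1} = 1$, and $F_n(0) = 0$ yields the following computation formula:
$$\text{CvM} = N \frac{Z_1^3}{3} + N \sum_{i=2}^{K+1} \left[ P_i^2 A_i - P_i Q_i B_i + \frac{Q_i^2}{3} C_i \right]$$
where $A_i = Z_i - Z_{i-1}$, $B_i = Z_i^2 - Z_{i-1}^2$, and $C_i = Z_i^3 - Z_{i-1}^3$.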
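If EDF estimates are computed using the KAPLANMEIER or MODIFIEDKM method, then $P_i = F_n(Z_{i-1})$ and $Q_i = 1$, which simplifies the formula as
$$\text{CvM} = \frac{N}{3} + N \sum_{i=2}^{K+1} \left[ F_n(Z_{i-1})^2 (Z_i - Z_{i-1}) - F_n(Z_{i-1}) (Z_i^2 - Z_{i-1}^2) \right]$$
which is similar to the formula proposed by Koziol and Green (1976).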
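The step-function CvM computation can be sketched in Python as follows. The sketch assumes the simplified form $\text{CvM} = \frac{N}{3} + N \sum_{i=2}^{K+1} [F_n(Z_{i-1})^2 (Z_i - Z_{i-1}) - F_n(Z_{i-1})(Z_i^2 - Z_{i-1}^2)]$ with $Z_{K+1} = 1$. The function name `cvm_step_edf` and its argument layout are hypothetical, and the EDF values are supplied by the caller rather than estimated here.

```python
def cvm_step_edf(z, fn, n_obs):
    """CvM statistic for a step-function EDF (KAPLANMEIER-style sketch).

    z     : increasing CDF estimates Z_1 < ... < Z_K, each in (0, 1)
    fn    : EDF estimates F_n(Z_1), ..., F_n(Z_K)
    n_obs : N, the number of observations
    """
    z_ext = list(z) + [1.0]  # append Z_{K+1} = 1
    # The z^2 part of every interval integral sums to 1/3 over [0, 1].
    total = 1.0 / 3.0
    for i in range(1, len(z_ext)):  # i = 2, ..., K+1 in the 1-based notation
        f_prev = fn[i - 1]  # EDF is constant at F_n(Z_{i-1}) on the interval
        total += f_prev ** 2 * (z_ext[i] - z_ext[i - 1])
        total -= f_prev * (z_ext[i] ** 2 - z_ext[i - 1] ** 2)
    return n_obs * total

cvm = cvm_step_edf([0.1, 0.4, 0.7, 0.9], [0.25, 0.5, 0.75, 1.0], n_obs=4)
```

With the untied empirical steps $i/N$ supplied as EDF values, as in this call, the result matches the STANDARD-method formula $\frac{1}{12N} + \sum_i (Z_i - \frac{2i - 1}{2N})^2$.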