The SURVEYLOGISTIC Procedure

Rank Correlation of Observed Responses and Predicted Probabilities

The predicted mean score of an observation is the sum of the Ordered Values (shown in the Response Profile table) minus one, weighted by the corresponding predicted probabilities for that observation; that is, the predicted mean score is $\sum _{d=1}^{D+1}(d-1)\hat{\pi }_ d$, where $D + 1$ is the number of response levels and $\hat{\pi }_ d$ is the predicted probability of the $d$th (ordered) response.
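The formula can be illustrated with a minimal Python sketch. The function name `predicted_mean_score` and the example probabilities are hypothetical and are not part of PROC SURVEYLOGISTIC; the sketch assumes the probabilities are listed in Ordered Value order.

```python
def predicted_mean_score(probs):
    """Return sum over d = 1..D+1 of (d - 1) * pi_d for one observation.

    probs: predicted probabilities listed by Ordered Value 1, 2, ..., D+1
    (hypothetical input layout, assumed for this illustration).
    """
    # enumerate() starts at 0, which matches the (d - 1) weight for d = 1.
    return sum(weight * p for weight, p in enumerate(probs))


# Three response levels (D = 2): weights are 0, 1, 2.
three_level = predicted_mean_score([0.2, 0.5, 0.3])

# Two response levels (D = 1): the score reduces to the probability
# of Ordered Value 2.
binary = predicted_mean_score([0.3, 0.7])
```

With two response levels the only nonzero weight is the one for Ordered Value 2, so the score is that predicted probability.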

A pair of observations with different observed responses is said to be concordant if the observation with the lower ordered response value has a lower predicted mean score than the observation with the higher ordered response value. If the observation with the lower ordered response value has a higher predicted mean score than the observation with the higher ordered response value, then the pair is discordant. If the pair is neither concordant nor discordant, it is a tie. Enumeration of the total numbers of concordant and discordant pairs is carried out by categorizing the predicted mean score into intervals of length $D / 500$ and accumulating the corresponding frequencies of observations.
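The classification rule can be sketched in Python as follows. This is an illustration only: the function name `count_pairs` and its arguments are hypothetical, and the sketch compares raw predicted mean scores pair by pair instead of first grouping them into intervals of length $D / 500$, so its tie counts can differ from those of the procedure.

```python
from itertools import combinations


def count_pairs(responses, scores, freqs=None):
    """Count pairs with different responses and classify them.

    responses: ordered response values (for example, Ordered Values)
    scores:    predicted mean scores, one per observation
    freqs:     optional observation frequencies (assumed 1 if omitted)
    Returns (t, n_c, n_d); tied pairs number t - n_c - n_d.
    """
    if freqs is None:
        freqs = [1] * len(responses)
    t = n_c = n_d = 0
    for i, j in combinations(range(len(responses)), 2):
        if responses[i] == responses[j]:
            continue  # only pairs with different responses are counted
        weight = freqs[i] * freqs[j]
        t += weight
        # lo is the observation with the lower ordered response value.
        lo, hi = (i, j) if responses[i] < responses[j] else (j, i)
        if scores[lo] < scores[hi]:
            n_c += weight  # concordant
        elif scores[lo] > scores[hi]:
            n_d += weight  # discordant
        # equal scores: the pair is a tie and adds only to t
    return t, n_c, n_d


t, n_c, n_d = count_pairs([1, 1, 2, 2], [0.2, 0.7, 0.6, 0.7])
```

In this small example there are four pairs with different responses: two are concordant, one is discordant, and one is tied.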

Let $N$ be the sum of observation frequencies in the data. Suppose there are a total of $t$ pairs with different responses, of which $n_c$ are concordant, $n_d$ are discordant, and $t-n_c-n_d$ are tied. PROC SURVEYLOGISTIC computes the following four indices of rank correlation for assessing the predictive ability of a model:

$\displaystyle c = (n_c+0.5(t-n_c-n_d))/t$

$\displaystyle \mbox{Somers' } D = (n_c-n_d)/t$

$\displaystyle \mbox{Goodman-Kruskal Gamma} = (n_c-n_d)/(n_c+n_d)$

$\displaystyle \mbox{Kendall's Tau-}a = (n_c-n_d)/(0.5N(N-1))$
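As a worked illustration, the four indices can be computed from the pair counts with a short Python sketch. The function name `rank_correlation_indices` and the example counts are hypothetical; the sketch assumes that $t > 0$, $n_c + n_d > 0$, and $N > 1$, so that no denominator is zero.

```python
def rank_correlation_indices(t, n_c, n_d, N):
    """Compute c, Somers' D, Goodman-Kruskal Gamma, and Kendall's Tau-a.

    t:   number of pairs with different responses
    n_c: number of concordant pairs
    n_d: number of discordant pairs
    N:   sum of observation frequencies
    Assumes t > 0, n_c + n_d > 0, and N > 1 (no zero-division guard).
    """
    ties = t - n_c - n_d
    return {
        "c": (n_c + 0.5 * ties) / t,
        "somers_d": (n_c - n_d) / t,
        "gamma": (n_c - n_d) / (n_c + n_d),
        "tau_a": (n_c - n_d) / (0.5 * N * (N - 1)),
    }


# Example counts: 4 pairs with different responses among N = 4 observations,
# of which 2 are concordant, 1 is discordant, and 1 is tied.
indices = rank_correlation_indices(t=4, n_c=2, n_d=1, N=4)
```

For these counts, c is 0.625 and Somers' D is 0.25. It follows from the definitions that Somers' D equals $2c - 1$, because the tied pairs contribute half a count each to c and nothing to D.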

Note that $c$ also gives an estimate of the area under the receiver operating characteristic (ROC) curve when the response is binary (Hanley and McNeil, 1982).

For binary responses, the predicted mean score is equal to the predicted probability for Ordered Value 2. As such, the preceding definition of concordance is consistent with the definition used in previous releases for the binary response model.