A more general model is the probit model, which is derived under the assumption of jointly normal random utility components, where the error vector has a multivariate normal distribution with a mean vector of $\mathbf{0}$ and a covariance matrix $\boldsymbol{\Sigma}$. For a full covariance matrix, PROC BCHOICE can accommodate any pattern of correlation and heteroscedasticity. Thus, this model fits a very general error structure. For more information about probit models, see the section Probit.
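For orientation, the random utility formulation behind this description can be written as follows. The notation is an assumption made for illustration ($U_{ij}$ for the utility of alternative $j$ to commuter $i$, $\mathbf{x}_{ij}$ for the attribute vector, $J$ for the number of alternatives); the section Probit remains the authoritative statement of the model.

```latex
U_{ij} = \mathbf{x}_{ij}'\boldsymbol{\beta} + \epsilon_{ij}, \quad j = 1, \dots, J,
\qquad
\boldsymbol{\epsilon}_i = (\epsilon_{i1}, \dots, \epsilon_{iJ})' \sim N(\mathbf{0}, \boldsymbol{\Sigma}),
\qquad
y_i = j \iff U_{ij} \ge U_{ik} \ \text{for all } k
```

Here $y_i$ denotes the alternative that commuter $i$ chooses, which is the alternative with the largest utility.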
In Train (2009), a project team collected the following data from commuters about four available travel modes for their trips to work: car alone, carpool, bus, and railway. The time and cost of travel for each mode were determined for each commuter, based on the location of the commuter’s home and workplace.
data Commuter;
   input Subject Mode Choice Cost Time @@;
   datalines;
1 1 1 1.51 18.5 1 2 0 2.34 26.34 1 3 0 1.8 20.87 1 4 0 2.36 30.03
2 1 0 6.06 31.31 2 2 0 2.9 34.26 2 3 0 2.24 67.18 2 4 1 1.86 60.29
3 1 1 5.79 22.55 3 2 0 2.14 23.26 3 3 0 2.58 63.31 3 4 0 2.75 49.17
4 1 1 1.87 26.09 4 2 0 2.57 29.9 4 3 0 1.9 19.75 4 4 0 2.27 13.47
5 1 1 2.5 4.7 5 2 0 1.72 12.41 5 3 0 2.69 43.09 5 4 0 2.97 39.74
6 1 1 4.73 3.07 6 2 0 0.62 9.22 6 3 0 1.85 12.83 6 4 0 2.31 43.54
7 1 1 4.73 13.14 7 2 0 0.6 17.77 7 3 0 2.43 54.09 7 4 0 2 42.22
8 1 1 5.35 52.9 8 2 0 2.91 48.78 8 3 0 2.61 69.16 8 4 0 2.78 53.25
9 1 0 4.41 61.06 9 2 0 1.59 62.13 9 3 1

   ... more lines ...

450 1 1 4.59 29.44 450 2 0 2.89 33.73 450 3 0 1.9 66.12 450 4 0 1.79 39.84
451 1 1 3.24 16.35 451 2 0 1.21 18.98 451 3 0 1.75 23.39 451 4 0 2.02 43.3
452 1 0 6.93 65.42 452 2 0 1.17 60.48 452 3 1 2.46 52.4 452 4 0 2.61 48.37
453 1 0 6.53 59.57 453 2 1 1.41 55.14 453 3 0 2.21 67.82 453 4 0 1.86 73.45
;

proc print data=Commuter(obs=20);
run;
The variable Mode has the value 1 for car alone, 2 for carpool, 3 for bus, and 4 for railway. The variable Choice is the response variable that represents the decision among the four travel modes for each commuter: it has the value 1 for the chosen mode and 0 for the other three. The data for the first five commuters are shown in Output 27.3.1.
Output 27.3.1: Data for the First Five Commuters
Obs | Subject | Mode | Choice | Cost | Time |
---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1.51 | 18.50 |
2 | 1 | 2 | 0 | 2.34 | 26.34 |
3 | 1 | 3 | 0 | 1.80 | 20.87 |
4 | 1 | 4 | 0 | 2.36 | 30.03 |
5 | 2 | 1 | 0 | 6.06 | 31.31 |
6 | 2 | 2 | 0 | 2.90 | 34.26 |
7 | 2 | 3 | 0 | 2.24 | 67.18 |
8 | 2 | 4 | 1 | 1.86 | 60.29 |
9 | 3 | 1 | 1 | 5.79 | 22.55 |
10 | 3 | 2 | 0 | 2.14 | 23.26 |
11 | 3 | 3 | 0 | 2.58 | 63.31 |
12 | 3 | 4 | 0 | 2.75 | 49.17 |
13 | 4 | 1 | 1 | 1.87 | 26.09 |
14 | 4 | 2 | 0 | 2.57 | 29.90 |
15 | 4 | 3 | 0 | 1.90 | 19.75 |
16 | 4 | 4 | 0 | 2.27 | 13.47 |
17 | 5 | 1 | 1 | 2.50 | 4.70 |
18 | 5 | 2 | 0 | 1.72 | 12.41 |
19 | 5 | 3 | 0 | 2.69 | 43.09 |
20 | 5 | 4 | 0 | 2.97 | 39.74 |
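The layout in Output 27.3.1 implies that each commuter contributes one row per travel mode, with exactly one row marked as chosen. The following Python sketch checks that structure on the 20 rows shown above. It works outside SAS, and the function name `check_choice_sets` is hypothetical and introduced only for this illustration.

```python
# First 20 rows of Output 27.3.1 as (Subject, Mode, Choice, Cost, Time).
rows = [
    (1, 1, 1, 1.51, 18.50), (1, 2, 0, 2.34, 26.34),
    (1, 3, 0, 1.80, 20.87), (1, 4, 0, 2.36, 30.03),
    (2, 1, 0, 6.06, 31.31), (2, 2, 0, 2.90, 34.26),
    (2, 3, 0, 2.24, 67.18), (2, 4, 1, 1.86, 60.29),
    (3, 1, 1, 5.79, 22.55), (3, 2, 0, 2.14, 23.26),
    (3, 3, 0, 2.58, 63.31), (3, 4, 0, 2.75, 49.17),
    (4, 1, 1, 1.87, 26.09), (4, 2, 0, 2.57, 29.90),
    (4, 3, 0, 1.90, 19.75), (4, 4, 0, 2.27, 13.47),
    (5, 1, 1, 2.50, 4.70), (5, 2, 0, 1.72, 12.41),
    (5, 3, 0, 2.69, 43.09), (5, 4, 0, 2.97, 39.74),
]


def check_choice_sets(rows, n_alternatives=4):
    """Return the subjects whose choice set is malformed.

    A well-formed choice set lists modes 1..n_alternatives exactly once
    and has exactly one row with Choice equal to 1.
    """
    by_subject = {}
    for subject, mode, choice, _cost, _time in rows:
        by_subject.setdefault(subject, []).append((mode, choice))

    bad = []
    for subject, alternatives in by_subject.items():
        modes = sorted(mode for mode, _ in alternatives)
        n_chosen = sum(choice for _, choice in alternatives)
        if modes != list(range(1, n_alternatives + 1)) or n_chosen != 1:
            bad.append(subject)
    return bad


bad_subjects = check_choice_sets(rows)
print(bad_subjects)  # an empty list means every choice set is well formed
```

A check of this kind is worth running before fitting, because the CHOICESET= option groups rows by Subject and the model assumes one chosen alternative per group.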
The following statements fit a probit model by specifying TYPE=PROBIT. The BCHOICE procedure’s implementation of the Gibbs sampler for the probit model exhibits higher autocorrelation than the sampler for the logit model. The autocorrelation arises because data augmentation introduces latent variables that are strongly dependent on the regression parameters. To reduce the autocorrelation among the retained draws, you can control the thinning rate of the simulation. For example, THIN=10 keeps every 10th sample in the simulation and discards the rest.
proc bchoice data=Commuter outpost=Commupostsamp thin=10 nmc=50000 seed=123;
   class Mode(ref='1') Subject;
   model Choice = Cost Time Mode / choiceset=(Subject) type=probit;
run;
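The effect of combining NMC=50000 with THIN=10 can be sketched in a few lines of Python. This is only an illustration of thinning in general. The helper `thin_chain` is hypothetical, and the assumption that the retained draws are iterations 10, 20, 30, and so on is made for the sketch, not taken from the procedure's documentation.

```python
def thin_chain(draws, thin):
    """Keep every `thin`-th draw, starting with the `thin`-th one."""
    return draws[thin - 1::thin]


nmc, thin = 50000, 10

# Stand-in for a chain: the iteration numbers 1..nmc.
iterations = list(range(1, nmc + 1))
kept = thin_chain(iterations, thin)

print(len(kept))  # 5000
print(kept[:3])   # [10, 20, 30]
```

The 5,000 retained draws agree with the value in the N column of Output 27.3.2.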
Output 27.3.2 shows the summary statistics for the part-worth ($\boldsymbol{\beta}$) of each of the attributes (Cost, Time, Mode 2, Mode 3, and Mode 4) and the covariance of the error difference vector ($\tilde{\boldsymbol{\Sigma}}$), which is displayed by parameters labeled “Sigma 1 1,” “Sigma 2 1,” and so on.
Output 27.3.2: Posterior Summary Statistics
Posterior Summaries and Intervals

Parameter | N | Mean | Standard Deviation | 95% HPD Interval Lower | 95% HPD Interval Upper |
---|---|---|---|---|---|
Cost | 5000 | -0.4641 | 0.0736 | -0.6152 | -0.3299 |
Time | 5000 | -0.0494 | 0.00567 | -0.0603 | -0.0377 |
Mode 2 | 5000 | -3.3847 | 0.6746 | -4.8022 | -2.1983 |
Mode 3 | 5000 | -2.0056 | 0.2853 | -2.5709 | -1.4590 |
Mode 4 | 5000 | -1.6277 | 0.2085 | -2.0489 | -1.2382 |
Sigma 1 1 | 5000 | 1.0000 | 0 | 1.0000 | 1.0000 |
Sigma 2 1 | 5000 | 1.1695 | 0.5505 | 0.0179 | 2.2003 |
Sigma 2 2 | 5000 | 4.3809 | 1.9423 | 1.3685 | 8.3868 |
Sigma 3 1 | 5000 | 0.5700 | 0.2203 | 0.1429 | 1.0094 |
Sigma 3 2 | 5000 | 1.6328 | 0.9052 | -0.00171 | 3.6863 |
Sigma 3 3 | 5000 | 1.3917 | 0.4795 | 0.5699 | 2.3242 |
It is well known that an identification problem exists in probit models, because location and scale transformations do not change the choices that are made. The solution to the location shift is differencing with respect to the last alternative in each choice set. (See the section Probit.) After that, a scale shift problem remains, because the parameters $(c\boldsymbol{\beta}, c^2\tilde{\boldsymbol{\Sigma}})$ for any constant $c > 0$ are equivalent to $(\boldsymbol{\beta}, \tilde{\boldsymbol{\Sigma}})$. A solution to the scaling problem is to normalize the parameters with respect to one of the diagonal elements of the covariance of the error difference vector, $\tilde{\boldsymbol{\Sigma}}$. PROC BCHOICE reports $\boldsymbol{\beta}/\sqrt{\tilde{\sigma}_{11}}$ and $\tilde{\boldsymbol{\Sigma}}/\tilde{\sigma}_{11}$ at each draw, where $\tilde{\sigma}_{11}$ is the first diagonal entry of $\tilde{\boldsymbol{\Sigma}}$. This explains why “Sigma 1 1” is always 1 in Output 27.3.2.
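The scale normalization can be illustrated numerically. Multiplying every utility by a constant c > 0 multiplies the part-worths by c and the covariance of the error differences by c squared, so dividing the part-worths by the square root of the first diagonal entry, and the covariance by that entry itself, removes the arbitrary scale. The following Python sketch uses made-up numbers and a hypothetical helper `normalize_draw`; it is not the procedure's internal code.

```python
import math


def normalize_draw(beta, sigma):
    """Rescale one draw so that the first diagonal entry of sigma is 1."""
    s11 = sigma[0][0]
    scale = math.sqrt(s11)
    beta_n = [b / scale for b in beta]
    sigma_n = [[v / s11 for v in row] for row in sigma]
    return beta_n, sigma_n


# Hypothetical unnormalized draw (illustrative numbers, not from the example).
beta = [-0.9, -0.1]
sigma = [[4.0, 2.0], [2.0, 9.0]]

# Rescaling the utilities by c multiplies beta by c and sigma by c**2.
c = 3.0
beta_c = [c * b for b in beta]
sigma_c = [[c * c * v for v in row] for row in sigma]

# Both versions describe the same choices, and both normalize to the same values.
b1, s1 = normalize_draw(beta, sigma)
b2, s2 = normalize_draw(beta_c, sigma_c)
print(b1, s1)
print(b2, s2)
```

Up to floating-point rounding, the two printed lines agree, and the first diagonal entry of the normalized covariance is 1 in both.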
By the IIA property in a logit model, it is assumed that all alternatives are independent and have the same variance. Therefore, the normalized covariance matrix after differencing with respect to one of the alternatives is of the form

$$
\begin{pmatrix}
1 & 0.5 & 0.5 \\
0.5 & 1 & 0.5 \\
0.5 & 0.5 & 1
\end{pmatrix}
$$
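The form of this matrix follows directly from the assumption of independent, equal-variance errors: each error difference has twice the common variance, and any two differences share the error of the reference alternative, so their covariance equals the common variance. The following Python sketch verifies this for four alternatives. The helper names are hypothetical, and the last alternative is used as the reference to match the differencing described earlier.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(col) for col in zip(*a)]


def iia_normalized_cov(n_alt, variance=1.0):
    """Normalized covariance of error differences under independent,
    equal-variance errors, differenced against the last alternative."""
    # Sigma = variance * I for the undifferenced errors.
    sigma = [
        [variance if i == j else 0.0 for j in range(n_alt)]
        for i in range(n_alt)
    ]
    # Row i of D forms eps_i - eps_last.
    d = [
        [(1.0 if j == i else 0.0) - (1.0 if j == n_alt - 1 else 0.0)
         for j in range(n_alt)]
        for i in range(n_alt - 1)
    ]
    diff_cov = matmul(matmul(d, sigma), transpose(d))
    s11 = diff_cov[0][0]
    return [[v / s11 for v in row] for row in diff_cov]


print(iia_normalized_cov(4))
# [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
```

The result does not depend on the common variance, because the normalization divides it out.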
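This matrix is quite different from the estimated normalized covariance matrix for this data set. For example, the posterior mean of “Sigma 2 2” in Output 27.3.2 is 4.38, whereas independent errors with equal variance would require every diagonal element of the normalized matrix to equal 1. Fitting a standard logit model would therefore be inappropriate.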
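As an informal, element-by-element illustration of this comparison, the following Python sketch sets the posterior means and 95% HPD intervals from Output 27.3.2 against the values implied by independent, equal-variance errors (1 on the diagonal and 0.5 off the diagonal of the normalized matrix). It is a descriptive check under those assumptions, not a formal joint test, and the helper `iia_value` is hypothetical.

```python
# (posterior mean, HPD lower, HPD upper) copied from Output 27.3.2.
posterior = {
    "Sigma 2 1": (1.1695, 0.0179, 2.2003),
    "Sigma 2 2": (4.3809, 1.3685, 8.3868),
    "Sigma 3 1": (0.5700, 0.1429, 1.0094),
    "Sigma 3 2": (1.6328, -0.00171, 3.6863),
    "Sigma 3 3": (1.3917, 0.5699, 2.3242),
}


def iia_value(name):
    """Value implied by independent, equal-variance errors:
    1 on the diagonal and 0.5 off the diagonal."""
    _, row, col = name.split()
    return 1.0 if row == col else 0.5


outside = []
for name, (mean, lower, upper) in posterior.items():
    target = iia_value(name)
    if not (lower <= target <= upper):
        outside.append(name)
    print(f"{name}: mean={mean:.4f}, IIA value={target}, gap={mean - target:+.4f}")

print(outside)  # ['Sigma 2 2']
```

In this element-wise view, the clearest departure is the variance “Sigma 2 2,” whose HPD interval lies entirely above 1.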