In this example, data are prepared for use by the MDCDATA statement. Sometimes, choice-specific information is stored in multiple variables. Since the MDC procedure requires multiple observations for each decision maker, you need to arrange the data so that there is an observation for each subject-alternative (individual-choice) combination. Simple binary choice data are obtained from Ben-Akiva and Lerman (1985). The following statements create the SAS data set:
    data travel;
       length mode $ 8;
       input auto transit mode $;
       datalines;
    52.9 4.4 Transit
    4.1 28.5 Transit
    4.1 86.9 Auto
    56.2 31.6 Transit
    51.8 20.2 Transit
    0.2 91.2 Auto
    27.6 79.7 Auto
    89.9 2.2 Transit
    41.5 24.5 Transit
    95.0 43.5 Transit
    99.1 8.4 Transit
    ... more lines ...
The travel time is stored in two variables, `auto` and `transit`. In addition, the chosen alternatives are stored in a character variable, `mode`. The choice variable, `mode`, is converted to a numeric variable, `decision`, since the MDC procedure supports only numeric variables. The following statements convert the original data set, `travel`, and estimate the binary logit model. The first 10 observations of a relevant subset of the new data set and the parameter estimates are displayed in Output 18.2.1 and Output 18.2.2, respectively.
    data new;
       set travel;
       retain id 0;
       id+1;
       /*-- create auto variable --*/
       decision = (upcase(mode) = 'AUTO');
       ttime = auto;
       autodum = 1;
       trandum = 0;
       output;
       /*-- create transit variable --*/
       decision = (upcase(mode) = 'TRANSIT');
       ttime = transit;
       autodum = 0;
       trandum = 1;
       output;
    run;

    proc print data=new(obs=10);
       var decision autodum trandum ttime;
       id id;
    run;
Output 18.2.1: Converted Data
id | decision | autodum | trandum | ttime |
---|---|---|---|---|
1 | 0 | 1 | 0 | 52.9 |
1 | 1 | 0 | 1 | 4.4 |
2 | 0 | 1 | 0 | 4.1 |
2 | 1 | 0 | 1 | 28.5 |
3 | 1 | 1 | 0 | 4.1 |
3 | 0 | 0 | 1 | 86.9 |
4 | 0 | 1 | 0 | 56.2 |
4 | 1 | 0 | 1 | 31.6 |
5 | 0 | 1 | 0 | 51.8 |
5 | 1 | 0 | 1 | 20.2 |
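The layout in Output 18.2.1 implies that every decision maker contributes exactly two rows, with exactly one of them marked as chosen. A quick check along the following lines could confirm this before estimation. It is only a sketch: the column names `n_alt`, `n_chosen`, and `n_bad` are hypothetical names introduced here, and the query assumes the data set `new` created above.

```sas
/* Sketch: count decision makers whose rows do not form a valid  */
/* choice set. n_alt, n_chosen, and n_bad are hypothetical names. */
proc sql;
   select count(*) as n_bad
      from (select id,
                   count(*)      as n_alt,     /* rows per decision maker */
                   sum(decision) as n_chosen   /* chosen alternatives     */
               from new
               group by id)
      where n_alt ne 2 or n_chosen ne 1;
quit;
```

A value of 0 for `n_bad` would indicate that every `id` has two alternatives and one chosen alternative; a nonzero value would point, for example, to a `mode` value that is neither `Auto` nor `Transit`.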
The following statements perform the binary logit estimation:
    proc mdc data=new;
       model decision = autodum ttime / type=clogit nchoice=2;
       id id;
    run;
Output 18.2.2: Binary Logit Estimation of Modal Choice Data
Parameter Estimates

Parameter | DF | Estimate | Standard Error | t Value | Approx Pr > \|t\| |
---|---|---|---|---|---|
autodum | 1 | -0.2376 | 0.7505 | -0.32 | 0.7516 |
ttime | 1 | -0.0531 | 0.0206 | -2.57 | 0.0101 |
To handle more general cases, you can use the MDCDATA statement, which generates choice-specific dummy variables and creates multiple observations for each individual. The following example converts the original data set `travel` by using the MDCDATA statement and performs conditional logit analysis. Interleaved data are output into the new data set `new3`. This data set has twice as many observations as the original `travel` data set.
    proc mdc data=travel;
       mdcdata varlist( x1 = (auto transit) ) select=mode id=id alt=alternative
               decvar=Decision / out=new3;
       model decision = auto x1 / nchoice=2 type=clogit;
       id id;
    run;
The first nine observations of the modified data set are shown in Output 18.2.3. The result of the preceding program is listed in Output 18.2.4.
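The source does not show the statement used to list the transformed data. A listing of this kind could be produced with a PROC PRINT step such as the following, offered as a minimal sketch that assumes the `new3` data set written by the OUT= option above.

```sas
/* Sketch: list the first nine rows of the interleaved data set */
proc print data=new3(obs=9);
run;
```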
Output 18.2.3: Transformed Model Choice Data
Obs | MODE | AUTO | TRANSIT | X1 | ID | ALTERNATIVE | DECISION |
---|---|---|---|---|---|---|---|
1 | TRANSIT | 1 | 0 | 52.9 | 1 | 1 | 0 |
2 | TRANSIT | 0 | 1 | 4.4 | 1 | 2 | 1 |
3 | TRANSIT | 1 | 0 | 4.1 | 2 | 1 | 0 |
4 | TRANSIT | 0 | 1 | 28.5 | 2 | 2 | 1 |
5 | AUTO | 1 | 0 | 4.1 | 3 | 1 | 1 |
6 | AUTO | 0 | 1 | 86.9 | 3 | 2 | 0 |
7 | TRANSIT | 1 | 0 | 56.2 | 4 | 1 | 0 |
8 | TRANSIT | 0 | 1 | 31.6 | 4 | 2 | 1 |
9 | TRANSIT | 1 | 0 | 51.8 | 5 | 1 | 0 |
Output 18.2.4: Results Using MDCDATA Statement
Parameter Estimates

Parameter | DF | Estimate | Standard Error | t Value | Approx Pr > \|t\| |
---|---|---|---|---|---|
AUTO | 1 | -0.2376 | 0.7505 | -0.32 | 0.7516 |
X1 | 1 | -0.0531 | 0.0206 | -2.57 | 0.0101 |
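The estimates in Output 18.2.4 match those in Output 18.2.2, since both steps fit the same conditional logit model to the same restructured data. As an illustration of how the estimates might be used, the following sketch computes fitted choice probabilities directly from the original `travel` data set with the standard conditional logit formula. The estimates are the rounded values copied by hand from the table, and the data set name `pred` and the variables `b_auto`, `b_ttime`, `v_auto`, `v_transit`, `p_auto`, and `p_transit` are hypothetical names introduced here.

```sas
data pred;                          /* pred is a hypothetical data set name */
   set travel;
   /* rounded estimates copied from Output 18.2.4 */
   b_auto  = -0.2376;               /* coefficient on the auto dummy */
   b_ttime = -0.0531;               /* coefficient on travel time    */
   /* systematic utility of each alternative */
   v_auto    = b_auto + b_ttime*auto;
   v_transit = b_ttime*transit;
   /* conditional logit choice probabilities */
   p_auto    = exp(v_auto) / (exp(v_auto) + exp(v_transit));
   p_transit = 1 - p_auto;
run;

proc print data=pred(obs=5);
   var auto transit mode p_auto p_transit;
run;
```

Because the coefficients are rounded to four decimal places, the probabilities are approximate.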