A discrete choice experiment is constructed that consists of four product brands, each available at three different prices: $1.49, $1.99, and $2.49. In addition, each choice set contains a constant “other” alternative available at $1.49. In the fifth choice set, price is constant. PROC TRANSREG is used to code the design, and the PHREG procedure fits the multinomial logit choice model (not shown). See Kuhfeld (2010) for more information about discrete choice modeling and the multinomial logit model; look for the latest “Discrete Choice” report. The following statements produce Figure 101.76:
title 'Choice Model Coding';

data design;
   array p[4];
   input p1-p4 @@;
   set = _n_;
   do brand = 1 to 4;
      price = p[brand];
      output;
   end;
   brand = .; price = 1.49; output; /* constant alternative */
   keep set brand price;
   datalines;
1.49 1.99 1.49 1.99 1.99 1.99 2.49 1.49 1.99 1.49 1.99 1.49
1.99 1.49 2.49 1.99 1.49 1.49 1.49 1.49 2.49 1.49 1.99 2.49
1.49 1.49 2.49 2.49 2.49 2.49 1.49 1.49 1.49 2.49 2.49 1.99
2.49 2.49 2.49 1.49 1.99 2.49 1.49 2.49 2.49 1.99 2.49 2.49
2.49 1.49 1.49 1.99 1.49 1.99 1.99 1.49 2.49 1.99 1.99 1.99
1.99 1.99 1.49 2.49 1.99 2.49 1.99 1.99 1.49 2.49 1.99 2.49
;
proc transreg data=design design norestoremissing nozeroconstant;
   model class(brand / zero=none) identity(price);
   output out=coded;
   by set;
run;

proc print data=coded(firstobs=21 obs=25);
   var set brand &_trgind;
run;
In the interest of space, only the fifth choice set is displayed in Figure 101.76.
Figure 101.76: The Fifth Choice Set
Choice Model Coding
Obs | set | brand | brand1 | brand2 | brand3 | brand4 | price |
---|---|---|---|---|---|---|---|
21 | 5 | 1 | 1 | 0 | 0 | 0 | 1.49 |
22 | 5 | 2 | 0 | 1 | 0 | 0 | 1.49 |
23 | 5 | 3 | 0 | 0 | 1 | 0 | 1.49 |
24 | 5 | 4 | 0 | 0 | 0 | 1 | 1.49 |
25 | 5 | . | 0 | 0 | 0 | 0 | 1.49 |
For the constant alternative (Brand = .), the brand coding is a row of zeros due to the NORESTOREMISSING o-option, and Price is a constant $1.49 (instead of 0) due to the NOZEROCONSTANT a-option.
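One way to see what the two options contribute is to code the same design without them and print the same five observations. The following statements are a minimal sketch for comparison only; the output data set name `coded_default` is an assumption made for this illustration. With the default behavior, the brand coding for the constant alternative is expected to be a row of missing values rather than zeros, and Price in the fifth choice set is expected to be 0 because it is constant within that BY group.

```sas
/* Comparison sketch: same model, but without NORESTOREMISSING */
/* and NOZEROCONSTANT. CODED_DEFAULT is a hypothetical name.   */
proc transreg data=design design;
   model class(brand / zero=none) identity(price);
   output out=coded_default;
   by set;
run;

/* Print the fifth choice set (observations 21 through 25) */
proc print data=coded_default(firstobs=21 obs=25);
   var set brand &_trgind;
run;
```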
The data set was coded by choice set (BY set;). This is a small problem. With very large problems, it might be necessary to restrict the number of observations that are coded at one time so that the procedure uses less time and memory. Coding by choice set is one option. When coding is performed after the data are merged in, coding by subject and choice set combinations is another option. Alternatively, you can specify DESIGN=n, where n is the number of observations to code at one time. For example, you can specify DESIGN=100 or DESIGN=1000 to process the data set in blocks of 100 or 1000 observations. Specify the NOZEROCONSTANT a-option to ensure that constant variables within blocks are not zeroed. When you specify DESIGN=n, or perform coding after the data are merged in, specify the dependent variable and any other variables needed for analysis as ID variables.
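The block-wise alternative might look like the following sketch. It assumes a hypothetical data set `merged` in which the design has already been combined with the response data, a hypothetical subject identifier `subj`, and a hypothetical dependent variable `c`; none of these names appear in this example, so substitute your own. The BY statement is dropped, DESIGN=100 codes 100 observations at a time, and the ID statement carries the extra variables into the coded data set.

```sas
/* Sketch only: MERGED, SUBJ, and C are hypothetical names.    */
/* DESIGN=100 codes the data in blocks of 100 observations;    */
/* NOZEROCONSTANT keeps variables that are constant within a   */
/* block from being zeroed.                                    */
proc transreg data=merged design=100 norestoremissing nozeroconstant;
   model class(brand / zero=none) identity(price);
   id subj set c;             /* variables needed for analysis */
   output out=coded;
run;
```

Because the block size is unrelated to choice-set boundaries, a variable such as Price can be constant within a block by chance, which is why NOZEROCONSTANT is kept in this sketch.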