Example 18.3 Correlated Choice Modeling

It is often unrealistic to assume that the random components of utility are independent across all choices. This example shows how to handle correlated random components by using multinomial probit and nested logit models.

To analyze correlated data, trinomial choice data (1,000 observations) are generated with a pseudo-random number generator by the DATA step that follows the model specification. The random utility function is

\[  U_{ij} = V_{ij} + \epsilon _{ij},\; \; j=1,2,3  \]

where

\[  \epsilon _{ij}\sim N\left(0,\left[ \begin{array}{ccc} 2 &  .6 &  0 \\ .6 &  1 &  0 \\ 0 &  0 &  1 \\ \end{array} \right] \right)  \]
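
Correlated errors of this kind can be obtained from independent standard normal draws $z_i$ as $\epsilon_i = L z_i$, where $L$ is a lower triangular matrix with $LL' = \Sigma$. As a worked check (not part of the original example), the Cholesky factor of the covariance matrix above is

\[  L = \left[ \begin{array}{ccc} \sqrt{2} &  0 &  0 \\ 0.6/\sqrt{2} &  \sqrt{1-0.18} &  0 \\ 0 &  0 &  1 \\ \end{array} \right] \approx \left[ \begin{array}{ccc} 1.4142 &  0 &  0 \\ 0.4243 &  0.9055 &  0 \\ 0 &  0 &  1 \\ \end{array} \right]  \]

The DATA step that follows stores the lower triangle of its factor row by row in the array lm as (1.4142136, 0.4242641, 1, 0, 0, 1), that is, with 1 rather than 0.9055 as the second diagonal element. Taken literally, that factor implies $\mathrm{Var}(\epsilon_{i2}) = 0.4242641^2 + 1^2 = 1.18$ and $\rho_{12} = 0.6/\sqrt{(2)(1.18)} \approx 0.39$, so the simulated errors appear to follow a covariance matrix slightly different from the one stated.
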
/*-- generate simulated series --*/
%let ndim = 3;
%let nobs = 1000;

data trichoice;
   array error{&ndim} e1-e3;
   array vtemp{&ndim} _temporary_;
   /* lower triangle of the factor L, stored row by row: L11, L21 L22, L31 L32 L33 */
   array lm{6} _temporary_ (1.4142136 0.4242641 1 0 0 1);
   retain nseed 345678;

   do id = 1 to &nobs;
      /* generate independent standard normal variates */
      do i = 1 to &ndim;
         vtemp{i} = rannor(nseed);
      end;
      /* get multivariate normal variate: error = L * vtemp */
      /* index is the offset of row i within lm */
      index = 0;
      do i = 1 to &ndim;
         error{i} = 0;
         do j = 1 to i;
            error{i} = error{i} + lm{index+j}*vtemp{j};
         end;
         index = index + i;
      end;
      x1 = 1.0 + 2.0 * ranuni(nseed);
      x2 = 1.2 + 2.0 * ranuni(nseed);
      x3 = 1.5 + 1.2 * ranuni(nseed);
      util1 = 2.0 * x1 + e1;
      util2 = 2.0 * x2 + e2;
      util3 = 2.0 * x3 + e3;
      do i = 1 to &ndim;
         vtemp{i} = 0;
      end;
      if ( util1 > util2 & util1 > util3 ) then
         vtemp{1} = 1;
      else if ( util2 > util1 & util2 > util3 ) then
         vtemp{2} = 1;
      else if ( util3 > util1 & util3 > util2 ) then
         vtemp{3} = 1;
      else continue;   /* skip the observation if utilities tie */
      /*-- first choice --*/
      x = x1;
      mode = 1;
      decision = vtemp{1};
      output;
      /*-- second choice --*/
      x = x2;
      mode = 2;
      decision = vtemp{2};
      output;
      /*-- third choice --*/
      x = x3;
      mode = 3;
      decision = vtemp{3};
      output;
   end;
run;
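
As a quick sanity check on the simulated data (not part of the original example), the following sketch prints the sample covariance matrix of the errors and tabulates how often each mode is chosen. It assumes only the trichoice data set and the variables e1-e3, mode, and decision created by the preceding DATA step. Because each individual contributes three records with identical errors, the WHERE= data set option keeps one record per individual. The sample covariance matrix can then be compared with the covariance matrix stated above.

/*-- sanity checks on the simulated data (illustrative) --*/
proc corr data=trichoice(where=(mode=1)) cov nosimple;
   var e1 e2 e3;
run;

proc freq data=trichoice(where=(decision=1));
   tables mode;
run;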

First, the multinomial probit model is estimated by using the following statements. The standard deviation, correlation, and slope estimates in Output 18.3.1 are close to the parameter values used in the simulation. Note that $\rho _{12} = \frac{\sigma _{12}}{\sqrt {(\sigma _1^2)(\sigma _2^2)}} = \frac{0.6}{\sqrt {(2)(1)}}=0.42$, $\sigma _1 = \sqrt {2}=1.41$, $\sigma _2 = \sqrt {1}=1$, and the parameter value for the variable x is 2.0.

/*-- Trinomial Probit --*/
proc mdc data=trichoice randnum=halton nsimul=100;
   model decision = x /
            type=mprobit
            choice=(mode 1 2 3)
            covest=op
            optmethod=qn;
   id id;
run;

Output 18.3.1: Trinomial Probit Model Estimation

The MDC Procedure
 
Multinomial Probit Estimates

Parameter Estimates
Parameter    DF  Estimate  Standard Error  t Value  Approx Pr > |t|
x             1    1.7685          0.1191    14.85           <.0001
STD_1         1    1.2514          0.1494     8.38           <.0001
RHO_21        1    0.3971          0.1087     3.65           0.0003
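
To make "close" concrete, approximate 95% intervals can be formed as estimate $\pm$ 1.96 standard errors. This is a rough worked check that relies on the asymptotic normal approximation and is not part of the procedure's output:

\[  \begin{array}{lcl} \mbox{x:} &  1.7685 \pm 1.96(0.1191) &  = (1.535,\; 2.002) \\ \mbox{STD\_1:} &  1.2514 \pm 1.96(0.1494) &  = (0.959,\; 1.544) \\ \mbox{RHO\_21:} &  0.3971 \pm 1.96(0.1087) &  = (0.184,\; 0.610) \\ \end{array}  \]

Under this approximation, each interval contains the corresponding simulation value (2.0, 1.41, and 0.42), although the slope value 2.0 lies very near the upper limit of its interval.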


Output 18.3.2 shows a two-level decision tree in which alternatives 1 and 2 are grouped in one nest and alternative 3 forms a second nest.

Output 18.3.2: Nested Tree Structure


The following statements estimate the nested model shown in Output 18.3.2:

/*-- Two-Level Nested Logit --*/
proc mdc data=trichoice;
   model decision = x /
            type=nlogit
            choice=(mode 1 2 3)
            covest=op
            optmethod=qn;
   id id;
   utility u(1,) = x;
   nest level(1) = (1 2 @ 1, 3 @ 2),
        level(2) = (1 2 @ 1);
run;

The estimation results (see Output 18.3.3) indicate that the data support the nested tree model: the estimates of both inclusive value parameters are significantly different from zero and lie between 0 and 1.
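
For reference, a two-level nested logit model of this kind is commonly written as follows. This is the standard textbook form, given here as a sketch; the exact parameterization that PROC MDC uses should be confirmed against the procedure's documentation. With nests $C_1=\{ 1,2\} $ and $C_2=\{ 3\} $, slope $\beta $, and inclusive value parameters $\tau _ k$, the probability that individual $i$ chooses alternative $j$ in nest $k$ is

\[  P_ i(j) = P_ i(j|k)\, P_ i(k),\; \;  P_ i(j|k) = \frac{\exp (\beta x_{ij})}{\sum _{m\in C_ k}\exp (\beta x_{im})},\; \;  P_ i(k) = \frac{\exp (\tau _ k I_{ik})}{\sum _{l}\exp (\tau _ l I_{il})}  \]

where the inclusive value is

\[  I_{ik} = \ln \sum _{m\in C_ k}\exp (\beta x_{im})  \]

When every $\tau _ k$ equals 1, these expressions reduce to the conditional logit model with independent errors, so the distance of the estimates from 1 is what measures the nesting effect. The t values in Output 18.3.3 test each parameter against 0. A rough check against 1 gives $(1-0.8103)/0.0859 \approx 2.21$ and $(1-0.8189)/0.0955 \approx 1.90$.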

Output 18.3.3: Two-Level Nested Logit

The MDC Procedure
 
Nested Logit Estimates

Parameter Estimates
Parameter    DF  Estimate  Standard Error  t Value  Approx Pr > |t|
x_L1          1    2.5907          0.1958    13.23           <.0001
INC_L2G1C1    1    0.8103          0.0859     9.43           <.0001
INC_L2G1C2    1    0.8189          0.0955     8.57           <.0001