Discrete choice logit models fall in the framework of a generalized linear model (GLM) with a logit link. The Metropolis-Hastings sampling approach of Gamerman (1997) is well suited to this type of model.
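In the GLM setting, the data $y_i$, $i = 1, \ldots, n$, are assumed to be independent with exponential family density
\[ f(y_i \mid \theta_i) = \exp\left\{ \frac{y_i \theta_i - b(\theta_i)}{\phi_i} \right\} c(y_i, \phi_i) \]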
The means $\mu_i = E(y_i)$ are related to the canonical parameters $\theta_i$ via $\mu_i = b'(\theta_i)$ and to the regression coefficients $\beta$ via the link function
\[ g(\mu_i) = \eta_i = x_i' \beta \]
The maximum likelihood (ML) estimator in a GLM and its asymptotic variance are obtained by iterative application of weighted least squares (IWLS) to transformed observations. Following McCullagh and Nelder (1989), define the transformed response as
\[ \tilde{y}_i(\beta) = \eta_i + (y_i - \mu_i)\, g'(\mu_i) \]
and define the corresponding weights as
\[ W_i^{-1}(\beta) = b''(\theta_i) \left\{ g'(\mu_i) \right\}^2 \]
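As an illustration, the transformed response and weights can be sketched for the simplest special case, a Bernoulli response with a logit link. This is an assumption made for brevity, not the multinomial choice setting itself. In that case $b''(\theta_i) = \mu_i(1 - \mu_i)$ and $g'(\mu_i) = 1 / \{\mu_i(1 - \mu_i)\}$, so the weight reduces to $\mu_i(1 - \mu_i)$. The function name `iwls_working_quantities` is hypothetical.

```python
import numpy as np

def iwls_working_quantities(X, y, beta):
    """Transformed response and IWLS weights for a Bernoulli logit model (illustrative)."""
    eta = X @ beta                      # linear predictor eta_i = x_i' beta
    mu = 1.0 / (1.0 + np.exp(-eta))     # inverse logit link
    var = mu * (1.0 - mu)               # b''(theta_i) for the Bernoulli family
    y_tilde = eta + (y - mu) / var      # eta_i + (y_i - mu_i) * g'(mu_i), with g'(mu) = 1 / var
    w = var                             # 1 / (b''(theta_i) * g'(mu_i)^2) simplifies to var
    return y_tilde, w

# Small made-up design matrix and response, evaluated at beta = 0.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 1.0])
y_tilde, w = iwls_working_quantities(X, y, np.zeros(2))
```

At $\beta = 0$ every fitted mean is 0.5, so all weights equal 0.25 and the transformed response is $(y_i - 0.5) / 0.25$.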
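Suppose a normal prior is specified on $\beta$, $\beta \sim N(a, R)$. The posterior density is as follows:
\[ \pi(\beta \mid y) \propto \exp\left\{ -\frac{1}{2} (\beta - a)' R^{-1} (\beta - a) + \sum_{i=1}^{n} \frac{y_i \theta_i - b(\theta_i)}{\phi_i} \right\} \]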
Gamerman (1997) proposes that Metropolis-Hastings sampling be combined with iterative weighted least squares as follows:
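1. Start with $\beta^{(0)}$ and $t = 1$.
2. Sample $\beta^{*}$ from the proposal density $N(m^{(t)}, C^{(t)})$, where
\[ m^{(t)} = \left\{ R^{-1} + X' W(\beta^{(t-1)}) X \right\}^{-1} \left\{ R^{-1} a + X' W(\beta^{(t-1)})\, \tilde{y}(\beta^{(t-1)}) \right\} \]
\[ C^{(t)} = \left\{ R^{-1} + X' W(\beta^{(t-1)}) X \right\}^{-1} \]
3. Accept $\beta^{*}$ with probability
\[ \alpha(\beta^{(t-1)}, \beta^{*}) = \min\left[ 1, \frac{\pi(\beta^{*} \mid y)\, q(\beta^{*}, \beta^{(t-1)})}{\pi(\beta^{(t-1)} \mid y)\, q(\beta^{(t-1)}, \beta^{*})} \right] \]
where $\pi(\beta \mid y)$ is the posterior density and $q(\beta^{*}, \beta^{(t-1)})$ and $q(\beta^{(t-1)}, \beta^{*})$ are the transition probabilities that are based on the proposal density $N(m^{(\cdot)}, C^{(\cdot)})$. More specifically, $q(\beta^{(t-1)}, \beta^{*})$ is an $N(m^{(t)}, C^{(t)})$ density that is evaluated at $\beta^{*}$, whereas $q(\beta^{*}, \beta^{(t-1)})$ is an $N(m^{*}, C^{*})$ density that is evaluated at $\beta^{(t-1)}$, where $m^{*}$ and $C^{*}$ have the same expression as $m^{(t)}$ and $C^{(t)}$ but depend on $\beta^{*}$ instead of $\beta^{(t-1)}$. If $\beta^{*}$ is not accepted, the chain stays with $\beta^{(t-1)}$.
4. Set $t = t + 1$ and return to step 2.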
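The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation. It assumes a Bernoulli logit model (not the full multinomial choice model), a $N(a, R)$ prior on the coefficients, and simulated data. All function names (`log_posterior`, `proposal_moments`, `log_normal_density`, `gamerman_sampler`) are hypothetical.

```python
import numpy as np

def log_posterior(beta, X, y, a, R_inv):
    """Log posterior, up to a constant, for a Bernoulli logit model with a N(a, R) prior."""
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))  # y*theta - b(theta), b(theta) = log(1 + e^theta)
    d = beta - a
    return loglik - 0.5 * d @ R_inv @ d

def proposal_moments(beta, X, y, a, R_inv):
    """Mean m and covariance C of the IWLS-based normal proposal built at beta."""
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = mu * (1.0 - mu)                       # IWLS weights for the logit link
    C = np.linalg.inv(R_inv + (X.T * w) @ X)  # {R^-1 + X'WX}^-1
    C = 0.5 * (C + C.T)                       # remove round-off asymmetry
    # X'W y_tilde with y_tilde = eta + (y - mu) / w, written without the division
    m = C @ (R_inv @ a + X.T @ (w * eta + y - mu))
    return m, C

def log_normal_density(x, m, C):
    """Log N(m, C) density at x; the 2*pi constant is dropped because it cancels in the ratio."""
    d = x - m
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (logdet + d @ np.linalg.solve(C, d))

def gamerman_sampler(X, y, a, R, n_iter, rng):
    R_inv = np.linalg.inv(R)
    beta = np.zeros(X.shape[1])               # starting value beta^(0)
    draws = np.empty((n_iter, X.shape[1]))
    n_accept = 0
    for t in range(n_iter):
        m, C = proposal_moments(beta, X, y, a, R_inv)
        beta_star = rng.multivariate_normal(m, C)
        m_star, C_star = proposal_moments(beta_star, X, y, a, R_inv)
        # log of pi(beta*) q(beta*, beta) / {pi(beta) q(beta, beta*)}
        log_alpha = (log_posterior(beta_star, X, y, a, R_inv)
                     + log_normal_density(beta, m_star, C_star)
                     - log_posterior(beta, X, y, a, R_inv)
                     - log_normal_density(beta_star, m, C))
        if np.log(rng.uniform()) < log_alpha:
            beta, n_accept = beta_star, n_accept + 1
        draws[t] = beta                       # on rejection the chain repeats the old value
    return draws, n_accept / n_iter

# Simulated data; the "true" coefficients are made up for the illustration.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta_true = np.array([0.5, -1.0])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)
draws, accept_rate = gamerman_sampler(X, y, a=np.zeros(2), R=10.0 * np.eye(2), n_iter=500, rng=rng)
```

Working on the log scale keeps the acceptance ratio numerically stable. Because the proposal is rebuilt from the current state at every iteration, both the forward and reverse proposal densities must be evaluated.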
You can extend this methodology to logit models that have random effects. If there are random effects, the link function is extended to
\[ g(\mu_i) = x_i' \beta + z_i' \gamma_i \]
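where the random effects are assumed to have a normal distribution, $\gamma_i \sim N(0, \Omega_\gamma)$, $i = 1, \ldots, I$, and $\Omega_\gamma \sim \text{inverse Wishart}(\nu_0, V_0)$. With the normal prior $\beta \sim N(a, R)$ retained, the posterior density is
\[ \pi(\beta, \gamma_1, \ldots, \gamma_I, \Omega_\gamma \mid y) \propto \exp\left\{ \sum_{i} \frac{y_i \theta_i - b(\theta_i)}{\phi_i} \right\} \exp\left\{ -\frac{1}{2} (\beta - a)' R^{-1} (\beta - a) \right\} \left| \Omega_\gamma \right|^{-I/2} \exp\left\{ -\frac{1}{2} \sum_{i=1}^{I} \gamma_i' \Omega_\gamma^{-1} \gamma_i \right\} \pi(\Omega_\gamma) \]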
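The parameters are divided into blocks, $\beta$, $\gamma_1, \ldots, \gamma_I$, and $\Omega_\gamma$. For the fixed-effects block $\beta$, the conditional posterior has the same form as in the model without random effects, but the link changes to include $z_i' \gamma_i$, which are taken as known constants (offsets) at each iteration. The only change that is needed is to replace the transformed response $\tilde{y}(\beta^{(t-1)})$ with $\tilde{y}(\beta^{(t-1)}) - Z\gamma$ in step 2 of the previous Gamerman procedure.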
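For the random-effects block, the same Metropolis-Hastings sampling with the least squares proposal applies. The conditional posterior is
\[ \pi(\gamma_i \mid \beta, \Omega_\gamma, y) \propto \exp\left\{ \frac{y_i \theta_i - b(\theta_i)}{\phi_i} - \frac{1}{2} \gamma_i' \Omega_\gamma^{-1} \gamma_i \right\} \]
where the likelihood term is taken over the observations that depend on $\gamma_i$.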
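The transformed response is now $\tilde{y}(\gamma_i^{(t-1)}) - X_i \beta$, and the proposal density is $N(m_i^{(t)}, C_i^{(t)})$, where
\[ m_i^{(t)} = \left\{ \Omega_\gamma^{-1} + Z_i' W(\gamma_i^{(t-1)}) Z_i \right\}^{-1} Z_i' W(\gamma_i^{(t-1)}) \left\{ \tilde{y}(\gamma_i^{(t-1)}) - X_i \beta \right\} \]
\[ C_i^{(t)} = \left\{ \Omega_\gamma^{-1} + Z_i' W(\gamma_i^{(t-1)}) Z_i \right\}^{-1} \]
Here $X_i$ and $Z_i$ contain the rows of the fixed-effects and random-effects design matrices for the observations that depend on $\gamma_i$.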
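The proposal moments for a single random-effects vector can be sketched as follows, again under the simplifying assumption of a Bernoulli logit model. The fixed-effects part of the linear predictor enters as a known offset. The function name `random_effect_proposal` and its argument names are hypothetical.

```python
import numpy as np

def random_effect_proposal(gamma_i, beta, X_i, Z_i, y_i, Omega_inv):
    """Mean and covariance of the least squares proposal for one random-effects vector."""
    offset = X_i @ beta                        # fixed-effects part, held constant in this block
    eta = offset + Z_i @ gamma_i
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = mu * (1.0 - mu)                        # IWLS weights for the logit link
    C = np.linalg.inv(Omega_inv + (Z_i.T * w) @ Z_i)
    # Z'W (y_tilde - X beta) with y_tilde = eta + (y - mu) / w, written without the division
    m = C @ (Z_i.T @ (w * (Z_i @ gamma_i) + y_i - mu))
    return m, C

# One unit with four binary responses and a random intercept (made-up values).
X_i = np.ones((4, 1))
Z_i = np.ones((4, 1))
y_i = np.array([1.0, 1.0, 0.0, 1.0])
m_i, C_i = random_effect_proposal(np.zeros(1), np.zeros(1), X_i, Z_i, y_i, np.eye(1))
```

In this toy case all weights equal 0.25, so the proposal variance is $1 / (1 + 4 \times 0.25) = 0.5$ and the proposal mean is $0.5 \times \sum_j (y_{ij} - 0.5) = 0.5$.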
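Finally, for the covariance matrix block $\Omega_\gamma$, direct sampling from an inverse Wishart$(\nu_0 + I, V_0 + S)$ distribution is used, where $S = \sum_{i=1}^{I} \gamma_i \gamma_i'$, $\nu_0$ and $V_0$ are the prior degrees of freedom and scale matrix, and $I$ is the number of random-effects vectors.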
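A direct draw of the covariance matrix might look like the sketch below. It assumes an inverse Wishart prior with integer degrees of freedom `nu0` and scale matrix `V0` (names chosen for the illustration). It uses only NumPy by inverting a Wishart draw that is built from outer products of normal vectors. Both function names are hypothetical.

```python
import numpy as np

def sample_inverse_wishart(df, scale, rng):
    """Draw from inverse Wishart(df, scale); assumes integer df >= dimension of scale."""
    p = scale.shape[0]
    L = np.linalg.cholesky(np.linalg.inv(scale))
    Z = rng.standard_normal((df, p)) @ L.T   # rows are N(0, scale^-1) draws
    W = Z.T @ Z                              # W ~ Wishart(df, scale^-1)
    return np.linalg.inv(W)                  # W^-1 ~ inverse Wishart(df, scale)

def sample_covariance(gammas, nu0, V0, rng):
    """Conditional draw of the random-effects covariance given the current gamma_i (rows of gammas)."""
    n_units = gammas.shape[0]
    S = gammas.T @ gammas                    # sum over i of gamma_i gamma_i'
    return sample_inverse_wishart(nu0 + n_units, V0 + S, rng)

# Made-up current values of 50 two-dimensional random effects.
rng = np.random.default_rng(1)
gammas = rng.standard_normal((50, 2))
Omega = sample_covariance(gammas, nu0=4, V0=np.eye(2), rng=rng)
```

Because this conditional distribution is available in closed form, no accept-reject step is needed for the covariance block.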
The chain is initialized with the random effects set to 0 and the covariance set to the identity matrix. Updating is done first for the fixed effects, $\beta$, as a block to position the chain in the correct region of the parameter space. Then the random effects are updated, and finally the covariance of the random effects is updated. For more information about this algorithm, see Gamerman (1997).