The Poisson regression model can be generalized by introducing an unobserved heterogeneity term for observation $i$. Thus, the individuals are assumed to differ randomly in a manner that is not fully accounted for by the observed covariates. This is formulated as
\[ E(y_i \mid x_i, \tau_i) = \mu_i \tau_i = e^{x_i'\beta + \epsilon_i} \]
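where the unobserved heterogeneity term $\tau_i = e^{\epsilon_i}$ is independent of the vector of regressors $x_i$. Then the distribution of $y_i$ conditional on $x_i$ and $\tau_i$ is Poisson with conditional mean and conditional variance $\mu_i \tau_i$:
\[ f(y_i \mid x_i, \tau_i) = \frac{\exp(-\mu_i \tau_i)\,(\mu_i \tau_i)^{y_i}}{y_i!} \]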
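Let $g(\tau_i)$ be the probability density function of $\tau_i$. Then, the distribution $f(y_i \mid x_i)$ (no longer conditional on $\tau_i$) is obtained by integrating $f(y_i \mid x_i, \tau_i)$ with respect to $\tau_i$:
\[ f(y_i \mid x_i) = \int_0^{\infty} f(y_i \mid x_i, \tau_i)\, g(\tau_i)\, d\tau_i \]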
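An analytical solution to this integral exists when $\tau_i$ is assumed to follow a gamma distribution. This solution is the negative binomial distribution. When the model contains a constant term, it is necessary to assume that $E(e^{\epsilon_i}) = E(\tau_i) = 1$, in order to identify the mean of the distribution. Thus, it is assumed that $\tau_i$ follows a gamma($\theta, \theta$) distribution with $E(\tau_i) = 1$ and $V(\tau_i) = 1/\theta$,
\[ g(\tau_i) = \frac{\theta^{\theta}}{\Gamma(\theta)}\, \tau_i^{\theta - 1} \exp(-\theta \tau_i) \]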
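where $\Gamma(x) = \int_0^{\infty} z^{x-1} \exp(-z)\, dz$ is the gamma function and $\theta$ is a positive parameter. Then, the density of $y_i$ given $x_i$ is derived as
\[
\begin{aligned}
f(y_i \mid x_i) &= \int_0^{\infty} f(y_i \mid x_i, \tau_i)\, g(\tau_i)\, d\tau_i \\
&= \frac{\theta^{\theta} \mu_i^{y_i}}{y_i!\, \Gamma(\theta)} \int_0^{\infty} e^{-(\mu_i + \theta)\tau_i}\, \tau_i^{\theta + y_i - 1}\, d\tau_i \\
&= \frac{\theta^{\theta} \mu_i^{y_i}\, \Gamma(y_i + \theta)}{y_i!\, \Gamma(\theta)\, (\theta + \mu_i)^{\theta + y_i}} \\
&= \frac{\Gamma(y_i + \theta)}{y_i!\, \Gamma(\theta)} \left( \frac{\theta}{\theta + \mu_i} \right)^{\theta} \left( \frac{\mu_i}{\theta + \mu_i} \right)^{y_i}
\end{aligned}
\]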
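Making the substitution $\alpha = 1/\theta$ ($\alpha > 0$), the negative binomial distribution can then be rewritten as
\[ f(y_i \mid x_i) = \frac{\Gamma(y_i + \alpha^{-1})}{y_i!\, \Gamma(\alpha^{-1})} \left( \frac{\alpha^{-1}}{\alpha^{-1} + \mu_i} \right)^{\alpha^{-1}} \left( \frac{\mu_i}{\alpha^{-1} + \mu_i} \right)^{y_i}, \qquad y_i = 0, 1, 2, \ldots \]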
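The gamma-mixture result can be checked numerically. The following is a minimal sketch, not part of the procedure: it integrates the Poisson density against a gamma($\theta, \theta$) density with a simple midpoint rule and compares the result with the closed-form negative binomial probability. The function names, the values `mu = 3.0` and `theta = 2.0`, and the integration limits are illustrative assumptions.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam (lam > 0)."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def gamma_pdf(tau, theta):
    """Density of gamma(theta, theta): shape theta, rate theta, so the mean is 1."""
    return math.exp(theta * math.log(theta) - math.lgamma(theta)
                    + (theta - 1.0) * math.log(tau) - theta * tau)

def negbin_pmf(y, mu, theta):
    """Closed-form negative binomial probability with mean mu and parameter theta."""
    return math.exp(math.lgamma(y + theta) - math.lgamma(y + 1) - math.lgamma(theta)
                    + theta * math.log(theta / (theta + mu))
                    + y * math.log(mu / (theta + mu)))

def mixture_pmf(y, mu, theta, upper=20.0, steps=20000):
    """Midpoint-rule approximation of the integral over tau in (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += poisson_pmf(y, mu * tau) * gamma_pdf(tau, theta)
    return total * h

# Illustrative values only (assumptions, not taken from the text).
mu, theta = 3.0, 2.0

# Largest absolute gap between the numerical mixture and the closed form.
max_gap = max(abs(mixture_pmf(y, mu, theta) - negbin_pmf(y, mu, theta))
              for y in range(8))
print(max_gap)
```

The gap should be at the level of the quadrature error. A value of `theta` below 1 makes the gamma density unbounded near zero, so this crude midpoint rule would need refinement in that case.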
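Thus, the negative binomial distribution is derived as a gamma mixture of Poisson random variables. It has conditional mean
\[ E(y_i \mid x_i) = \mu_i = e^{x_i'\beta} \]
and conditional variance
\[ V(y_i \mid x_i) = \mu_i \left[ 1 + \frac{1}{\theta} \mu_i \right] = \mu_i \left[ 1 + \alpha \mu_i \right] > E(y_i \mid x_i) \]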
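As a quick sanity check on these moments, the sketch below sums the negative binomial probabilities over a truncated range of counts and compares the resulting mean and variance with $\mu_i$ and $\mu_i + \alpha \mu_i^2$. The helper names, the truncation point `y_max`, and the values of `mu` and `alpha` are assumptions chosen for illustration.

```python
import math

def negbin2_pmf(y, mu, alpha):
    """Negative binomial probability with mean mu and dispersion alpha > 0."""
    inv_alpha = 1.0 / alpha
    log_p = (math.lgamma(y + inv_alpha) - math.lgamma(y + 1) - math.lgamma(inv_alpha)
             + inv_alpha * math.log(inv_alpha / (inv_alpha + mu))
             + y * math.log(mu / (inv_alpha + mu)))
    return math.exp(log_p)

def moments(mu, alpha, y_max=1000):
    """Mean and variance from a truncated sum over y = 0, ..., y_max."""
    probs = [negbin2_pmf(y, mu, alpha) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    second = sum(y * y * p for y, p in enumerate(probs))
    return mean, second - mean * mean

# Illustrative values only (assumptions, not taken from the text).
mu, alpha = 3.0, 0.5
mean, variance = moments(mu, alpha)

print(mean, variance, mu + alpha * mu ** 2)
```

The truncation is harmless here because the tail probabilities decay geometrically; a much larger `mu` or `alpha` would call for a larger `y_max`.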
The conditional variance of the negative binomial distribution exceeds the conditional mean. Overdispersion results from neglected unobserved heterogeneity. The negative binomial model with variance function $V(y_i \mid x_i) = \mu_i + \alpha \mu_i^2$, which is quadratic in the mean, is referred to as the NEGBIN2 model (Cameron and Trivedi 1986). To estimate this model, specify DIST=NEGBIN(p=2) in the MODEL statement. The Poisson distribution is a special case of the negative binomial distribution where $\alpha = 0$. A test of the Poisson distribution can be carried out by testing the hypothesis that $\alpha = 1/\theta = 0$. A Wald test of this hypothesis is provided (it is the reported $t$ statistic for the estimated $\alpha$ in the negative binomial model).
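The log-likelihood function of the negative binomial regression model (NEGBIN2) is given by
\[ \mathcal{L} = \sum_{i=1}^{N} w_i \left\{ \sum_{j=0}^{y_i - 1} \ln\left( j + \alpha^{-1} \right) - \ln(y_i!) - \left( y_i + \alpha^{-1} \right) \ln\left( 1 + \alpha \exp(x_i'\beta) \right) + y_i \ln(\alpha) + y_i x_i'\beta \right\} \]
This expression uses the identity
\[ \frac{\Gamma(y + a)}{\Gamma(a)} = \prod_{j=0}^{y-1} (j + a) \]
if $y$ is an integer. See Poisson Regression for the definition of $w_i$.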
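A direct transcription of the NEGBIN2 log-likelihood may make the role of each term clearer. The sketch below is only an illustration: it treats $w_i$ as an observation weight (an assumption here), uses a small made-up data set, and relies on hypothetical helper names. It is not the procedure's implementation.

```python
import math

def negbin2_loglik(beta, alpha, X, y, w):
    """NEGBIN2 log-likelihood, written term by term.

    beta  : list of coefficients
    alpha : dispersion parameter, alpha > 0
    X     : list of regressor rows (include a leading 1.0 for the constant)
    y     : list of nonnegative integer counts
    w     : list of observation weights (assumed meaning of w_i)
    """
    inv_alpha = 1.0 / alpha
    total = 0.0
    for xi, yi, wi in zip(X, y, w):
        xb = sum(b * x for b, x in zip(beta, xi))
        # Sum over j = 0..y_i-1 replaces ln Gamma(y_i + 1/alpha) - ln Gamma(1/alpha).
        term = sum(math.log(j + inv_alpha) for j in range(yi))
        term -= math.lgamma(yi + 1)                      # ln(y_i!)
        term -= (yi + inv_alpha) * math.log1p(alpha * math.exp(xb))
        term += yi * math.log(alpha) + yi * xb
        total += wi * term
    return total

# Made-up data and parameter values, for illustration only.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
y = [2, 0, 5, 1]
w = [1.0, 1.0, 2.0, 1.0]
beta, alpha = [0.2, 0.4], 0.7

ll = negbin2_loglik(beta, alpha, X, y, w)
print(ll)
```

The inner sum is empty when $y_i = 0$, which matches the convention that an empty product equals 1 in the gamma-function identity.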
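The gradient is
\[ \frac{\partial \mathcal{L}}{\partial \beta} = \sum_{i=1}^{N} w_i\, \frac{y_i - \mu_i}{1 + \alpha \mu_i}\, x_i \]
and
\[ \frac{\partial \mathcal{L}}{\partial \alpha} = \sum_{i=1}^{N} w_i \left\{ -\alpha^{-2} \sum_{j=0}^{y_i - 1} \frac{1}{j + \alpha^{-1}} + \alpha^{-2} \ln(1 + \alpha \mu_i) + \frac{y_i - \mu_i}{\alpha (1 + \alpha \mu_i)} \right\} \]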
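One way to gain confidence in analytic derivatives like these is to compare them with central finite differences of the log-likelihood. The sketch below does that for NEGBIN2 under stated assumptions: $w_i$ is treated as an observation weight, the data are made up, and all function names are hypothetical.

```python
import math

def negbin2_loglik(beta, alpha, X, y, w):
    """NEGBIN2 log-likelihood (weights w are an assumed meaning of w_i)."""
    inv_alpha = 1.0 / alpha
    total = 0.0
    for xi, yi, wi in zip(X, y, w):
        xb = sum(b * x for b, x in zip(beta, xi))
        term = sum(math.log(j + inv_alpha) for j in range(yi))
        term += -math.lgamma(yi + 1) - (yi + inv_alpha) * math.log1p(alpha * math.exp(xb))
        term += yi * math.log(alpha) + yi * xb
        total += wi * term
    return total

def negbin2_gradient(beta, alpha, X, y, w):
    """Analytic derivatives with respect to beta (a list) and alpha (a scalar)."""
    inv_alpha = 1.0 / alpha
    g_beta = [0.0] * len(beta)
    g_alpha = 0.0
    for xi, yi, wi in zip(X, y, w):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)))
        scale = wi * (yi - mu) / (1.0 + alpha * mu)
        for k, x in enumerate(xi):
            g_beta[k] += scale * x
        inner = sum(1.0 / (j + inv_alpha) for j in range(yi))
        g_alpha += wi * (-inner / alpha ** 2
                         + math.log1p(alpha * mu) / alpha ** 2
                         + (yi - mu) / (alpha * (1.0 + alpha * mu)))
    return g_beta, g_alpha

def numeric_gradient(beta, alpha, X, y, w, eps=1e-6):
    """Central finite differences of the log-likelihood."""
    g_beta = []
    for k in range(len(beta)):
        up = beta[:k] + [beta[k] + eps] + beta[k + 1:]
        dn = beta[:k] + [beta[k] - eps] + beta[k + 1:]
        g_beta.append((negbin2_loglik(up, alpha, X, y, w)
                       - negbin2_loglik(dn, alpha, X, y, w)) / (2 * eps))
    g_alpha = (negbin2_loglik(beta, alpha + eps, X, y, w)
               - negbin2_loglik(beta, alpha - eps, X, y, w)) / (2 * eps)
    return g_beta, g_alpha

# Made-up data and parameter values, for illustration only.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
y = [2, 0, 5, 1]
w = [1.0, 1.0, 2.0, 1.0]
beta, alpha = [0.2, 0.4], 0.7

analytic = negbin2_gradient(beta, alpha, X, y, w)
numeric = numeric_gradient(beta, alpha, X, y, w)
print(analytic, numeric)
```

Agreement to several decimal places suggests the formulas and the log-likelihood are consistent with each other; it does not by itself validate either against the procedure's output.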
Cameron and Trivedi (1986) consider a general class of negative binomial models with mean $\mu_i$ and variance function $\mu_i + \alpha \mu_i^p$. The NEGBIN2 model, with $p = 2$, is the standard formulation of the negative binomial model. Models with other values of $p$, $-\infty < p < \infty$, have the same density $f(y_i \mid x_i)$ except that $\alpha^{-1}$ is replaced everywhere by $\alpha^{-1} \mu_i^{2-p}$. The negative binomial model NEGBIN1, which sets $p = 1$, has variance function $V(y_i \mid x_i) = \mu_i + \alpha \mu_i$, which is linear in the mean. To estimate this model, specify DIST=NEGBIN(p=1) in the MODEL statement.
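The log-likelihood function of the NEGBIN1 regression model is given by
\[ \mathcal{L} = \sum_{i=1}^{N} w_i \left\{ \sum_{j=0}^{y_i - 1} \ln\left( j + \alpha^{-1} \exp(x_i'\beta) \right) - \ln(y_i!) - \left( y_i + \alpha^{-1} \exp(x_i'\beta) \right) \ln(1 + \alpha) + y_i \ln(\alpha) \right\} \]
See the section Poisson Regression for the definition of $w_i$.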
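The NEGBIN1 log-likelihood can be transcribed in the same way. The sketch below is a hedged illustration with made-up data and hypothetical names; it treats $w_i$ as an observation weight, which is an assumption. A useful consistency check is that the result equals the weighted sum of log densities obtained from the negative binomial density with $\alpha^{-1}$ replaced by $\alpha^{-1}\mu_i$.

```python
import math

def negbin1_loglik(beta, alpha, X, y, w):
    """NEGBIN1 log-likelihood, written term by term.

    beta  : list of coefficients
    alpha : dispersion parameter, alpha > 0
    X     : list of regressor rows (include a leading 1.0 for the constant)
    y     : list of nonnegative integer counts
    w     : list of observation weights (assumed meaning of w_i)
    """
    total = 0.0
    for xi, yi, wi in zip(X, y, w):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)))
        theta_i = mu / alpha                  # alpha^{-1} * exp(x_i' beta)
        term = sum(math.log(j + theta_i) for j in range(yi))
        term -= math.lgamma(yi + 1)           # ln(y_i!)
        term -= (yi + theta_i) * math.log1p(alpha)
        term += yi * math.log(alpha)
        total += wi * term
    return total

# Made-up data and parameter values, for illustration only.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
y = [2, 0, 5, 1]
w = [1.0, 1.0, 2.0, 1.0]
beta, alpha = [0.2, 0.4], 0.7

ll = negbin1_loglik(beta, alpha, X, y, w)
print(ll)
```

Unlike NEGBIN2, the term $y_i x_i'\beta$ does not appear separately, because the regressors enter only through $\alpha^{-1}\exp(x_i'\beta)$.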
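The gradient is
\[ \frac{\partial \mathcal{L}}{\partial \beta} = \sum_{i=1}^{N} w_i \left\{ \left( \sum_{j=0}^{y_i - 1} \frac{\mu_i}{j\alpha + \mu_i} \right) x_i - \alpha^{-1} \ln(1 + \alpha)\, \mu_i x_i \right\} \]
and
\[ \frac{\partial \mathcal{L}}{\partial \alpha} = \sum_{i=1}^{N} w_i \left\{ -\left( \sum_{j=0}^{y_i - 1} \frac{\alpha^{-1} \mu_i}{j\alpha + \mu_i} \right) + \alpha^{-2} \mu_i \ln(1 + \alpha) - \frac{y_i + \alpha^{-1} \mu_i}{1 + \alpha} + \frac{y_i}{\alpha} \right\} \]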
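The NEGBIN1 derivatives can be checked against central finite differences of the log-likelihood. In the sketch below, the $\alpha$-derivative is obtained by differentiating the log-likelihood term by term, so the term $-(y_i + \alpha^{-1}\mu_i)\ln(1 + \alpha)$ contributes $+\alpha^{-2}\mu_i \ln(1 + \alpha)$. The data are made up, the function names are hypothetical, and $w_i$ is treated as an observation weight (an assumption).

```python
import math

def negbin1_loglik(beta, alpha, X, y, w):
    """NEGBIN1 log-likelihood (weights w are an assumed meaning of w_i)."""
    total = 0.0
    for xi, yi, wi in zip(X, y, w):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)))
        theta_i = mu / alpha
        term = sum(math.log(j + theta_i) for j in range(yi))
        term += -math.lgamma(yi + 1) - (yi + theta_i) * math.log1p(alpha)
        term += yi * math.log(alpha)
        total += wi * term
    return total

def negbin1_gradient(beta, alpha, X, y, w):
    """Analytic derivatives with respect to beta (a list) and alpha (a scalar)."""
    g_beta = [0.0] * len(beta)
    g_alpha = 0.0
    for xi, yi, wi in zip(X, y, w):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)))
        ratio_sum = sum(mu / (j * alpha + mu) for j in range(yi))
        scale = wi * (ratio_sum - math.log1p(alpha) * mu / alpha)
        for k, x in enumerate(xi):
            g_beta[k] += scale * x
        g_alpha += wi * (-ratio_sum / alpha
                         + mu * math.log1p(alpha) / alpha ** 2
                         - (yi + mu / alpha) / (1.0 + alpha)
                         + yi / alpha)
    return g_beta, g_alpha

def numeric_gradient(beta, alpha, X, y, w, eps=1e-6):
    """Central finite differences of the log-likelihood."""
    g_beta = []
    for k in range(len(beta)):
        up = beta[:k] + [beta[k] + eps] + beta[k + 1:]
        dn = beta[:k] + [beta[k] - eps] + beta[k + 1:]
        g_beta.append((negbin1_loglik(up, alpha, X, y, w)
                       - negbin1_loglik(dn, alpha, X, y, w)) / (2 * eps))
    g_alpha = (negbin1_loglik(beta, alpha + eps, X, y, w)
               - negbin1_loglik(beta, alpha - eps, X, y, w)) / (2 * eps)
    return g_beta, g_alpha

# Made-up data and parameter values, for illustration only.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
y = [2, 0, 5, 1]
w = [1.0, 1.0, 2.0, 1.0]
beta, alpha = [0.2, 0.4], 0.7

analytic = negbin1_gradient(beta, alpha, X, y, w)
numeric = numeric_gradient(beta, alpha, X, y, w)
print(analytic, numeric)
```

A finite-difference comparison of this kind is a cheap guard against sign errors in hand-derived gradients before they are passed to an optimizer.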