The GENMOD Procedure

Parameterization Used in PROC GENMOD

Design Matrix

The linear predictor part of a generalized linear model is

$\bm {\eta } = \mb {X} \bbeta$

where $\bbeta$ is an unknown parameter vector and $\mb {X}$ is a known design matrix. By default, every model contains an intercept term; that is, the first column of $\mb {X}$ contains all 1s. Additional columns of $\mb {X}$ are generated for classification variables, regression variables, and any interaction terms included in the model. It is important to understand the ordering of classification variable parameters when you use the ESTIMATE or CONTRAST statement, because the coefficients that you specify in those statements are matched to parameters by position. The ordering of these parameters is displayed in the “CLASS Level Information” table and in tables displaying the parameter estimates of the fitted model.
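The following Python sketch illustrates how the columns of a design matrix of this form can be laid out for one classification variable and one regression variable. It is an illustration only and not PROC GENMOD code: the function name `design_matrix`, the variable names `A` and `x`, the sample data, and the choice to order the levels by sorted value are all assumptions of the sketch.

```python
def design_matrix(class_values, x_values, levels=None):
    """Build the rows of X: intercept, one indicator per class level, one regressor."""
    if levels is None:
        # Assumption of this sketch: levels are ordered by sorted value.
        levels = sorted(set(class_values))
    names = ["Intercept"] + [f"A={lvl}" for lvl in levels] + ["x"]
    rows = []
    for a, x in zip(class_values, x_values):
        # Indicator (0/1) coding: exactly one level column is 1 in each row.
        indicators = [1.0 if a == lvl else 0.0 for lvl in levels]
        rows.append([1.0] + indicators + [float(x)])
    return names, rows


# Hypothetical data: four observations of a three-level variable A and a regressor x.
names, X = design_matrix(["b", "a", "c", "a"], [0.5, 1.0, 1.5, 2.0])
for row in X:
    print(row)
```

The position of each level column in `names` is what matters when coefficients are supplied by position, which is why the level ordering needs to be known before a contrast is written.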

When you specify an overparameterized model with the PARAM=GLM option in the CLASS statement, some columns of $\mb {X}$ can be linearly dependent on other columns. For example, when you specify a model consisting of an intercept term and a classification variable, the column corresponding to any one of the levels of the classification variable is linearly dependent on the other columns of $\mb {X}$. The columns of $\mb {X}^{\prime }\mb {X}$ are checked for dependence on preceding columns, in the order in which the model is specified. If a dependency is found, the parameter corresponding to the dependent column is set to 0, along with its standard error, to indicate that it is not estimated. The order in which the levels of a classification variable are checked for dependencies can be set by the ORDER= option in the PROC GENMOD statement or by the ORDER= option in the CLASS statement. For full-rank parameterizations, the columns of the $\mb {X}$ matrix are designed to be linearly independent.
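The following Python sketch illustrates the idea of checking each column, in order, for dependence on the columns that precede it. It is not the algorithm that PROC GENMOD uses. As simplifying assumptions, it works on the columns of the design matrix directly rather than on the crossproducts matrix (the two have the same linear dependencies among their columns), and it uses exact rational arithmetic instead of a numerical tolerance. The function name `find_dependent_columns` and the sample matrix are hypothetical.

```python
from fractions import Fraction


def find_dependent_columns(X):
    """Return the indices of columns that are linear combinations of preceding columns."""
    n_cols = len(X[0])
    basis = []       # (pivot_row, reduced_column) for each independent column kept so far
    dependent = []
    for j in range(n_cols):
        col = [Fraction(row[j]) for row in X]
        # Remove the part of this column explained by the earlier independent columns.
        for pivot, vec in basis:
            if col[pivot] != 0:
                factor = col[pivot] / vec[pivot]
                col = [c - factor * v for c, v in zip(col, vec)]
        pivot = next((i for i, c in enumerate(col) if c != 0), None)
        if pivot is None:
            dependent.append(j)          # nothing left: the column is dependent
        else:
            basis.append((pivot, col))   # keep the remainder as a new basis column
    return dependent


# Intercept followed by indicator columns for a three-level classification variable.
X = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
flagged = find_dependent_columns(X)
print(flagged)

# The same indicator columns without the leading column of 1s.
flagged_no_intercept = find_dependent_columns([row[1:] for row in X])
print(flagged_no_intercept)
```

Because the columns are examined in order, the column flagged in the first matrix is the one for the last level, and changing the level order changes which level is flagged. Without the column of 1s, none of the indicator columns is flagged.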

You can exclude the intercept term from the model by specifying the NOINT option in the MODEL statement.

Missing Level Combinations

Some levels of an interaction term that involves classification variables might not be represented in the data. In that case, PROC GENMOD does not include parameters in the model for the missing levels.
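The following Python sketch illustrates the effect for the interaction of two classification variables: an indicator column is created only for level combinations that occur in the data. It is an illustration under stated assumptions, not PROC GENMOD code. The function name `interaction_columns`, the variable names `A` and `B`, and the sample data are hypothetical.

```python
def interaction_columns(a_values, b_values):
    """Create one indicator column per (A, B) combination that occurs in the data."""
    observed = sorted(set(zip(a_values, b_values)))
    names = [f"A={a}*B={b}" for a, b in observed]
    rows = [
        [1 if (a, b) == combo else 0 for combo in observed]
        for a, b in zip(a_values, b_values)
    ]
    return names, rows


# Hypothetical data: the combination A=a2, B=b2 never occurs.
a = ["a1", "a1", "a2", "a2", "a2"]
b = ["b1", "b2", "b1", "b1", "b1"]
names, cols = interaction_columns(a, b)
print(names)
```

Two levels of `A` crossed with two levels of `B` would give four combinations, but only the three observed combinations produce columns in this sketch.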