PROC CATMOD requires a MODEL statement. You can specify the following in a MODEL statement:
The response-effect can be either a single variable, a crossed effect with two or more variables joined by asterisks, or _F_. The _F_ specification indicates that the response functions and their estimated covariance matrix are to be read directly into the procedure (see the section Inputting Response Functions and Covariances Directly for details). The response-effect indicates the dependent variables that determine the response categories (the columns of the underlying contingency table).
Design-effects specify potential sources of variation (such as main effects and interactions) in the model. These effects determine the number of model parameters, as well as the interpretation of those parameters. In addition, if there is no POPULATION statement, PROC CATMOD uses these variables to determine the populations (the rows of the underlying contingency table). When fitting the model, PROC CATMOD adjusts each independent effect in the model for all other independent effects in the model.
Design-effects can be any of those described in the section Specification of Effects, or they can be defined by specifying the actual design matrix, enclosed in parentheses (see the section Specifying the Design Matrix Directly). In addition, you can use the keyword _RESPONSE_ alone or as part of an effect. Effects cannot be nested within _RESPONSE_, so effects of the form A(_RESPONSE_) are invalid.
For more information, see the section Log-Linear Model Analysis and the section Repeated Measures Analysis.
Some example MODEL statements are shown in the following table:
| Example | Result |
|---|---|
| | Main effects only |
| | Main effects with interaction |
| | Nested effect |
| | Complete factorial |
| | Nested-by-value effects |
| | Log-linear model |
| | Nested repeated measurement factor |
| | Direct input of the response functions |

The relationship between these specifications and the structure of the design matrix is described in the section Generation of the Design Matrix.
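As a hedged illustration of three of the forms named in the table (main effects with interaction, nested effect, and log-linear model), the following sketch fits each one to a small made-up data set. The data set `work.demo`, the variables `A`, `B`, `C`, and `count`, and the cell counts are all assumptions introduced here for illustration; they are not the statements from the original table.

```sas
/* Hypothetical 2x2x2 table: A is the response; B and C are factors. */
data work.demo;
   input A B C count @@;
   datalines;
1 1 1 12  2 1 1  8
1 1 2 10  2 1 2 15
1 2 1  7  2 2 1 14
1 2 2  9  2 2 2 11
;
run;

/* Main effects with interaction */
proc catmod data=work.demo;
   weight count;
   model A = B C B*C;
run;
quit;

/* Nested effect: C within the levels of B */
proc catmod data=work.demo;
   weight count;
   model A = B C(B);
run;
quit;

/* Log-linear model: A and B jointly define the response categories */
proc catmod data=work.demo;
   weight count;
   model A*B = _response_;
   loglinear A B;
run;
quit;
```

The third step pairs the `_RESPONSE_` keyword with a LOGLINEAR statement that names the response-side effects; the section Log-Linear Model Analysis covers that usage.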
Table 32.5 summarizes the options available in the MODEL statement.
Table 32.5: MODEL Statement Options
| Options | Task |
|---|---|
| Specify details of computation | |
| ML | Generates the maximum likelihood estimates |
| GLS, WLS | Generates the weighted least squares estimates |
| NOINT | Omits the intercept term from the model |
| PARAM= | Specifies the parameterization of classification variables |
| ADDCELL= | Adds a number to each cell frequency |
| AVERAGED | Averages the main effects across response functions |
| EPSILON= | Specifies the convergence criterion for maximum likelihood |
| MAXITER= | Specifies the number of iterations for maximum likelihood |
| MISSING= | Specifies how missing cells are treated |
| ZERO= | Specifies how zero cells are treated |
| Request additional computation and tables | |
| ALPHA= | Specifies the significance level of confidence intervals |
| CLPARM | Displays the Wald confidence intervals of estimates |
| CORRB | Displays the estimated correlation matrix of estimates |
| COV | Displays the covariance matrix of response functions |
| COVB | Displays the estimated covariance matrix of estimates |
| DESIGN | Displays the design and _RESPONSE_ matrix |
| FREQ | Displays the two-way frequency tables |
| ITPRINT | Displays the iterations for maximum likelihood |
| ONEWAY | Displays the one-way frequency tables |
| PRED=, PREDICT | Displays the predicted values |
| PROB | Displays the probability estimates |
| PROFILE | Displays the population profiles |
| XPX | Displays the crossproducts matrix |
| TITLE= | Specifies the title |
| Suppress output | |
| NODESIGN | Suppresses the design matrix |
| NOPARM | Suppresses the parameter estimates |
| NOPREDVAR | Suppresses the variable levels |
| NOPROFILE | Suppresses the population and response profiles |
| NORESPONSE | Suppresses the _RESPONSE_ matrix |

The following list describes these options in alphabetical order.
If you specify the design matrix directly, adjacent rows of the matrix must be separated by a comma, and the matrix must have $s \times q$ rows, where s is the number of populations and q is the number of response functions per population. The first q rows correspond to the response functions for the first population, the second set of q rows corresponds to the functions for the second population, and so forth. The following is an example of using direct specification of the design matrix.

    proc catmod;
       model R=(1 0,
                1 1,
                1 2,
                1 3);
    run;

These statements are appropriate for the case of one population and for R with five levels (generating four response functions), so that $s \times q = 1 \times 4 = 4$. These statements are also appropriate for a situation with two populations and two response functions per population, giving $s \times q = 2 \times 2 = 4$ rows of the design matrix. (To accommodate more than one population, the POPULATION statement is needed.)
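The two-population case might look like the following sketch. The data set `work.two_pop`, the variable `Group`, and the counts are assumptions made for illustration: `Group` has two levels (so s = 2) and `R` has three levels, which gives two response functions per population (q = 2) under the same levels-minus-one counting used for the five-level case.

```sas
/* Hypothetical data: Group has 2 levels, R has 3 levels. */
data work.two_pop;
   input Group R count @@;
   datalines;
1 1 20  1 2 15  1 3 10
2 1 12  2 2 18  2 3 16
;
run;

proc catmod data=work.two_pop;
   weight count;
   population Group;          /* defines the two populations */
   /* Rows 1-2: response functions for Group=1; rows 3-4: Group=2 */
   model R=(1 0,
            1 1,
            1 2,
            1 3);
run;
quit;
```

The design matrix is unchanged from the one-population example; only the POPULATION statement and the number of levels of `R` differ, which is why the same four rows serve both layouts.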
When you input the design matrix directly, you can also request that any subsets of the parameters be tested for equality to zero. Indicate each subset by specifying the appropriate column numbers of the design matrix, followed by an equal sign and a label (24 characters or fewer, in quotes) that describes the subset. Adjacent subsets are separated by a comma, and the entire specification is enclosed in parentheses and placed after the design matrix. For example:

    proc catmod;
       population Group Time;
       model R=(1  1  0 0,
                1  1  0 1,
                1  1  0 2,
                1  0  1 0,
                1  0  1 1,
                1  0  1 2,
                1 -1 -1 0,
                1 -1 -1 1,
                1 -1 -1 2)
               (1  ='Intercept',
                2 3='Group main effect',
                4  ='Linear effect of Time');
    run;

The preceding statements are appropriate when Group and Time each have three levels and R is dichotomous. The POPULATION statement produces nine populations, and q = 1 (since R is dichotomous), so $s \times q = 9 \times 1 = 9$.
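Reading the nine rows of that design matrix against a parameter vector makes the subset labels concrete. The notation here is introduced only for illustration: $F_{gt}$ denotes the single response function for the population with Group level $g$ and Time level $t$, and $\beta_1, \dots, \beta_4$ are the parameters attached to columns 1 through 4. The sketch assumes the populations are ordered with Time varying fastest within Group, which is how the rows of the matrix are laid out.

```latex
\begin{aligned}
F_{1t} &= \beta_1 + \beta_2 + (t-1)\,\beta_4, \\
F_{2t} &= \beta_1 + \beta_3 + (t-1)\,\beta_4, \\
F_{3t} &= \beta_1 - \beta_2 - \beta_3 + (t-1)\,\beta_4, \qquad t = 1, 2, 3.
\end{aligned}
```

Under this reading, column 1 carries the intercept, columns 2 and 3 carry the Group main effect, and column 4 carries a common linear slope in Time, matching the three labeled subsets.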
If you input the design matrix directly but do not specify any subsets of the parameters to be tested, then PROC CATMOD tests the effect of MODEL | MEAN, which represents the significance of the model beyond what is explained by an overall mean. For the previous example, the MODEL | MEAN effect is the same as that obtained by specifying the following at the end of the MODEL statement:
(2 3 4='model|mean');
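To show where that specification sits in a complete step, here is a sketch that reuses the design matrix from the previous example. The data set `work.gt`, the `count` variable, and the formula that generates the counts are assumptions made purely so the step has something to run on.

```sas
/* Hypothetical data: 3 Groups x 3 Times x dichotomous R, made-up counts. */
data work.gt;
   do Group = 1 to 3;
      do Time = 1 to 3;
         do R = 1 to 2;
            count = 10 + Group + 2*Time*(R = 1);
            output;
         end;
      end;
   end;
run;

proc catmod data=work.gt;
   weight count;
   population Group Time;
   model R=(1  1  0 0,
            1  1  0 1,
            1  1  0 2,
            1  0  1 0,
            1  0  1 1,
            1  0  1 2,
            1 -1 -1 0,
            1 -1 -1 1,
            1 -1 -1 2)
           (2 3 4='model|mean');   /* all columns except the intercept */
run;
quit;
```

The subset lists every column other than column 1, so the resulting test covers everything in the model beyond the overall mean.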