This example uses the regression method to impute missing values for all variables in a data set with a monotone missing pattern.
The following statements invoke the MI procedure and request the regression method for the variable Length2 and the predictive mean matching method for the variable Length3. The resulting data set is named Outex3.
    proc mi data=Fish1 round=.1 mu0= 0 35 45
            seed=13951639 out=outex3;
       monotone reg(Length2/ details)
                regpmm(Length3= Length1 Length2 Length1*Length2/ details);
       var Length1 Length2 Length3;
    run;
The ROUND= option rounds the imputed values to the same precision as the observed values. The single value 0.1 specified here applies to Length1, Length2, and Length3, the variables listed in the VAR statement. The values specified with the MU0= option are matched with the variables in the order listed in the VAR statement, so the option requests t tests for the hypotheses that the population means are Length2=35 and Length3=45; the first value, 0, corresponds to Length1.
The "Missing Data Patterns" table lists distinct missing data patterns with corresponding frequencies and percentages. It is identical to the table in Output 63.2.3 in Example 63.2.
The "Monotone Model Specification" table in Output 63.3.1 displays the model specification.
When you use the DETAILS option, the parameters estimated from the observed data and the parameters used in each imputation are displayed in Output 63.3.2 and Output 63.3.3.
Output 63.3.3: Regression Predicted Mean Matching Model
Regression Models for Monotone Predicted Mean Matching Method

Imputed Variable | Effect | Obs Data | Imputation 1 | Imputation 2 | Imputation 3 | Imputation 4 | Imputation 5 |
---|---|---|---|---|---|---|---|
Length3 | Intercept | -0.01304 | 0.004134 | -0.011417 | -0.034177 | -0.010532 | 0.004685 |
Length3 | Length1 | -0.01332 | 0.025320 | -0.037494 | 0.308765 | 0.156606 | -0.147118 |
Length3 | Length2 | 0.98918 | 0.955510 | 1.025741 | 0.673374 | 0.828384 | 1.146440 |
Length3 | Length1*Length2 | -0.02521 | -0.034964 | -0.022017 | -0.017919 | -0.029335 | -0.034671 |
After the completion of five imputations by default, the "Variance Information" table in Output 63.3.4 displays the between-imputation variance, within-imputation variance, and total variance for combining complete-data inferences. The relative increase in variance due to missingness, the fraction of missing information, and the relative efficiency for each variable are also displayed. These statistics are described in the section Combining Inferences from Multiply Imputed Data Sets.
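The quantities in the "Variance Information" table can be summarized with the standard combining rules for multiple imputation. The notation below is an assumption introduced for illustration and does not appear in this section: $\hat{Q}_i$ and $\hat{U}_i$ denote the point estimate and its variance estimate from the $i$th imputed data set, $m$ is the number of imputations (here $m = 5$), and $\lambda$ is the fraction of missing information as reported in the table.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(combined point estimate)} \\
W &= \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2
  && \text{(between-imputation variance)} \\
T &= W + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)} \\
r &= \frac{\bigl(1+\frac{1}{m}\bigr)B}{W}
  && \text{(relative increase in variance due to missingness)} \\
\mathrm{RE} &= \Bigl(1+\frac{\lambda}{m}\Bigr)^{-1}
  && \text{(relative efficiency)}
\end{aligned}
```

With five imputations the factor $1 + 1/m$ equals 1.2, so the total variance is the within-imputation variance plus 1.2 times the between-imputation variance.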
The "Parameter Estimates" table in Output 63.3.5 displays a 95% confidence interval for the mean and a t statistic with its associated p-value for each of the hypotheses requested with the MU0= option.
Output 63.3.5: Parameter Estimates
Parameter Estimates

Variable | Mean | Std Error | 95% Confidence Limit (Lower) | 95% Confidence Limit (Upper) | DF | Minimum | Maximum | Mu0 | t for H0: Mean=Mu0 | Pr > \|t\| |
---|---|---|---|---|---|---|---|---|---|---|
Length2 | 33.104571 | 0.663078 | 31.75417 | 34.45497 | 32.15 | 33.088571 | 33.117143 | 35.000000 | -2.86 | 0.0074 |
Length3 | 38.424571 | 0.698123 | 37.00277 | 39.84637 | 32.131 | 38.397143 | 38.445714 | 45.000000 | -9.42 | <.0001 |
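As a quick check on the table, the t statistics can be reproduced by hand from the Mean, Std Error, and Mu0 columns. The sketch below assumes the usual form of the test statistic, the combined mean minus the hypothesized mean, divided by the combined standard error; the symbols $\bar{Q}$, $\mu_0$, and $\mathrm{SE}$ are notation introduced here for illustration.

```latex
\begin{aligned}
t &= \frac{\bar{Q}-\mu_0}{\mathrm{SE}(\bar{Q})} \\[4pt]
t_{\text{Length2}} &= \frac{33.104571-35}{0.663078} \approx -2.86 \\[4pt]
t_{\text{Length3}} &= \frac{38.424571-45}{0.698123} \approx -9.42
\end{aligned}
```

The confidence limits are consistent with the same standard errors: for Length2 the half-width is $(34.45497 - 31.75417)/2 = 1.35040$, which is about 2.04 standard errors, in line with a t critical value at roughly 32 degrees of freedom.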
The following statements list the first 10 observations of the data set Outex3 in Output 63.3.6. Note that the imputed values of Length2 are rounded to the same precision as the observed values.
    proc print data=outex3(obs=10);
       title 'First 10 Observations of the Imputed Data Set';
    run;
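To see how the imputations differ from one another, it can help to summarize each completed data set separately. The following is a minimal sketch rather than part of the original example; it assumes only that Outex3 contains the index variable _Imputation_, which PROC MI adds to the OUT= data set to identify each imputation, and that the observations are stored in imputation order so that no sorting is needed for BY processing.

```sas
/* Sketch: per-imputation summary statistics for the imputed variables. */
/* Assumes Outex3 is already ordered by _Imputation_.                   */
proc means data=outex3 mean stderr min max;
   by _Imputation_;
   var Length2 Length3;
   title 'Summary Statistics by Imputation';
run;
```

The five means reported for each variable should fall between the Minimum and Maximum values shown in the "Parameter Estimates" table, which give the smallest and largest of the imputation-specific means.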