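The PREDDIST statement creates a new SAS data set that contains random samples from the posterior predictive distribution of the response variable. The posterior predictive distribution is the distribution of unobserved observations (predictions) conditional on the observed data. Let $\mathbf{y}$ be the observed data, $\mathbf{X}$ be the covariates, $\theta$ be the parameter, and $\mathbf{y}_{\text{pred}}$ be the unobserved data. The posterior predictive distribution is defined to be the following:
\[ p(\mathbf{y}_{\text{pred}} \mid \mathbf{y}, \mathbf{X}) = \int p(\mathbf{y}_{\text{pred}}, \theta \mid \mathbf{y}, \mathbf{X}) \, d\theta = \int p(\mathbf{y}_{\text{pred}} \mid \theta, \mathbf{y}, \mathbf{X}) \, p(\theta \mid \mathbf{y}, \mathbf{X}) \, d\theta \]
Given the assumption that the observed and unobserved data are conditionally independent given $\theta$, the posterior predictive distribution can be further simplified as the following:
\[ p(\mathbf{y}_{\text{pred}} \mid \mathbf{y}, \mathbf{X}) = \int p(\mathbf{y}_{\text{pred}} \mid \theta, \mathbf{X}) \, p(\theta \mid \mathbf{y}, \mathbf{X}) \, d\theta \]
The posterior predictive distribution is an integral of the likelihood function $p(\mathbf{y}_{\text{pred}} \mid \theta, \mathbf{X})$ with respect to the posterior distribution $p(\theta \mid \mathbf{y}, \mathbf{X})$. The PREDDIST statement generates samples from a posterior predictive distribution based on draws from the posterior distribution of $\theta$.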
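The sampling scheme described above (draw the parameter from its posterior, then draw a new response from the likelihood given that draw) can be illustrated outside SAS with a minimal Python sketch. This is not how PROC MCMC is implemented. The model is an assumption chosen for simplicity: a normal response with known standard deviation `sigma`, no covariates, and a normal prior on the mean, so the posterior is available in closed form and no MCMC is needed. The function name `posterior_predictive_draws` and the data values are hypothetical.

```python
import random
import statistics


def posterior_predictive_draws(y, sigma, prior_mean, prior_sd, n_draws, seed=0):
    """Sample y_pred for a normal model with known sigma and a normal prior on the mean."""
    rng = random.Random(seed)
    n = len(y)

    # Closed-form (conjugate) posterior of the mean parameter theta.
    # Assumption: this stands in for the MCMC draws that a sampler would produce.
    post_prec = 1.0 / prior_sd**2 + n / sigma**2
    post_mean = (prior_mean / prior_sd**2 + sum(y) / sigma**2) / post_prec
    post_sd = post_prec ** -0.5

    draws = []
    for _ in range(n_draws):
        theta = rng.gauss(post_mean, post_sd)  # theta drawn from the posterior
        draws.append(rng.gauss(theta, sigma))  # y_pred drawn from the likelihood given theta
    return draws


# Hypothetical observed data.
y_obs = [4.8, 5.1, 5.6, 4.9, 5.3]
pred = posterior_predictive_draws(
    y_obs, sigma=0.5, prior_mean=0.0, prior_sd=10.0, n_draws=5000
)
print("predictive mean:", statistics.mean(pred))
print("predictive sd:  ", statistics.stdev(pred))
```

The predictive draws are more spread out than `sigma` alone would suggest, because each draw combines the sampling variability of a new observation with the remaining posterior uncertainty about the parameter.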
The PREDDIST statement works only on response variables that have standard distributions; it supports neither the GENERAL nor the DGENERAL function. You can specify multiple PREDDIST statements, and an optional label (specified as a quoted string) helps identify the output.
The following list explains specifications in the PREDDIST statement:
For an example that uses the PREDDIST statement, see Posterior Predictive Distribution.