The OUTPUT statement creates a new SAS data set that saves diagnostic measures that are calculated for the selected model. If you do not specify a keyword, then the only diagnostic included is the predicted response.
All the variables in the original data set are included in the new data set, along with variables that are created by the keyword options in the OUTPUT statement. These new variables contain the values of a variety of statistics and diagnostic measures that are calculated for each observation in the data set.
The OUTPUT data set is created in column-wise form, and the variable _QUANTILE_ is not included. For each appropriate keyword specified in the OUTPUT statement, one variable for each specified quantile level is generated. These variables appear in the sorted order of the specified quantile levels.
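As a minimal sketch of how one output variable is generated per quantile level, the following step requests predicted values and residuals at three quantile levels. The input data set (`Sashelp.Baseball`), the chosen regressors, and the names `work.qsOut`, `pred`, and `res` are illustrative assumptions, not part of this section.

```sas
/* Illustrative only: data set, regressors, and output names are assumptions. */
proc quantselect data=sashelp.baseball;
   /* Three quantile levels are requested in the MODEL statement. */
   model logSalary = nHits nRuns nRBI nBB YrMajor / quantile=0.25 0.5 0.75;
   /* Each keyword yields one variable per quantile level in work.qsOut. */
   output out=work.qsOut predicted=pred residual=res;
run;

/* List the variables in creation order to see what was generated. */
proc contents data=work.qsOut varnum;
run;
```

Running PROC CONTENTS with the VARNUM option is a simple way to check which variables were created and in what order, without relying on any assumed naming pattern.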
If you specify a BY statement, then a variable _BY_ that indexes the BY groups is included. For each observation, the value of _BY_ is the index of the BY group to which this observation belongs. This variable is useful for matching BY groups with macro variables that PROC QUANTSELECT creates. See the section Macro Variables That Contain Selected Models for more information.
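The following sketch shows one way to inspect how _BY_ lines up with the BY groups. The BY variable `League`, the sorted copy `work.baseball`, and the output names are assumptions chosen for illustration.

```sas
/* Illustrative only: BY variable and data set names are assumptions. */
/* BY-group processing expects the input to be sorted by the BY variable. */
proc sort data=sashelp.baseball out=work.baseball;
   by League;
run;

proc quantselect data=work.baseball;
   by League;
   model logSalary = nHits nRuns nRBI nBB YrMajor / quantile=0.5;
   output out=work.qsBy predicted=pred;
run;

/* Cross-tabulate the index against the BY variable to see the mapping. */
proc freq data=work.qsBy;
   tables _BY_*League / list missing;
run;
```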
If you have partitioned the input data with a PARTITION statement, then a character variable _ROLE_ is included in the output data set. The following table shows the value of _ROLE_ for each observation:
| Value of _ROLE_ | Observation Role |
|---|---|
| TEST | Testing |
| TRAIN | Training |
| VALIDATE | Validation |
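A hedged sketch of how _ROLE_ might be examined after partitioning follows. The partition fractions, the input data set, and the output names are illustrative assumptions.

```sas
/* Illustrative only: fractions and data set names are assumptions. */
proc quantselect data=sashelp.baseball;
   /* Randomly reserve observations for validation and testing. */
   partition fraction(validate=0.3 test=0.2);
   model logSalary = nHits nRuns nRBI nBB YrMajor / quantile=0.5;
   output out=work.qsPart predicted=pred residual=res;
run;

/* Count how many observations were assigned to each role. */
proc freq data=work.qsPart;
   tables _ROLE_ / missing;
run;
```

Because the assignment is random, the counts per role only approximate the requested fractions.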
If you want to create a permanent SAS data set, you must specify a two-level name. For more information about permanent SAS data sets, see the discussion in SAS Language Reference: Concepts.
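For example, a two-level name combines a libref with a data set name. In the sketch below, the libref `mylib` and its directory path are hypothetical placeholders; substitute a location that exists on your system.

```sas
/* Hypothetical libref and path: replace with a directory you can write to. */
libname mylib '/path/to/project/data';

proc quantselect data=sashelp.baseball;
   model logSalary = nHits nRuns nRBI nBB YrMajor / quantile=0.5;
   /* mylib.qsOut is a two-level name, so the data set persists after the session. */
   output out=mylib.qsOut predicted=pred;
run;
```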
You can specify the following arguments in the OUTPUT statement: