The PROC RSREG statement invokes the RSREG procedure. Table 87.1 summarizes the options available in the PROC RSREG statement.
The following list describes these options.
DATA=SAS-data-set
specifies the input SAS data set that contains the data to be analyzed. By default, PROC RSREG uses the most recently created SAS data set.
NOPRINT
suppresses the normal display of results when only the output data set is required.
For more information, see the description of the NOPRINT option in the MODEL and RIDGE statements.
Note that this option temporarily disables the Output Delivery System (ODS); see Chapter 20: Using the Output Delivery System, for more information.
OUT=SAS-data-set
creates an output SAS data set that contains statistics for each observation in the input data set. In particular, this data set contains the BY variables, the ID variables, the WEIGHT variable, the variables in the MODEL statement, and the output options requested in the MODEL statement. You must specify output statistic options in the MODEL statement; otherwise, the output data set is created but contains no observations. To create a SAS data set in a permanent library, you must specify a two-level name. For more information about permanent libraries and SAS data sets, see SAS Language Reference: Concepts. For more details, see the section OUT=SAS-data-set.
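As a minimal sketch of how OUT= works together with the MODEL statement output options, the following steps write predicted values and residuals to a data set and print it. The data set EXPER, its variables, the output name RSOUT, and the generated values are hypothetical and serve only as illustration; PREDICT and RESIDUAL are two of the output options that the MODEL statement accepts.

```sas
/* Hypothetical 3 x 3 design; the response values are made up for illustration */
data exper;
   do x1 = -1 to 1;
      do x2 = -1 to 1;
         y = 10 + 2*x1 - x2 - 1.5*x1*x1 - x2*x2 + 0.5*x1*x2 + 0.1*rannor(1);
         output;
      end;
   end;
run;

/* NOPRINT suppresses displayed results; OUT= still receives the statistics */
proc rsreg data=exper out=rsout noprint;
   model y = x1 x2 / predict residual;   /* output statistic options */
run;

proc print data=rsout;
run;
```

Replacing RSOUT with a two-level name such as MYLIB.RSOUT, where MYLIB is an assumed libref that you have already assigned, would store the data set in a permanent library.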
PLOTS <(global-plot-option)> <= plot-request <(options)>>
controls the plots produced through ODS Graphics. When you specify only one plot-request, you can omit the parentheses around the plot-request. For example:
   plots = all
   plots = (diagnostics ridge surface(unpack))
   plots(unpack) = surface(overlaypairs)
ODS Graphics must be enabled before plots can be requested. For example:
   ods graphics on;
   proc rsreg plots=all;
      model y=x;
   run;
   ods graphics off;
For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in Chapter 21: Statistical Graphics Using ODS.
By default, no graphs are created; you must specify the PLOTS= option to make graphs. See Figure 87.4, Output 87.1.5, Output 87.2.3, and Output 87.2.4 for examples of the ODS graphical displays.
The following global-plot-option is available.
The following plot-requests are available.
ALL
produces all appropriate plots. You can specify other options with ALL; for example, to display all plots and unpack the SURFACE contours, you can specify plots=(all surface(unpack)).
DIAGNOSTICS
displays a panel of summary fit diagnostic plots. The plots produced and their usage are listed in Table 87.2.
Table 87.2: Diagnostic Plots

| Diagnostic Plot | Usage |
|---|---|
| Cook’s D statistic versus observation number | Evaluate influence of an observation on the entire parameter estimate vector |
| Dependent variable values versus predicted values | Evaluate adequacy of fit and detect influential observations |
| Externally studentized residuals (RStudent) versus leverage | Detect outliers and influential (high-leverage) observations |
| Externally studentized residuals versus predicted values | Evaluate adequacy of fit and detect outliers |
| Histogram of residuals | Confirm normality of error terms |
| Normal quantile plot of residuals | Confirm normality and homogeneity of error terms, and detect outliers |
| Residuals versus predicted values | Evaluate adequacy of fit and detect outliers |
| Residual-fit (RF) spread plot | Side-by-side quantile plots of the centered fit and the residuals show "how much variation in the data is explained by the fit and how much remains in the residuals" (Cleveland, 1993) |
Observations satisfying RStudent > 2 or RStudent < –2 are called outliers, and observations with leverage > 2p/n are called influential, where n is the number of observations used in fitting the model and p is the number of parameters used in the model (Rawlings, Pantula, and Dickey, 1998). Specifying the LABEL option labels the influential and outlying observations—the label is the first ID variable if the ID statement is specified; otherwise, it is the observation number. Note in the Cook’s D plot that only observations with D exceeding 4/n are labeled; these are also called influential observations. The UNPACK option displays each diagnostic plot separately. See Output 87.2.3 for an example of the diagnostics panel.
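To make the cutoffs above concrete, the following DATA step evaluates them for an assumed case of n = 20 observations and p = 6 parameters (the number of parameters in a full quadratic model with two factors). The values of n and p and the variable names are illustrative assumptions, not taken from the examples in this chapter.

```sas
data _null_;
   n = 20;                      /* assumed number of observations used in the fit */
   p = 6;                       /* assumed number of model parameters */
   leverage_cutoff = 2*p/n;     /* leverage above this value: influential */
   cooksd_cutoff   = 4/n;       /* Cook's D above this value: labeled in the plot */
   rstudent_cutoff = 2;         /* |RStudent| above this value: outlier */
   put leverage_cutoff= cooksd_cutoff= rstudent_cutoff=;
run;
```

Under these assumptions, an observation would be flagged as influential when its leverage exceeds 0.6 or its Cook's D exceeds 0.2.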
FIT
plots the predicted values against a single predictor when you have only one factor or only one covariate in the model. The GRIDSIZE= option specifies the number of points at which the fitted values are computed; by default, GRIDSIZE=200.
NONE
suppresses all plots.
RESIDUALS
displays plots of residuals against each factor and covariate. The UNPACK option displays each residual plot separately. The SMOOTH option overlays a loess smooth on each residual plot; see Chapter 59: The LOESS Procedure, for more information. See Output 87.2.4 for an example of this plot.
RIDGE
displays the maximum ridge plots, the minimum ridge plots, or both. This option is available only when a MAXIMUM or MINIMUM option is specified in the RIDGE statement. The UNPACK option displays the estimated response and factor level ridge plots separately. See Output 87.1.5 for an example of this plot.
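The dependence of the ridge plots on the RIDGE statement can be sketched as follows. The data set EXPER and its generated values are hypothetical; the sketch assumes that the MAXIMUM option can be abbreviated as MAX in the RIDGE statement.

```sas
/* Hypothetical 3 x 3 design; the response values are made up for illustration */
data exper;
   do x1 = -1 to 1;
      do x2 = -1 to 1;
         y = 10 + 2*x1 - x2 - 1.5*x1*x1 - x2*x2 + 0.5*x1*x2 + 0.1*rannor(1);
         output;
      end;
   end;
run;

ods graphics on;

/* The ridge plot is produced only because a RIDGE statement requests MAX */
proc rsreg data=exper plots=ridge;
   model y = x1 x2;
   ridge max;
run;

ods graphics off;
```

Specifying plots=ridge(unpack) instead would display the estimated response and factor level plots separately.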
SURFACE
displays the response surface for each response variable and each pair of factors, with all other factors and covariates fixed at their means. By default, a panel of contour plots is produced; see Output 87.1.5 for an example of this plot. The following surface-options can be specified:
3D
displays three-dimensional surface plots instead of contour plots. See Figure 87.4 for an example of this plot.
AT
specifies fixed values for factors and covariates. You can specify one or more numbers in the value-list or one of the following keywords:
| Keyword | Description |
|---|---|
| MIN | sets the variable to its minimum value. |
| MEAN | sets the variable to its mean value. |
| MIDRANGE | sets the variable to the middle value: (min + max)/2. |
| MAX | sets the variable to its maximum value. |
Specifying a keyword immediately after AT sets the default value of all variables; for example, AT MIN sets all variables not displayed on an axis to their minimum values. By default, continuous variables are set to their means (AT MEAN) when they are not used on an axis. For example, if your model contains variables X1, X2, and X3, then specifying AT(X1=7 9) produces a contour plot of X2 versus X3 fixing X1 = 7 and then another contour plot with X1 = 9, along with contour plots of X1 versus X2 fixing X3 at its mean, and X1 versus X3 fixing X2 at its mean.
EXTEND=value
extends the surface by value times the range of each factor in each direction, which enables you to see more of the fitted surface. For example, if factor A has range [0, 10], then specifying EXTEND=0.1 computes and displays the surface for A in [-1, 11]. You can specify value ≥ 0; by default, value = 0.1.
FILL<=PRED | SE>
produces a filled contour plot for either the predicted values or the standard errors. FILL=SE is the default. If the 3D option is also specified, then the contour plot is projected onto the surface.
GRIDSIZE=n
creates an n × n grid of points at which the estimated values for the surface and standard errors are computed, for n ≥ 1. By default, n = 50.
LINE<=PRED | SE>
produces a contour line plot for either the predicted values or the standard errors. LINE=PRED is the default. If the 3D option is also specified, then specifying LINE displays a grid on the surface, and the other LINE= specifications are ignored.
NODESIGN
suppresses the display of the design points on the contour surface plots and the overlaid contour-line plots.
OVERLAYPAIRS
produces overlaid contour line plots for all pairs of response variables in addition to the contour surface plots. See Figure 87.6 for an example of this plot.
ROTATE=angle
rotates the 3-D surface plots by angle degrees, –180 < angle < 180. By default, angle = 57.
TILT=angle
tilts the 3-D surface plots by angle degrees, –180 < angle < 180. By default, angle = 20.
UNPACK
suppresses paneling and displays each surface plot separately.
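The following sketch shows how several of the surface-options described above might be combined in a PLOTS= request. The data set EXPER3, its factor levels, and its generated response values are hypothetical, and the exact placement of the suboptions inside SURFACE( ) follows the AT(X1=7 9) and EXTEND=0.1 forms quoted in the text rather than a verified run.

```sas
/* Hypothetical three-factor design; the response values are made up */
data exper3;
   do x1 = 1 to 3;
      do x2 = 1 to 3;
         do x3 = 1 to 3;
            y = 10 + 2*x1 - x2 + 0.5*x3
                - 0.5*x1*x1 - 0.3*x2*x2 - 0.2*x3*x3 + 0.1*rannor(1);
            output;
         end;
      end;
   end;
run;

ods graphics on;

/* Contours of X2 versus X3 with X1 fixed at 1 and then at 3;
   the remaining pairs hold the third factor at its mean */
proc rsreg data=exper3 plots=surface(at(x1=1 3));
   model y = x1 x2 x3;
run;

/* Three-dimensional surfaces, one per plot, with each axis
   extended by 0.2 times the factor range in each direction */
proc rsreg data=exper3 plots=surface(3d extend=0.2 unpack);
   model y = x1 x2 x3;
run;

ods graphics off;
```

With factor levels running from 1 to 3, EXTEND=0.2 would be expected to display each factor over [0.6, 3.4], by the same arithmetic as the [0, 10] example in the text.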