To illustrate the use of the
different types of plot statements, consider the following template.
In this template, named MODELFIT, a SCATTERPLOT is overlaid with a
REGRESSIONPLOT. The REGRESSIONPLOT is a computed plot because it takes
the input columns (HEIGHT and WEIGHT) and transforms them into two
new columns that correspond to points on the requested fit line. By
default, a linear regression (DEGREE=1) is performed with other statistical
defaults. The model in this case is WEIGHT=HEIGHT, which in the plot
statement is specified with X=HEIGHT (independent variable) and
Y=WEIGHT (dependent variable).
The number of observations generated for the fit line is around 200
by default.
Note: Plot statements must be used in conjunction with layout statements.
To simplify the discussion, we will continue to use the most basic
layout statement, LAYOUT OVERLAY.
This layout statement acts as a single container for all plot statements
placed within it. Every plot is drawn on top of the previous one
in the order in which the plot statements are specified, with the
last one drawn on top.
proc template;
  define statgraph modelfit;
    begingraph;
      entrytitle "Regression Fit Plot";
      layout overlay;
        scatterplot x=height y=weight / primary=true;
        regressionplot x=height y=weight;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=modelfit;
run;
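As a variation on the template above, the following sketch requests a quadratic fit by setting the DEGREE= option mentioned earlier. The template name MODELFIT2 and the title text are hypothetical and chosen only for illustration; everything else mirrors the MODELFIT template.

```sas
proc template;
  define statgraph modelfit2;   /* hypothetical template name */
    begingraph;
      entrytitle "Quadratic Regression Fit Plot";
      layout overlay;
        scatterplot x=height y=weight / primary=true;
        /* DEGREE=2 requests a quadratic fit instead of the default linear fit */
        regressionplot x=height y=weight / degree=2;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=modelfit2;
run;
```

Because the fit line is still computed by the REGRESSIONPLOT statement, no change to the input data is needed.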
The REGRESSIONPLOT statement can
also generate sets of points for the upper and lower confidence limits
of the mean (CLM), and for the upper and lower confidence limits of
individual predicted values (CLI) for each observation. The CLM="name"
and CLI="name" options cause the extra computation. However, the confidence limits
are not displayed by the regression plot. Instead, you must use the
dependent plot statement MODELBAND, with the unique name as its required
argument. In the following layout block, the MODELBAND statement appears
first, which ensures that the band is drawn behind the scatter points
and fit line. A MODELBAND statement must be used in conjunction with
a REGRESSIONPLOT, LOESSPLOT, or PBSPLINEPLOT statement.
layout overlay;
  modelband "myclm";
  scatterplot x=height y=weight / primary=true;
  regressionplot x=height y=weight / alpha=.01 clm="myclm";
endlayout;
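The same pattern can be extended to show both sets of limits. The following is a minimal sketch, not a definitive layout: the template name MODELFIT_LIMITS and the band name "mycli" are hypothetical, and the use of DISPLAY=(OUTLINE) to draw the prediction limits as an outline only is an assumption made so that the two bands remain distinguishable.

```sas
proc template;
  define statgraph modelfit_limits;   /* hypothetical template name */
    begingraph;
      entrytitle "Regression Fit Plot with Confidence and Prediction Limits";
      layout overlay;
        /* Bands are listed first so that they are drawn behind the points and fit line */
        modelband "mycli" / display=(outline);   /* individual prediction limits (assumed outline-only display) */
        modelband "myclm";                       /* confidence limits of the mean */
        scatterplot x=height y=weight / primary=true;
        /* Each name links one set of computed limits to its MODELBAND statement */
        regressionplot x=height y=weight /
          alpha=.01 clm="myclm" cli="mycli";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template=modelfit_limits;
run;
```

Each MODELBAND statement refers to exactly one name, so the CLM= and CLI= options must be given different names.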
This is certainly the easiest way to
construct this type of plot. However, you might want to construct
a similar plot from an analysis by a statistical procedure that has
many more options for controlling the fit. Most procedures create
output data sets that can be used directly to create the plot that
you want. Here is an example of using non-computed, stand-alone plots
to build the fit plot. First choose a procedure to do the analysis.
proc reg data=sashelp.class noprint;
  model weight=height / alpha=.01;
  output out=predict predicted=p lclm=lclm uclm=uclm;
run;
quit;
The output data set,
PREDICT, contains all the variables and observations in SASHELP.CLASS
plus, for each observation, the computed variables P, LCLM, and UCLM.
Now the template can use simple, non-computed SERIESPLOT and BANDPLOT
statements for the presentation of fit line and confidence bands.
proc template;
  define statgraph fit;
    begingraph;
      entrytitle "Regression Fit Plot";
      layout overlay;
        bandplot x=height limitupper=uclm limitlower=lclm /
          fillattrs=GraphConfidence;
        scatterplot x=height y=weight / primary=true;
        seriesplot x=height y=p / lineattrs=GraphFit;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=predict template=fit;
run;
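One practical caveat: SERIESPLOT connects points in data order, and a band may also be drawn incorrectly when the X values are not in order. The PREDICT data set keeps the observation order of SASHELP.CLASS, which is not sorted by HEIGHT. A hedged sketch of a safer rendering step sorts the data first; the data set name PREDICT_SORTED is hypothetical.

```sas
/* Sort the PROC REG output by the X variable so that the band and */
/* fit line are drawn from left to right.                          */
proc sort data=predict out=predict_sorted;   /* PREDICT_SORTED is a hypothetical name */
  by height;
run;

/* Render the same FIT template against the sorted data */
proc sgrender data=predict_sorted template=fit;
run;
```

Sorting is not needed for the computed REGRESSIONPLOT and MODELBAND approach, because those statements generate their own ordered points for the fit line and limits.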