BOXCHART Statement: SHEWHART Procedure

Saving Summary Statistics

Note: See Box Chart Examples in the SAS/QC Sample Library.

In this example, the BOXCHART statement is used to create a summary data set that can be read later by the SHEWHART procedure (as in the preceding example). The following statements read measurements from the data set Turbine and create a summary data set named Turbhist:

title 'Summary Data Set for Power Output';
proc shewhart data=Turbine;
   boxchart KWatts*Day / outhistory = Turbhist
                         nochart;
run;

The OUTHISTORY= option names the output data set, and the NOCHART option suppresses the display of the chart, which would be identical to the chart in Figure 17.4.

Figure 17.8 contains a partial listing of Turbhist.

Figure 17.8: The Summary Data Set Turbhist

Summary Data Set for Power Output

Obs  Day    KWattsL  KWatts1  KWattsX  KWattsM  KWatts3  KWattsH  KWattsS  KWattsN
  1  04JUL     3180   3340.0  3487.40   3490.0   3610.0     4050  220.260       20
  2  05JUL     3179   3333.5  3471.65   3419.5   3605.0     3849  210.427       20
  3  06JUL     3304   3376.0  3488.30   3456.5   3604.5     3781  147.025       20
  4  07JUL     3045   3390.5  3434.20   3447.0   3550.0     3629  157.637       20
  5  08JUL     2968   3321.0  3475.80   3487.0   3611.5     3916  258.949       20
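
A partial listing like the one in Figure 17.8 could be produced with PROC PRINT. The following is a minimal sketch; the OBS=5 data set option is an assumption about how the listing was limited to five subgroups, not something stated in this example.

```sas
/* Illustrative only: print the first five observations of Turbhist. */
/* OBS=5 is an assumed way of producing a partial listing.           */
proc print data=Turbhist(obs=5);
run;
```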


There are nine variables in the data set Turbhist.

  • Day is the subgroup variable.

  • KWattsL contains the subgroup minimums.

  • KWatts1 contains the first quartiles for each subgroup.

  • KWattsX contains the subgroup means.

  • KWattsM contains the subgroup medians.

  • KWatts3 contains the third quartiles for each subgroup.

  • KWattsH contains the subgroup maximums.

  • KWattsS contains the subgroup standard deviations.

  • KWattsN contains the subgroup sample sizes.
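
One way to confirm which variables were saved is to run PROC CONTENTS on the summary data set. This sketch is not part of the original example; the VARNUM option simply lists the variables in the order in which they were created.

```sas
/* Illustrative only: list the variables stored in Turbhist */
proc contents data=Turbhist varnum;
run;
```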

The summary statistic variables are named by appending the suffix characters L, 1, X, M, 3, H, S, and N to the process name KWatts specified in the BOXCHART statement. In other words, the variable naming convention for OUTHISTORY= data sets is the same as that for HISTORY= data sets.
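
Because the two naming conventions match, a data set such as Turbhist can later be read back by the SHEWHART procedure through the HISTORY= option. The following minimal sketch follows the pattern of the preceding example; the title text is illustrative only.

```sas
/* Read the saved summary statistics instead of the raw measurements. */
/* The process name KWatts tells the procedure to look for KWattsL,   */
/* KWatts1, KWattsX, KWattsM, KWatts3, KWattsH, KWattsS, and KWattsN. */
title 'Box Chart for Power Output (from Summary Data)';
proc shewhart history=Turbhist;
   boxchart KWatts*Day;
run;
```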

If you specify the RANGES option, the OUTHISTORY= data set includes a subgroup range variable, rather than a subgroup standard deviation variable, as demonstrated by the following statements:

proc shewhart data=Turbine;
   boxchart KWatts*Day / outhistory = Turbhist2
                         ranges
                         nochart;
run;

Figure 17.9 contains a partial listing of Turbhist2. The variable KWattsR contains the subgroup ranges.

The RANGES option is not recommended when the subgroup sample sizes are greater than 10, nor when you use the NOLIMITS option to create standard side-by-side box-and-whisker plots.

For more information, see OUTHISTORY= Data Set.

Figure 17.9: The Summary Data Set Turbhist2

Summary Data Set for Power Output

Day    KWattsL  KWatts1  KWattsX  KWattsM  KWatts3  KWattsH  KWattsR  KWattsN
04JUL     3180   3340.0  3487.40   3490.0   3610.0     4050      870       20
05JUL     3179   3333.5  3471.65   3419.5   3605.0     3849      670       20
06JUL     3304   3376.0  3488.30   3456.5   3604.5     3781      477       20
07JUL     3045   3390.5  3434.20   3447.0   3550.0     3629      584       20
08JUL     2968   3321.0  3475.80   3487.0   3611.5     3916      948       20
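
In the listing above, each value of KWattsR equals the subgroup maximum minus the subgroup minimum (for 04JUL, 4050 - 3180 = 870). As a hedged sketch, a DATA step such as the following could check this relationship for every subgroup in Turbhist2. The tolerance of 1E-8 is an arbitrary illustrative choice to guard against floating-point rounding.

```sas
/* Illustrative only: flag subgroups whose saved range does not */
/* equal maximum minus minimum, within a small tolerance.       */
data _null_;
   set Turbhist2;
   if abs(KWattsR - (KWattsH - KWattsL)) > 1e-8 then
      put 'Unexpected range for subgroup ' Day=;
run;
```

If the relationship holds for every subgroup, the step writes no messages to the SAS log.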