Note: See EWMA Chart with Unequal Subgroup Sample Sizes in the SAS/QC Sample Library.
This example contains measurements from the metal clip manufacturing process (introduced in Creating EWMA Charts from Raw Data). The following statements create a SAS data set named Clips4, which contains additional clip gap measurements taken on a daily basis:
    data Clips4;
       input Day @;
       length Dayc $2.;
       informat Day ddmmyy8.;
       format Day date5.;
       Dayc=put(Day,date5.);
       Dayc=substr(Dayc,1,2);
       do i=1 to 5;
          input Gap @;
          output;
       end;
       drop i;
       label Dayc='April';
       datalines;
     1/4/86 14.93 14.65 14.87 15.11 15.18
     2/4/86 15.06 14.95 14.91 15.14 15.41
     3/4/86 14.90 14.90 14.96 15.26 15.18
     4/4/86 15.25 14.57 15.33 15.38 14.89
     7/4/86 14.68 14.63 14.72 15.32 14.86
     8/4/86 14.48 14.88 14.98 14.74 15.48
     9/4/86 14.99 15.16 15.02 15.53 14.66
    10/4/86 14.88 15.44 15.04 15.10 14.89
    11/4/86 15.14 15.33 14.75 15.23 14.64
    14/4/86 15.46 15.30 14.92 14.58 14.68
    15/4/86 15.23 14.63 . . .
    16/4/86 15.13 15.25 . . .
    17/4/86 15.06 15.25 15.28 15.30 15.34
    18/4/86 15.22 14.77 15.12 14.82 15.29
    21/4/86 14.95 14.96 14.65 14.87 14.77
    22/4/86 15.01 15.11 15.11 14.79 14.88
    23/4/86 14.97 15.50 14.93 15.13 15.25
    24/4/86 15.23 15.21 15.31 15.07 14.97
    25/4/86 15.08 14.75 14.93 15.34 14.98
    28/4/86 15.07 14.86 15.42 15.47 15.24
    29/4/86 15.27 15.20 14.85 15.62 14.67
    30/4/86 14.97 14.73 15.09 14.98 14.46
    ;
Note that only two gap measurements were recorded on April 15 and April 16.
A partial listing of Clips4 is shown in Output 9.3.1. This data set contains three variables: Day is a numeric variable that contains the date (month, day, and year) that the measurement is taken, Dayc is a character variable that contains the day the measurement is taken, and Gap is a numeric variable that contains the measurement.
Output 9.3.1: The Data Set Clips4
Day | Dayc | Gap |
---|---|---|
01APR | 01 | 14.93 |
01APR | 01 | 14.65 |
01APR | 01 | 14.87 |
01APR | 01 | 15.11 |
01APR | 01 | 15.18 |
02APR | 02 | 15.06 |
02APR | 02 | 14.95 |
02APR | 02 | 14.91 |
02APR | 02 | 15.14 |
02APR | 02 | 15.41 |
03APR | 03 | 14.90 |
03APR | 03 | 14.90 |
03APR | 03 | 14.96 |
03APR | 03 | 15.26 |
03APR | 03 | 15.18 |
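As a quick check on the subgroup sample sizes, you could count the nonmissing gap measurements for each day. The following statements are a minimal sketch and are not part of the original example; they assume only that Clips4 was created as shown above. April 15 and April 16 should show N=2 and NMISS=3, and every other day N=5.

```sas
/* Sketch: count nonmissing and missing Gap values per subgroup (day) */
proc means data=Clips4 n nmiss;
   class Dayc;   /* one row of statistics per day of the month */
   var Gap;      /* N = subgroup sample size, NMISS = missing readings */
run;
```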
The following statements request an EWMA chart, shown in Output 9.3.2, for these gap measurements:
    ods graphics off;
    symbol v=dot h=1.5;
    title 'EWMA Chart for Gap Measurements';
    proc macontrol data=Clips4;
       ewmachart Gap*Dayc / weight = 0.3;
    run;
The character variable Dayc (rather than the numeric variable Day) is specified as the subgroup-variable in the preceding EWMACHART statement. If Day were the subgroup-variable, each day during April would appear on the horizontal axis, including the weekend days of April 5 and April 6, for which no measurements were taken. To avoid this problem, the subgroup-variable Dayc is created from Day by using the PUT and SUBSTR functions. Since Dayc is a character subgroup-variable, a discrete axis is used for the horizontal axis, and as a result, April 5 and April 6 do not appear on the horizontal axis in Output 9.3.2. A LABEL statement is used to specify the label April for the horizontal axis, indicating the month in which these measurements were taken.
Note that the control limits vary with the subgroup sample size. The sample size legend in the lower left corner displays the minimum and maximum subgroup sample sizes.
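To see why the limits depend on the subgroup sample sizes, it can help to write out the variance of the EWMA. The following is a sketch under the usual textbook assumptions, not a statement of the procedure's internals: the subgroup means are independent with a common process standard deviation σ, r is the weight (0.3 in this example), k is the multiple of standard errors (3 in Output 9.3.5), n_i is the size of subgroup i, and the EWMA starts at a fixed value E_0.

```latex
E_i = r\,\bar{X}_i + (1-r)\,E_{i-1}, \qquad
\operatorname{Var}(E_i) = \sigma^2 r^2 \sum_{j=0}^{i-1} \frac{(1-r)^{2j}}{n_{i-j}}

\text{limits for subgroup } i:\quad
\bar{\bar{X}} \;\pm\; k\,\hat{\sigma}\, r \sqrt{\sum_{j=0}^{i-1} \frac{(1-r)^{2j}}{n_{i-j}}}

\text{equal sizes } (n_i = n):\quad
\bar{\bar{X}} \;\pm\; k\,\hat{\sigma} \sqrt{\frac{r\left[1-(1-r)^{2i}\right]}{n\,(2-r)}}
```

Because each term in the sum carries a factor 1/n_{i-j}, a small subgroup widens the limits at its own position and, with geometrically decaying influence, at the positions that follow.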
The EWMACHART statement provides various options for working with unequal subgroup sample sizes. For example, you can use the LIMITN= option to specify a fixed (nominal) sample size for computing control limits, as illustrated by the following statements:
    title 'EWMA Chart for Gap Measurements';
    proc macontrol data=Clips4;
       ewmachart Gap*Dayc / weight = 0.3
                            limitn = 5;
    run;
The resulting chart is shown in Output 9.3.3.
Note that the only points displayed are those corresponding to subgroups whose sample size matches the nominal sample size of five. Therefore, points are not displayed for April 15 and April 16. To plot points for all subgroups (regardless of subgroup sample size), you can specify the ALLN option, as follows:
    title 'EWMA Chart for Gap Measurements';
    proc macontrol data=Clips4;
       ewmachart Gap*Dayc / weight = 0.3
                            limitn = 5
                            alln
                            nmarkers;
    run;
The chart is shown in Output 9.3.4. The NMARKERS option requests special symbols to identify points for which the subgroup sample size differs from the nominal sample size.
You can use the SMETHOD= option to specify how the process standard deviation σ is estimated when the subgroup sample sizes vary. The default method computes the estimate of σ as an unweighted average of subgroup estimates of σ. Specifying SMETHOD=MVLUE requests a minimum variance linear unbiased estimate (MVLUE), which assigns greater weight to estimates of σ from subgroups with larger sample sizes. Specifying SMETHOD=RMSDF requests a weighted root-mean-square estimate. If the unknown standard deviation σ is constant across subgroups, the root-mean-square estimate is more efficient than the MVLUE. For more information, see Methods for Estimating the Standard Deviation.
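For orientation, the three estimators can be sketched as follows. This is written from the way these estimators are commonly defined and should be read as an assumption, particularly the bias-correction details of the RMSDF form; the section Methods for Estimating the Standard Deviation is the authoritative statement. Here s_i and n_i denote the standard deviation and size of subgroup i, N is the number of subgroups, and c_4 is the usual unbiasing constant for the sample standard deviation.

```latex
\text{Default:}\quad
\hat{\sigma} = \frac{1}{N}\sum_{i=1}^{N}\frac{s_i}{c_4(n_i)}

\text{MVLUE:}\quad
\hat{\sigma} = \frac{\sum_{i=1}^{N} h_i\, s_i / c_4(n_i)}{\sum_{i=1}^{N} h_i},
\qquad h_i = \frac{c_4^2(n_i)}{1 - c_4^2(n_i)}

\text{RMSDF:}\quad
\hat{\sigma} = \frac{1}{c_4(n)}\sqrt{\frac{\sum_{i=1}^{N}(n_i-1)\,s_i^2}{n-1}},
\qquad n = 1 + \sum_{i=1}^{N}(n_i-1)
```

The weights h_i increase with n_i, which is how the MVLUE gives more influence to larger subgroups, while the RMSDF form pools the subgroup variances by their degrees of freedom.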
The following statements apply all three methods:
    proc macontrol data=Clips4;
       ewmachart Gap*Dayc / outlimits = Cliplim1
                            outindex  = 'Default'
                            weight    = 0.3
                            nochart;
       ewmachart Gap*Dayc / smethod   = mvlue
                            outlimits = Cliplim2
                            outindex  = 'MVLUE'
                            weight    = 0.3
                            nochart;
       ewmachart Gap*Dayc / smethod   = rmsdf
                            outlimits = Cliplim3
                            outindex  = 'RMSDF'
                            weight    = 0.3
                            nochart;
    run;

    data Climits;
       set Cliplim1 Cliplim2 Cliplim3;
    run;
The data set Climits is listed in Output 9.3.5.
Output 9.3.5: Listing of the Data Set Climits
Estimating the Process Standard Deviation
_VAR_ | _SUBGRP_ | _INDEX_ | _TYPE_ | _LIMITN_ | _ALPHA_ | _SIGMAS_ | _MEAN_ | _STDDEV_ | _WEIGHT_ |
---|---|---|---|---|---|---|---|---|---|
Gap | Dayc | Default | ESTIMATE | V | .002699796 | 3 | 15.0354 | 0.26503 | 0.3 |
Gap | Dayc | MVLUE | ESTIMATE | V | .002699796 | 3 | 15.0354 | 0.26096 | 0.3 |
Gap | Dayc | RMSDF | ESTIMATE | V | .002699796 | 3 | 15.0354 | 0.25959 | 0.3 |
Note that the estimate of the process standard deviation (stored in the variable _STDDEV_) is slightly different depending on the estimation method. The variable _LIMITN_ is assigned the special missing value V in the OUTLIMITS= data set, indicating that the subgroup sample sizes vary.
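If later processing needs to detect this case, the special missing value can be tested directly in a DATA step, where it is written as .V. The following statements are a minimal sketch, not part of the original example; they assume the Climits data set created above and write one line to the SAS log for each set of limits computed from varying subgroup sample sizes.

```sas
/* Sketch: flag OUTLIMITS= observations whose _LIMITN_ is the special missing value V */
data _null_;
   set Climits;
   if _LIMITN_ = .V then
      put 'Varying subgroup sample sizes: ' _INDEX_= _STDDEV_=;
run;
```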