SCHART Statement: SHEWHART Procedure

Methods for Estimating the Standard Deviation

When control limits are determined from the input data, three methods (referred to as default, MVLUE, and RMSDF) are available for estimating $\sigma $.

Default Method

The default estimate for $\sigma $ is

\[ \hat{\sigma } = \frac{s_{1}/c_{4}(n_{1})+ \cdots + s_{N}/c_{4}(n_{N})}{N} \]

where $N$ is the number of subgroups for which $n_{i} \geq 2$, and $s_{i}$ is the sample standard deviation of the $i$th subgroup

\[ s_{i} = \sqrt{ \frac{1}{n_{i} - 1} \sum_{j=1}^{n_{i}} (x_{ij}-\bar{X}_{i})^{2}} \]

and

\[  c_{4}(n_{i}) = \frac{\Gamma (n_{i}/2)\sqrt {2/(n_{i}-1)}}{\Gamma ((n_{i}-1)/2)}  \]

Here $\Gamma(\cdot)$ denotes the gamma function, $x_{ij}$ denotes the $j$th observation in the $i$th subgroup, and $\bar{X}_{i}$ denotes the $i$th subgroup mean. A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$. If the observations are normally distributed, then the expected value of $s_{i}$ is $c_{4}(n_{i})\sigma$, so each ratio $s_{i}/c_{4}(n_{i})$ is an unbiased estimate of $\sigma$. Thus, $\hat{\sigma}$ is the unweighted average of $N$ unbiased estimates of $\sigma$. This method is described by the American Society for Testing and Materials (1976).
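The default estimate can be illustrated outside of SAS with a short Python sketch. This is not the procedure's implementation; it only evaluates the formulas above. The function names `c4` and `sigma_default` and the sample data are hypothetical, chosen for illustration.

```python
import math
from statistics import stdev


def c4(n):
    """Bias-correction constant c4(n): E[s] = c4(n) * sigma for normal data."""
    # Working with log-gamma avoids overflow for large subgroup sizes.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def sigma_default(subgroups):
    """Unweighted average of s_i / c4(n_i) over subgroups with n_i >= 2."""
    estimates = [stdev(g) / c4(len(g)) for g in subgroups if len(g) >= 2]
    if not estimates:
        raise ValueError("need at least one subgroup with n_i >= 2")
    return sum(estimates) / len(estimates)


# Hypothetical measurements; the third subgroup has n_i = 1 and is skipped.
subgroups = [
    [10.1, 9.8, 10.3, 10.0],
    [9.9, 10.4, 10.2],
    [10.0],
    [10.2, 9.7, 10.1, 9.9, 10.3],
]
print(sigma_default(subgroups))
```

In this sketch, the single-observation subgroup contributes nothing, so $N = 3$ even though four subgroups are supplied.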

MVLUE Method

If you specify SMETHOD=MVLUE, a minimum variance linear unbiased estimate (MVLUE) is computed for $\sigma$. Refer to Burr (1969, 1976) and Nelson (1989, 1994). This estimate is a weighted average of $N$ unbiased estimates of $\sigma$ of the form $s_{i}/c_{4}(n_{i})$, and it is computed as

\[ \hat{\sigma} = \frac{h_{1}s_{1}/c_{4}(n_{1})+ \cdots + h_{N}s_{N}/c_{4}(n_{N})}{h_{1} + \cdots + h_{N}} \]

where

\[ h_{i} = \frac{[c_{4}(n_{i})]^{2}}{1 - [c_{4}(n_{i})]^{2}} \]

A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$, and $N$ is the number of subgroups for which $n_{i} \geq 2$. The MVLUE assigns greater weight to estimates of $\sigma$ from subgroups with larger sample sizes, and it is intended for situations where the subgroup sample sizes vary. If the subgroup sample sizes are constant, the weights $h_{i}$ are all equal and the MVLUE reduces to the default estimate.
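The weighting can be sketched in Python as follows. This is an illustration of the MVLUE formula only, not the procedure's own code; the names `c4`, `mvlue_weight`, and `sigma_mvlue` are hypothetical.

```python
import math
from statistics import stdev


def c4(n):
    """Bias-correction constant c4(n): E[s] = c4(n) * sigma for normal data."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def mvlue_weight(n):
    """Weight h = c4(n)^2 / (1 - c4(n)^2); it grows with the subgroup size."""
    c_squared = c4(n) ** 2
    return c_squared / (1.0 - c_squared)


def sigma_mvlue(subgroups):
    """Weighted average of s_i / c4(n_i) with weights h_i, using n_i >= 2 only."""
    numerator = 0.0
    denominator = 0.0
    for g in subgroups:
        n = len(g)
        if n < 2:
            continue  # subgroups with a single observation carry no s_i
        h = mvlue_weight(n)
        numerator += h * stdev(g) / c4(n)
        denominator += h
    if denominator == 0.0:
        raise ValueError("need at least one subgroup with n_i >= 2")
    return numerator / denominator


# Hypothetical subgroups of unequal size.
print(sigma_mvlue([[10.1, 9.8, 10.3, 10.0], [9.9, 10.4], [10.2, 9.7, 10.1]]))
```

When every subgroup has the same size, all weights from `mvlue_weight` coincide and cancel, so the result equals the unweighted average used by the default method.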

RMSDF Method

If you specify SMETHOD=RMSDF, a weighted root-mean-square estimate is computed for $\sigma $:

\[  \hat{\sigma } = \frac{\sqrt {(n_{1} - 1)s_1^{2} + \cdots + (n_{N} - 1)s_{N}^{2}}}{c_{4}(n)\sqrt {n_{1} + \cdots + n_{N} - N}}  \]

where $n = n_{1} + \cdots + n_{N} - (N - 1)$. The weights are the degrees of freedom $n_{i} - 1$. A subgroup standard deviation $s_{i}$ is included in the calculation only if $n_{i} \geq 2$, and $N$ is the number of subgroups for which $n_{i} \geq 2$.
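A minimal Python sketch of the root-mean-square formula is shown here. It assumes only the definitions above and is not the procedure's implementation; the names `c4` and `sigma_rmsdf` are hypothetical.

```python
import math
from statistics import variance


def c4(n):
    """Bias-correction constant c4(n): E[s] = c4(n) * sigma for normal data."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def sigma_rmsdf(subgroups):
    """Pooled root-mean-square estimate weighted by degrees of freedom."""
    used = [g for g in subgroups if len(g) >= 2]
    if not used:
        raise ValueError("need at least one subgroup with n_i >= 2")
    # Total degrees of freedom: n_1 + ... + n_N - N.
    total_df = sum(len(g) - 1 for g in used)
    # Weighted sum of squares: (n_1 - 1) s_1^2 + ... + (n_N - 1) s_N^2.
    weighted_ss = sum((len(g) - 1) * variance(g) for g in used)
    # c4 is evaluated at n = n_1 + ... + n_N - (N - 1) = total_df + 1.
    return math.sqrt(weighted_ss / total_df) / c4(total_df + 1)


# Hypothetical subgroups of unequal size.
print(sigma_rmsdf([[10.1, 9.8, 10.3, 10.0], [9.9, 10.4], [10.2, 9.7, 10.1]]))
```

With a single usable subgroup, the pooled estimate is $s_{1}/c_{4}(n_{1})$, the same value the default method gives.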

If the unknown standard deviation $\sigma $ is constant across subgroups, the root-mean-square estimate is more efficient than the minimum variance linear unbiased estimate. However, in process control applications, it is generally not assumed that $\sigma $ is constant, and if $\sigma $ varies across subgroups, the root-mean-square estimate tends to be more inflated than the MVLUE.