Applicable Two-Sample Tests and Sample Size Computation

Test for the Difference between Two Normal Means

The MODEL=TWOSAMPLEMEAN option in the SAMPLESIZE statement derives the sample size required to test the difference between the means of two normal populations $\mu _{a}$ and $\mu _{b}$ by using the null hypothesis $H_{0}: \theta = 0$ , where $\theta = \mu _{a}-\mu _{b}$ .

At stage k, the MLE for $\theta$ is computed as

$\hat{\theta }_{k} = {\overline{y}}_{ak} - {\overline{y}}_{bk} = \frac{1}{N_{ak}} \sum _{j=1}^{N_{ak}} {{y}_{akj}} - \frac{1}{N_{bk}} \sum _{j=1}^{N_{bk}} {{y}_{bkj}}$

where ${y}_{akj}$ and ${y}_{bkj}$ are the values of the jth observation available in the kth stage for groups A and B, respectively, and $N_{ak}$ and $N_{bk}$ are the cumulative sample sizes at stage k for these two groups.

The statistic $\hat{\theta }_{k}$ has a normal distribution

$\hat{\theta }_{k} \sim N \left( \, \theta , \, {I_{k}}^{-1} \right)$

where the information $I_{k}$ is the inverse of the variance $\mr {Var}(\hat{\theta }_{k})= {\sigma }^{2}_{a} / N_{ak} + {\sigma }^{2}_{b} / N_{bk}$ .

Then the standardized statistic

$Z_{k}= \hat{\theta }_{k} \sqrt {I_{k}} \sim N \left( \, \theta \sqrt {I_{k}}, \, 1 \right)$

Thus, to test the hypothesis $H_{0}: \theta = 0$ against an upper alternative $H_{1}: \theta = \theta _1, \theta _1 > 0$ , $H_{0}$ is rejected at stage k if the statistic $Z_{k} \geq a_{k}$ , the upper $\alpha$ boundary for the standardized Z statistic at stage k.

If the variances ${\sigma }^{2}_{a}$ and ${\sigma }^{2}_{b}$ are unknown, the sample variances can be used to derive the information $I_{k}$ if it is assumed that each sample variance is computed from a large sample such that the test statistic has an approximately normal distribution.
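As an illustration only, the stage-k computation above can be sketched in Python. The function name `two_sample_mean_z` and its arguments are hypothetical and are not part of SAS; when the standard deviations are omitted, the sketch substitutes the sample variances, which assumes large samples as noted above.

```python
import math
from statistics import mean, variance

def two_sample_mean_z(y_a, y_b, sigma_a=None, sigma_b=None):
    """Return (theta_hat, information, z) from cumulative stage-k data.

    Hypothetical helper for illustration; not a SAS routine.
    """
    n_a, n_b = len(y_a), len(y_b)
    # MLE of theta: difference between the two sample means
    theta_hat = mean(y_a) - mean(y_b)
    # Use the known variances if supplied, otherwise the sample variances
    var_a = sigma_a ** 2 if sigma_a is not None else variance(y_a)
    var_b = sigma_b ** 2 if sigma_b is not None else variance(y_b)
    # Information is the inverse of Var(theta_hat)
    info = 1.0 / (var_a / n_a + var_b / n_b)
    # Standardized statistic Z_k = theta_hat * sqrt(I_k)
    z = theta_hat * math.sqrt(info)
    return theta_hat, info, z

theta_hat, info, z = two_sample_mean_z([1, 2, 3, 4], [0, 1, 2, 3],
                                       sigma_a=1.0, sigma_b=1.0)
```

The null hypothesis would be rejected at stage k if the returned `z` is greater than or equal to the boundary value $a_{k}$ from the design.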

The maximum information is needed to derive the required sample size. If the maximum information is not specified or derived in the procedure, the alternative reference $\theta ^{*}_{1}$ specified in the MEANDIFF= option is used to derive the maximum information.

Note that in order to derive the sample sizes $N_{ak}$ and $N_{bk}$ uniquely from the information, $N_{ak}= R \, N_{bk}$ is assumed for $k=1, 2, \ldots , K$ , where $R=w_{a}/w_{b}$ is the constant allocation ratio computed from the WEIGHT= $w_{a}$ $w_{b}$ option in the SAMPLESIZE statement.

In PROC SEQDESIGN, the computed total sample sizes for the two groups are

$N_{aK} = ( {\sigma }^{2}_{a} \, + \, R \, {\sigma }^{2}_{b} ) \, I_{X} = R \, ( \frac{{\sigma }^{2}_{a}}{R} \, + \, {\sigma }^{2}_{b} ) \, I_{X}$

$N_{bK} = ( \frac{{\sigma }^{2}_{a}}{R} \, + \, {\sigma }^{2}_{b} ) \, I_{X}$

where $I_{X}$ is the maximum information derived in the SEQDESIGN procedure, R is the constant allocation ratio, and ${\sigma }_{a}$ and ${\sigma }_{b}$ are the specified standard deviations.

For $R=1$ , the two sample sizes are equal, and

$N_{aK} = N_{bK} = \frac{N_{K}}{2} = ({\sigma }^{2}_{a} + {\sigma }^{2}_{b}) \, I_{X}$

If the variances from the two groups are equal, ${\sigma }^{2}_{a} = {\sigma }^{2}_{b} = {\sigma }^{2}$ , then the total sample sizes for the two groups are

$N_{aK} = (1 + R) \, {\sigma }^{2} \, I_{X}$

$N_{bK} = (1 + \frac{1}{R}) \, {\sigma }^{2} \, I_{X}$

and the total sample size is

$N_{X} = N_{aK} + N_{bK} = \frac{(R + 1)^{2}}{R} \, {\sigma }^{2} \, I_{X}$

Furthermore, for $R=1$ , the two sample sizes are equal, and

$N_{aK} = N_{bK} = \frac{N_{X}}{2} = 2 \, {\sigma }^{2} \, I_{X}$

With an available maximum information, you can specify the MODEL=TWOSAMPLEMEAN( WEIGHT= R STDDEV= ${\sigma }_{a} \, \, {\sigma }_{b}$ ) option in the SAMPLESIZE statement to compute the required total sample size and individual sample size at each stage. A procedure such as PROC GLM can be used to derive the two-sample Z test for the mean difference.
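The sample size formulas for the two-sample mean test can be checked numerically with a short Python sketch. The function name `two_sample_mean_sizes` is hypothetical, the maximum information is taken as a given input, and the results are left fractional rather than rounded up to integers.

```python
def two_sample_mean_sizes(max_info, sigma_a, sigma_b, ratio=1.0):
    """Return (N_aK, N_bK) for maximum information I_X and allocation ratio R.

    Hypothetical helper for illustration; not a SAS routine.
    """
    # N_bK = (sigma_a^2 / R + sigma_b^2) * I_X
    n_b = (sigma_a ** 2 / ratio + sigma_b ** 2) * max_info
    # N_aK = R * N_bK = (sigma_a^2 + R * sigma_b^2) * I_X
    n_a = ratio * n_b
    return n_a, n_b

# Equal variances with R = 2: the total equals (R + 1)^2 / R * sigma^2 * I_X
n_a, n_b = two_sample_mean_sizes(max_info=10.0, sigma_a=1.0, sigma_b=1.0, ratio=2.0)
total = n_a + n_b
```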

Test for the Difference between Two Binomial Proportions

The MODEL=TWOSAMPLEFREQ(TEST=PROP) option in the SAMPLESIZE statement derives the sample size required to test the difference between two binomial populations with $H_{0}: \theta = 0$ , where $\theta = p_{a} - p_{b}$ . At stage k, the MLE for $\theta$ is

$\hat{\theta }_{k} = {\hat{p}}_{ak} - {\hat{p}}_{bk} = \frac{1}{N_{ak}} \sum _{j=1}^{N_{ak}} {{y}_{akj}} - \frac{1}{N_{bk}} \sum _{j=1}^{N_{bk}} {{y}_{bkj}}$

where ${y}_{akj}$ and ${y}_{bkj}$ are the values of the jth observation available in the kth stage for groups A and B, respectively, and $N_{ak}$ and $N_{bk}$ are the cumulative sample sizes at stage k for these two groups.

For sufficiently large sample sizes $N_{ak}$ and $N_{bk}$ , the statistic $\hat{\theta }_{k}$ has an approximate normal distribution

${\hat{\theta }}_{k} \sim N \left( \, \theta , \, {I_{k}}^{-1} \right)$

where the information is the inverse of the variance

$\mr {Var}({\hat{\theta }}_{k}) = \frac{p_{a} \, (1-p_{a})}{N_{ak}} + \frac{p_{b} \, (1-p_{b})}{N_{bk}}$

Thus, the standardized statistic

$Z_{k}= \hat{\theta }_{k} \sqrt {I_{k}} \sim N \left( \, \theta \sqrt {I_{k}}, \, 1 \right)$

In practice, $p_{a}= \hat{p}_{a}$ and $p_{b}= \hat{p}_{b}$ , the estimated sample proportions for groups A and B, respectively, at stage k, can be used to derive the information $I_{k}$ and the test statistic $Z_{k}$ . Thus, to test the hypothesis $H_{0}$ against an upper alternative $H_{1}: \theta > 0$ , $H_{0}$ is rejected at stage k if the statistic $Z_{k} \geq a_{k}$ , the upper $\alpha$ boundary for the standardized Z statistic at stage k.
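A minimal Python sketch of this computation follows, with the estimated proportions substituted for $p_{a}$ and $p_{b}$ as described above. The function name `prop_diff_z` and the use of event counts as inputs are assumptions made for the illustration.

```python
import math

def prop_diff_z(events_a, n_a, events_b, n_b):
    """Return (theta_hat, information, z) for the proportion difference.

    Hypothetical helper for illustration; not a SAS routine.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    theta_hat = p_a - p_b
    # Information is the inverse of the estimated variance of theta_hat
    info = 1.0 / (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = theta_hat * math.sqrt(info)
    return theta_hat, info, z

theta_hat, info, z = prop_diff_z(80, 100, 60, 100)
```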

The maximum information $I_{X}$ is needed to derive the required sample size. If the maximum information is not specified or derived with the ALTREF= option in the procedure, the PROP= option in the SAMPLESIZE statement is used to provide proportions under the alternative hypothesis for the alternative reference and then to derive the maximum information.

The proportions in the two groups are needed to derive the sample size. Also, in order to derive the sample sizes $N_{ak}$ and $N_{bk}$ uniquely from the information, $N_{ak}= R \, N_{bk}$ is assumed for $k=1, 2, \ldots , K$ , where $R=w_{a}/w_{b}$ is the constant allocation ratio computed from the WEIGHT= $w_{a}$ $w_{b}$ option in the SAMPLESIZE statement. Then

$I_{X}= { \left( \, \frac{{p}_{a} \, (1-{p}_{a})}{N_{aK}} + \frac{{p}_{b} \, (1-{p}_{b})}{N_{bK}} \, \right) }^{-1} = \frac{N_{aK}}{{p}_{a} (1-{p}_{a}) + R \, {p}_{b} (1-{p}_{b})}$

In PROC SEQDESIGN, the total sample sizes in the two groups are computed as

$N_{aK} = \left( \, p^{*}_{a} \, (1-p^{*}_{a}) + R \, p^{*}_{b} \, (1-p^{*}_{b}) \, \right) \, I_{X}$

$N_{bK} = \frac{1}{R} \, N_{aK}$

where $R=w_{a}/w_{b}$ is the constant allocation ratio, and $p^{*}_{a}$ and $p^{*}_{b}$ are proportions specified with the REF= option:

REF=NULLPROP uses proportions under $H_{0}$ : $p^{*}_{a}= p_{0a}$ , $p^{*}_{b}= p_{0b}$
REF=AVGNULLPROP uses the average proportion under $H_{0}$ : $p^{*}_{a}= p^{*}_{b}= (R p_{0a} + p_{0b}) / (R+1)$
REF=PROP uses proportions under $H_{1}$ : $p^{*}_{a}= p_{1a}$ , $p^{*}_{b}= p_{1b}$
REF=AVGPROP uses the average proportion under $H_{1}$ : $p^{*}_{a}= p^{*}_{b}= (R p_{1a} + p_{1b}) / (R+1)$

The total sample size is given by

$N_{X} = N_{aK} + N_{bK} = ( R + 1 ) \left( \, \frac{1}{R} \, p^{*}_{a} \, (1-p^{*}_{a}) + p^{*}_{b} \, (1-p^{*}_{b}) \right) \, I_{X}$

For $R=1$ , the two sample sizes are equal,

$N_{aK} = N_{bK} = \frac{N_{X}}{2} = \left( \, p^{*}_{a} \, (1-p^{*}_{a}) + p^{*}_{b} \, (1-p^{*}_{b}) \, \right) \, I_{X}$

You can specify the MODEL=TWOSAMPLEFREQ( TEST=PROP WEIGHT=R ) option in the SAMPLESIZE statement to compute the required total sample size and individual sample size at each stage. A procedure such as PROC GENMOD with the default DIST=NORMAL option in the MODEL statement can be used to derive the two-sample Z test for proportion difference.
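The effect of the REF= choices on the computed sample sizes can be sketched in Python. The function names `reference_props` and `prop_diff_sizes` are hypothetical; the strings passed as `ref` simply mirror the REF= keywords listed above, and the maximum information is treated as a given input.

```python
def reference_props(ref, p0a, p0b, p1a, p1b, ratio=1.0):
    """Return (p*_a, p*_b) for a REF= keyword (hypothetical helper)."""
    if ref == "NULLPROP":
        return p0a, p0b
    if ref == "AVGNULLPROP":
        avg = (ratio * p0a + p0b) / (ratio + 1)
        return avg, avg
    if ref == "PROP":
        return p1a, p1b
    if ref == "AVGPROP":
        avg = (ratio * p1a + p1b) / (ratio + 1)
        return avg, avg
    raise ValueError(f"unknown REF= keyword: {ref}")

def prop_diff_sizes(max_info, pa_star, pb_star, ratio=1.0):
    """Return (N_aK, N_bK) for the proportion difference test."""
    # N_aK = (p*_a (1 - p*_a) + R p*_b (1 - p*_b)) * I_X
    n_a = (pa_star * (1 - pa_star) + ratio * pb_star * (1 - pb_star)) * max_info
    # N_bK = N_aK / R
    return n_a, n_a / ratio

# Same design, two reference choices
pa, pb = reference_props("PROP", 0.6, 0.6, 0.8, 0.6)
sizes_prop = prop_diff_sizes(250.0, pa, pb)
pa, pb = reference_props("AVGPROP", 0.6, 0.6, 0.8, 0.6)
sizes_avg = prop_diff_sizes(250.0, pa, pb)
```

In this example the averaged proportion 0.7 yields a slightly larger variance term than the separate proportions 0.8 and 0.6, so the two reference choices give different sample sizes for the same maximum information.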

Test for Two Binomial Proportions with a Log Odds Ratio Statistic

The MODEL=TWOSAMPLEFREQ(TEST=LOGOR) option in the SAMPLESIZE statement derives the sample size required to test two binomial proportions by using a log odds ratio statistic. The odds ratio is the ratio of the odds in one group to the odds in the other group, and the log odds ratio is the logarithm of the odds ratio

$\theta = \mr {log} \left( \frac{p_{a} / (1-p_{a})}{p_{b} / (1-p_{b})} \right) = \mr {log} \left( \frac{p_{a} (1-p_{b})}{p_{b} (1-p_{a})} \right)$

The hypothesis of no difference between two proportions, $p_{a} = p_{b}$ , can be tested through the null hypothesis $H_{0}: \theta = 0$ , where $\theta$ is the log odds ratio. For example, the hypotheses $H_{0}: p_{a}= p_{b}= 0.6$ and $H_{1}: p_{a}= 0.8, \, p_{b}= 0.6$ correspond to the equivalent hypotheses $H_{0}: \theta = 0$ and $H_{1}: \theta = \mr {log} \left( \frac{0.8 (1-0.6)}{0.6 (1-0.8)} \right) = \mr {log} ( 8/3 ) = 0.98083$ .

The maximum likelihood estimate of $\theta$ is given by

$\hat\theta = \mr {log} \left( \frac{\hat p_{a} (1-\hat p_{b})}{\hat p_{b} (1-\hat p_{a})} \right)$

with an asymptotic variance

$\mr {Var} (\hat\theta ) = I^{-1} = \frac{1}{N_{a} p_{a} (1-p_{a})} + \frac{1}{N_{b} p_{b} (1-p_{b})}$

where I is the information (Diggle et al., 2002, pp. 341–342). That is, the standardized statistic

$Z_{k}= \hat{\theta }_{k} \sqrt {I_{k}} \sim N \left( \, \theta \sqrt {I_{k}}, \, 1 \right)$

In practice, $p_{a}= \hat{p}_{a}$ and $p_{b}= \hat{p}_{b}$ , the estimated sample proportions for groups A and B, respectively, at stage k, can be used to derive the information $I_{k}$ and the test statistic $Z_{k}= \hat{\theta }_{k} \sqrt {I_{k}}$ if the two sample sizes $N_{a}$ and $N_{b}$ are sufficiently large such that the test statistic has an approximately normal distribution.
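As a hedged illustration, the log odds ratio statistic and its information can be computed from observed counts as follows. The function name `log_or_z` is hypothetical, and the sketch assumes that both estimated proportions lie strictly between 0 and 1 so that the logarithm and the variance are defined.

```python
import math

def log_or_z(events_a, n_a, events_b, n_b):
    """Return (theta_hat, information, z) for the log odds ratio.

    Hypothetical helper for illustration; not a SAS routine.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    # MLE of the log odds ratio
    theta_hat = math.log(p_a * (1 - p_b) / (p_b * (1 - p_a)))
    # Asymptotic variance with estimated proportions plugged in
    var = 1.0 / (n_a * p_a * (1 - p_a)) + 1.0 / (n_b * p_b * (1 - p_b))
    info = 1.0 / var
    z = theta_hat * math.sqrt(info)
    return theta_hat, info, z

theta_hat, info, z = log_or_z(80, 100, 60, 100)
```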

The maximum information $I_{X}$ is needed to derive the required sample size. If the maximum information is not specified or derived with the ALTREF= option in the procedure, the PROP= option in the SAMPLESIZE statement is used to provide proportions under the alternative hypothesis for the alternative reference and then to derive the maximum information.

In order to derive the sample sizes $N_{ak}$ and $N_{bk}$ uniquely from the information, $N_{ak}= R \, N_{bk}$ is assumed for $k=1, 2, \ldots , K$ , where $R=w_{a}/w_{b}$ is the constant allocation ratio computed from the WEIGHT= $w_{a}$ $w_{b}$ option in the SAMPLESIZE statement. Then with

$I_{X} = N_{bK} \, { \left( \, \frac{1}{R \, p_{a} (1- p_{a})} + \frac{1}{ p_{b} (1- p_{b})} \right) }^{-1}$

the sample size can be computed.

In PROC SEQDESIGN, the total sample sizes in the two groups are computed as

$N_{bK} = I_{X} \, \left( \, \frac{1}{R \, p^{*}_{a} (1-p^{*}_{a})} + \frac{1}{p^{*}_{b} (1-p^{*}_{b})} \right)$

$N_{aK} = R \, N_{bK}$

where $R=w_{a}/w_{b}$ is the constant allocation ratio, and $p^{*}_{a}$ and $p^{*}_{b}$ are proportions specified with the REF= option:

REF=NULLPROP uses proportions under $H_{0}$ : $p^{*}_{a}= p_{0a}$ , $p^{*}_{b}= p_{0b}$
REF=AVGNULLPROP uses the average proportion under $H_{0}$ : $p^{*}_{a}= p^{*}_{b}= (R p_{0a} + p_{0b}) / (R+1)$
REF=PROP uses proportions under $H_{1}$ : $p^{*}_{a}= p_{1a}$ , $p^{*}_{b}= p_{1b}$
REF=AVGPROP uses the average proportion under $H_{1}$ : $p^{*}_{a}= p^{*}_{b}= (R p_{1a} + p_{1b}) / (R+1)$

You can specify the MODEL=TWOSAMPLEFREQ( TEST=LOGOR WEIGHT=R) option in the SAMPLESIZE statement to compute the required total sample size and individual sample size at each stage. A procedure such as PROC LOGISTIC can be used to derive the log odds ratio statistic.
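The sample size formula for the log odds ratio test can be sketched in Python. The function name `log_or_sizes` is hypothetical, the reference proportions are assumed to have been chosen already according to the REF= option, and the maximum information is a given input.

```python
def log_or_sizes(max_info, pa_star, pb_star, ratio=1.0):
    """Return (N_aK, N_bK) for the log odds ratio test.

    Hypothetical helper for illustration; not a SAS routine.
    """
    # N_bK = I_X * (1 / (R p*_a (1 - p*_a)) + 1 / (p*_b (1 - p*_b)))
    n_b = max_info * (1.0 / (ratio * pa_star * (1 - pa_star))
                      + 1.0 / (pb_star * (1 - pb_star)))
    # N_aK = R * N_bK
    return ratio * n_b, n_b

n_a, n_b = log_or_sizes(max_info=9.6, pa_star=0.8, pb_star=0.6)
```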

Test for Two Binomial Proportions with a Log Relative Risk Statistic

The MODEL=TWOSAMPLEFREQ(TEST=LOGRR) option in the SAMPLESIZE statement derives the sample size required to test two binomial proportions by using a log relative risk statistic. The relative risk is the ratio of the proportion in one group to the proportion in the other group. The log relative risk statistic is the logarithm of the relative risk

$\theta = \mr {log} \left( \frac{p_{a}}{p_{b}} \right)$

The hypothesis of no difference between two proportions, $p_{a} = p_{b}$ , can be tested through the null hypothesis $H_{0}: \theta = 0$ . For example, the hypotheses $H_{0}: p_{a}= p_{b}= 0.6$ and $H_{1}: p_{a}= 0.8, \, p_{b}= 0.6$ correspond to the equivalent hypotheses $H_{0}: \theta = 0$ and $H_{1}: \theta = \mr {log} \left( \frac{0.8}{0.6} \right) = \mr {log} ( 4/3 ) = 0.28768$ .

The maximum likelihood estimate of $\theta$ is given by

$\hat\theta = \mr {log} \left( \frac{\hat p_{a}}{\hat p_{b}} \right)$

with an asymptotic variance

$I^{-1} = \frac{1-p_{a}}{N_{a} \, p_{a}} + \frac{1-p_{b}}{N_{b} \, p_{b}}$

where I is the information (Chow and Liu, 1998, p. 329).

In practice, $p_{a}= \hat{p}_{a}$ and $p_{b}= \hat{p}_{b}$ , the estimated sample proportions for groups A and B, respectively, at stage k, are used to derive the information $I_{k}$ and the test statistic $Z_{k}= \hat{\theta }_{k} \sqrt {I_{k}}$ .

The maximum information $I_{X}$ and proportions $p_{a}$ and $p_{b}$ are needed to derive the required sample size. If the maximum information is not specified or derived with the ALTREF= option in the procedure, the PROP= option in the SAMPLESIZE statement is used to provide proportions under the alternative hypothesis for the alternative reference and then to derive the maximum information.

Note that in order to derive the sample sizes $N_{ak}$ and $N_{bk}$ uniquely from the information, $N_{ak}= R \, N_{bk}$ is assumed for $k=1, 2, \ldots , K$ , where $R=w_{a}/w_{b}$ is the constant allocation ratio computed from the WEIGHT= $w_{a}$ $w_{b}$ option in the SAMPLESIZE statement. Then the sample size can be computed from

$I_{X} = N_{bK} \, { \left( \, \frac{1- p_{a}}{R \, p_{a}} + \frac{1- p_{b}}{ p_{b}} \right) }^{-1}$

In PROC SEQDESIGN, the computed sample sizes in the two groups are

$N_{bK} = I_{X} \, \left( \, \frac{1-p^{*}_{a}}{R \, p^{*}_{a}} + \frac{1-p^{*}_{b}}{p^{*}_{b}} \right)$

$N_{aK} = R \, N_{bK}$

where $R=w_{a}/w_{b}$ is the constant allocation ratio, and $p^{*}_{a}$ and $p^{*}_{b}$ are proportions specified with the REF= option:

REF=NULLPROP uses proportions under $H_{0}$ : $p^{*}_{a}= p_{0a}$ , $p^{*}_{b}= p_{0b}$
REF=AVGNULLPROP uses the average proportion under $H_{0}$ : $p^{*}_{a}= p^{*}_{b}= (R p_{0a} + p_{0b}) / (R+1)$
REF=PROP uses proportions under $H_{1}$ : $p^{*}_{a}= p_{1a}$ , $p^{*}_{b}= p_{1b}$
REF=AVGPROP uses the average proportion under $H_{1}$ : $p^{*}_{a}= p^{*}_{b}= (R p_{1a} + p_{1b}) / (R+1)$

You can specify the MODEL=TWOSAMPLEFREQ( TEST=LOGRR WEIGHT=R) option in the SAMPLESIZE statement to compute the required total sample size and individual sample size at each stage. A procedure such as PROC LOGISTIC can be used to derive the log relative risk statistic.
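As a consistency check on the log relative risk formulas, the following Python sketch computes the sample sizes from a given maximum information and then recovers that information from the resulting sizes. Both function names are hypothetical, and the reference proportions are assumed to have been selected according to the REF= option.

```python
def log_rr_sizes(max_info, pa_star, pb_star, ratio=1.0):
    """Return (N_aK, N_bK) for the log relative risk test (hypothetical helper)."""
    # N_bK = I_X * ((1 - p*_a) / (R p*_a) + (1 - p*_b) / p*_b)
    n_b = max_info * ((1 - pa_star) / (ratio * pa_star) + (1 - pb_star) / pb_star)
    # N_aK = R * N_bK
    return ratio * n_b, n_b

def log_rr_information(n_a, n_b, p_a, p_b):
    """Return the information, the inverse of the asymptotic variance."""
    return 1.0 / ((1 - p_a) / (n_a * p_a) + (1 - p_b) / (n_b * p_b))

n_a, n_b = log_rr_sizes(max_info=12.0, pa_star=0.8, pb_star=0.6, ratio=2.0)
recovered = log_rr_information(n_a, n_b, 0.8, 0.6)  # should reproduce 12.0
```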

Test for Two Survival Distributions with a Log-Rank Test

The MODEL=TWOSAMPLESURVIVAL option in the SAMPLESIZE statement derives the number of events required for a log-rank test of two survival distributions. The analysis of survival data involves the survival times for both censored and uncensored data. A noncensored survival time is the time from treatment to an event such as remission or relapse for an individual. A censored survival time is the time from treatment to the time of analysis for an individual surviving at that time, and the status is unknown beyond that time.

Let T be the random variable of the survival time. Then the survival function

$S(t) = \Pr (T > t)$

is the probability that an individual from the population has a survival time that exceeds t. The hazard function is given by

$h(t) = \frac{f(t)}{S(t)}$

where $f(t)$ is the density function of T.

The hazard functions can be used to test the equality of two survival distributions $S_{a}(t) = S_{b}(t)$ with the null hypothesis $H_{0}: h_{a}(t) = h_{b}(t), t > 0$ , where $S_{a}(t)$ and $S_{b}(t)$ are survival functions for groups A and B, respectively, and $h_{a}(t)$ and $h_{b}(t)$ are the corresponding hazard functions.

If the two hazards are proportional, $h_{a}(t)= \lambda \, h_{b}(t)$ , where $\lambda$ is a constant, then an equivalent null hypothesis is

$H_{0}: \lambda = \frac{h_{a}(t)}{h_{b}(t)} = 1$

Alternatively, another equivalent null hypothesis is given by

$H_{0}: \theta = -\mr {log}(\lambda ) = 0$

Suppose that the hazard rate h is a constant. Then with a specified median survival time $T_{m}$ , the hazard rate can be derived from the equation

$e^{- h \, T_{m}} = \frac{1}{2}$
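Solving this equation for the hazard rate gives $h = \mr {log}(2) / T_{m}$ . A minimal Python sketch, with hypothetical function names and assuming constant hazards in both groups, converts two median survival times into hazard rates and into the parameter $\theta = -\mr {log}(\lambda )$ :

```python
import math

def hazard_from_median(median_time):
    """Constant hazard rate h that satisfies exp(-h * T_m) = 1/2."""
    return math.log(2.0) / median_time

def theta_from_medians(median_a, median_b):
    """Return theta = -log(lambda), where lambda = h_a / h_b.

    Hypothetical helper; assumes exponential survival in both groups.
    """
    lam = hazard_from_median(median_a) / hazard_from_median(median_b)
    return -math.log(lam)

h_b = hazard_from_median(10.0)
theta = theta_from_medians(median_a=20.0, median_b=10.0)
```

With a median of 20 in group A and 10 in group B, the hazard ratio is 0.5 and $\theta = \mr {log}(2)$ , so a longer median survival time in group A corresponds to a positive $\theta$ .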

Denote the distinct event times at stage k as $\tau _{kj}, j=1, 2, \ldots , t_{k}$ , where $t_{k}$ is the total number of distinct event times. Then the score statistic is the log-rank statistic (Jennison and Turnbull 2000, pp. 259–261; Whitehead 1997, pp. 36–39)

$S_{k} = \sum _{j=1}^{t_{k}} ( d_{akj} - e_{akj} )$

where $d_{akj}$ is the number of events from group A and $e_{akj}$ is the number of expected events from group A. The number of expected events from group A is computed as

$e_{akj} = d_{kj} \frac{r_{akj}}{r_{kj}}$

where $d_{kj}$ is the number of events from both groups, $r_{akj}$ is the number of individuals from the treatment group who survived up to time $\tau _{kj}$ , and $r_{kj}$ is the number of individuals from both groups who survived up to time $\tau _{kj}$ .

If the number of events $d_{kj}$ is small relative to $r_{kj}$ , the number of individuals who survived up to time $\tau _{kj}$ , then with a sufficiently large sample size, $S_{k}$ has an approximately normal distribution

$S_{k} \sim N( \theta \, I_{k}, \, \, I_{k})$

where the variance of $S_{k}$ is the estimated information

$I_{k} = \sum _{j=1}^{t_{k}} \frac{ r_{akj} \, r_{bkj} \, d_{kj} }{r_{kj}^{2}}$
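The score statistic and the estimated information can be sketched in Python directly from these sums. The function name `log_rank_score` and the input layout (parallel lists of times, event indicators, and group labels "A" and "B") are assumptions made for the illustration, and "survived up to time $\tau _{kj}$" is interpreted as having an observed time greater than or equal to $\tau _{kj}$ .

```python
def log_rank_score(times, events, groups):
    """Return (S_k, I_k) for cumulative data at one stage.

    times  : observed times (event or censoring)
    events : True for an event, False for a censored time
    groups : "A" or "B" for each individual
    Hypothetical helper for illustration; not a SAS routine.
    """
    rows = list(zip(times, events, groups))
    # Distinct event times tau_kj
    event_times = sorted({t for t, e, _ in rows if e})
    score, info = 0.0, 0.0
    for tau in event_times:
        # Individuals still at risk at tau in each group
        r_a = sum(1 for t, _, g in rows if t >= tau and g == "A")
        r_b = sum(1 for t, _, g in rows if t >= tau and g == "B")
        r = r_a + r_b
        # Events at tau: both groups combined, and group A only
        d = sum(1 for t, e, _ in rows if e and t == tau)
        d_a = sum(1 for t, e, g in rows if e and t == tau and g == "A")
        # Observed minus expected events in group A
        score += d_a - d * r_a / r
        # Information contribution r_a * r_b * d / r^2
        info += r_a * r_b * d / r ** 2
    return score, info

score, info = log_rank_score(
    times=[1, 3, 2, 4],
    events=[True, True, True, False],
    groups=["A", "A", "B", "B"],
)
```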

In order to derive the number of events from the information $I_{k}$ , $N_{ak}= R \, N_{bk}$ is assumed for $k=1, 2, \ldots , K$ , where $R=w_{a}/w_{b}$ is the constant allocation ratio computed from the WEIGHT= $w_{a}$ $w_{b}$ option in the SAMPLESIZE statement.

The maximum information $I_{X}$ is needed to derive the required sample size. If the maximum information is specified or derived with the ALTREF= option in the procedure, the HAZARD=, MEDSURVTIME=, and HAZARDRATIO= options are not applicable. Otherwise, the HAZARD=, MEDSURVTIME=, or HAZARDRATIO= option is used to compute the alternative reference and then to derive the maximum information for the sample size calculation.

With $N_{aK}= R \, N_{bK}$ , if the number of events is small relative to the number of individuals who survived, then $r_{aKj} \approx R \, r_{bKj}$ , and

$I_{X} \approx \sum _{j=1}^{t_{K}} \frac{R}{(R+1)^{2}} d_{Kj} = \frac{R}{(R+1)^{2}} \, D_{X}$

where $D_{X}$ is the total number of events.

Thus, the required total number of events is

$D_{X} = \frac{(R+1)^{2}}{R} \, I_{X}$
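A short Python sketch of this relationship follows. The function name `required_events` is hypothetical, and the maximum information is treated as a given input.

```python
def required_events(max_info, ratio=1.0):
    """Return D_X = (R + 1)^2 / R * I_X (hypothetical helper)."""
    return (ratio + 1) ** 2 / ratio * max_info

events_equal = required_events(10.0)             # R = 1
events_two_to_one = required_events(10.0, 2.0)   # R = 2
```

For equal allocation the factor $(R+1)^{2}/R$ equals 4, which is its smallest value, so any unequal allocation requires more events for the same maximum information.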

For a study group, if the hazard rate is constant, corresponding to an exponential survival distribution, and the individual accrual is uniform in the accrual time $T_{a}$ with a constant accrual rate $r_{a}$ , then the required total sample size and sample size at each stage can be derived. See the section Input Number of Events for Fixed-Sample Design for a detailed description of the sample size computation that uses hazard rates, accrual rate, and accrual time.

You can specify the MODEL=TWOSAMPLESURVIVAL option in the SAMPLESIZE statement to compute the required total number of events and individual number of events at each stage. With the specifications of hazard rates, accrual rate, and accrual time, the required total sample size and individual sample size at each stage can also be derived. If the REF=NULLHAZARD option is specified, the hazard rates under the null hypothesis, $h_{0a}$ and $h_{0b}$ , are used in the sample size computation. Otherwise, the hazard rates under the alternative hypothesis, $h_{1a}$ and $h_{1b}$ , are used. A procedure such as PROC LIFETEST can be used to derive the log-rank statistic.

The SEQDESIGN Procedure