The following data, from Brown and Fears (1981), are the results of an 80-week carcinogenesis bioassay with female mice. Six tissue sites are examined at necropsy; 1 indicates
the presence of a tumor and 0 the absence. A frequency variable Freq
is included. A control and four different doses of a drug (in parts per million, PPM) make up the levels of the grouping variable Dose.

    data a;
       input Liver Lung Lymph Cardio Pitui Ovary Freq Dose$ @@;
       datalines;
    1 0 0 0 0 0 8 CTRL    0 1 0 0 0 0 7 CTRL    0 0 1 0 0 0 6 CTRL
    0 0 0 1 0 0 1 CTRL    0 0 0 0 0 1 2 CTRL    1 1 0 0 0 0 4 CTRL
    1 0 1 0 0 0 1 CTRL    1 0 0 0 0 1 1 CTRL    0 1 1 0 0 0 1 CTRL
    0 0 0 0 0 0 18 CTRL
    1 0 0 0 0 0 9 4PPM    0 1 0 0 0 0 4 4PPM    0 0 1 0 0 0 7 4PPM
    0 0 0 1 0 0 1 4PPM    0 0 0 0 1 0 2 4PPM    0 0 0 0 0 1 1 4PPM
    1 1 0 0 0 0 4 4PPM    1 0 1 0 0 0 3 4PPM    1 0 0 0 1 0 1 4PPM
    0 1 1 0 0 0 1 4PPM    0 1 0 1 0 0 1 4PPM    1 0 1 1 0 0 1 4PPM
    0 0 0 0 0 0 15 4PPM
    1 0 0 0 0 0 8 8PPM    0 1 0 0 0 0 3 8PPM    0 0 1 0 0 0 6 8PPM
    0 0 0 1 0 0 3 8PPM    1 1 0 0 0 0 1 8PPM    1 0 1 0 0 0 2 8PPM
    1 0 0 1 0 0 1 8PPM    1 0 0 0 1 0 1 8PPM    1 1 0 1 0 0 2 8PPM
    1 1 0 0 0 1 2 8PPM    0 0 0 0 0 0 19 8PPM
    1 0 0 0 0 0 4 16PPM   0 1 0 0 0 0 2 16PPM   0 0 1 0 0 0 9 16PPM
    0 0 0 0 1 0 1 16PPM   0 0 0 0 0 1 1 16PPM   1 1 0 0 0 0 4 16PPM
    1 0 1 0 0 0 1 16PPM   0 1 1 0 0 0 1 16PPM   0 1 0 1 0 0 1 16PPM
    0 1 0 0 0 1 1 16PPM   0 0 1 1 0 0 1 16PPM   0 0 1 0 1 0 1 16PPM
    1 1 1 0 0 0 2 16PPM   0 0 0 0 0 0 14 16PPM
    1 0 0 0 0 0 8 50PPM   0 1 0 0 0 0 4 50PPM   0 0 1 0 0 0 8 50PPM
    0 0 0 1 0 0 1 50PPM   0 0 0 0 0 1 4 50PPM   1 1 0 0 0 0 3 50PPM
    1 0 1 0 0 0 1 50PPM   0 1 1 0 0 0 1 50PPM   0 1 0 0 1 1 1 50PPM
    0 0 0 0 0 0 19 50PPM
    ;

    proc multtest data=a order=data notables out=p
                  permutation nsample=1000 seed=764511;
       test fisher(Liver Lung Lymph Cardio Pitui Ovary / lowertailed);
       class Dose;
       freq Freq;
    run;

    proc print data=p;
    run;

In the PROC MULTTEST statement, the ORDER=DATA option is required to keep the levels of Dose in the order in which they appear in the data set. Without this option, the levels are sorted by their formatted values, which yields an alphabetic ordering. The NOTABLES option suppresses the display of summary statistics, and the OUT= option produces an output data set p containing the p-values. The PERMUTATION option specifies permutation resampling, NSAMPLE=1000 requests 1000 samples, and the SEED=764511 option provides a starting value for the random number generator. You should specify a seed if you need to duplicate resampling results.
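The effect of omitting ORDER=DATA can be illustrated outside SAS. The following Python sketch sorts the five Dose labels as plain strings; it assumes that Python's code-point ordering is a reasonable stand-in for sorting formatted values on an ASCII system, where digits sort before letters. The variable names are illustrative only.

```python
# Dose levels in the order they first appear in the DATA step.
levels_in_data_order = ["CTRL", "4PPM", "8PPM", "16PPM", "50PPM"]

# Assumption: code-point string sorting approximates sorting by formatted
# value on an ASCII system (digits sort before uppercase letters).
levels_sorted = sorted(levels_in_data_order)

print(levels_in_data_order[0])  # first level when the data order is kept
print(levels_sorted)            # ordering when the labels are sorted
print(levels_sorted[0])         # level that would come first instead
```

Under that assumption, the sorted list starts with 16PPM and ends with CTRL, so the control group would no longer be the first level.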
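To test for higher rates of tumor occurrence in the treatment groups compared to the control group, the LOWERTAILED option is specified in the FISHER option of the TEST statement to produce a lower-tailed Fisher exact test for each of the six tissue sites. Because each contrast compares the control with a dose, a small lower-tailed p-value indicates that the control group has fewer tumors than expected under the null hypothesis. The Fisher test is appropriate for comparing a treatment and a control, but this analysis performs 24 such tests (six sites, four doses), and the chance of at least one spuriously significant result grows with the number of tests. Brown and Fears (1981) use a multivariate permutation to evaluate the entire collection of tests. PROC MULTTEST adjusts the p-values by simulation.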
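As a cross-check on what a lower-tailed Fisher exact test computes, the following Python sketch evaluates the lower tail of the hypergeometric distribution directly. The function name `fisher_lower_tailed` is hypothetical, and the counts are taken from Output 67.4.4 (for example, 8 of 49 control animals and 15 of 43 animals at 16PPM have lymph tumors). It is an independent illustration, not PROC MULTTEST's implementation.

```python
from math import comb

def fisher_lower_tailed(x, m, y, n):
    """Return P(X <= x) under the hypergeometric null, where x of m control
    animals and y of n treated animals have a tumor at the site."""
    total_tumors = x + y
    lowest = max(0, total_tumors - n)  # smallest feasible control count
    numerator = sum(
        comb(m, k) * comb(n, total_tumors - k) for k in range(lowest, x + 1)
    )
    return numerator / comb(m + n, total_tumors)

p_lymph_16 = fisher_lower_tailed(8, 49, 15, 43)  # Lymph, CTRL vs. 16PPM
p_cardio_8 = fisher_lower_tailed(1, 49, 6, 48)   # Cardio, CTRL vs. 8PPM
print(f"{p_lymph_16:.5f} {p_cardio_8:.5f}")
```

The two values can be compared with the raw_p column for the corresponding contrasts.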
The treatments make up the levels of the grouping variable Dose, listed in the CLASS statement. Since no CONTRAST statement is specified, PROC MULTTEST uses the default pairwise contrasts, which compare each level with the first level of Dose (here, CTRL). The FREQ statement is used since these are summary data containing frequency counts of occurrences.
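To see how the frequency counts translate into per-group totals, the following Python sketch weights each CTRL tumor pattern by its Freq value. The rows are copied from the DATALINES above; the names `ctrl_rows`, `sites`, `n_animals`, and `tumor_counts` are illustrative and not part of the SAS program.

```python
# CTRL rows of the summary data:
# (Liver, Lung, Lymph, Cardio, Pitui, Ovary) pattern and its Freq count.
ctrl_rows = [
    ((1, 0, 0, 0, 0, 0), 8),
    ((0, 1, 0, 0, 0, 0), 7),
    ((0, 0, 1, 0, 0, 0), 6),
    ((0, 0, 0, 1, 0, 0), 1),
    ((0, 0, 0, 0, 0, 1), 2),
    ((1, 1, 0, 0, 0, 0), 4),
    ((1, 0, 1, 0, 0, 0), 1),
    ((1, 0, 0, 0, 0, 1), 1),
    ((0, 1, 1, 0, 0, 0), 1),
    ((0, 0, 0, 0, 0, 0), 18),
]
sites = ["Liver", "Lung", "Lymph", "Cardio", "Pitui", "Ovary"]

# Each row stands for Freq animals, so counts are frequency-weighted sums.
n_animals = sum(freq for _, freq in ctrl_rows)
tumor_counts = {
    site: sum(pattern[i] * freq for pattern, freq in ctrl_rows)
    for i, site in enumerate(sites)
}
print(n_animals)
print(tumor_counts)
```

The resulting totals can be checked against the control-group columns of the OUT= data set in Output 67.4.4.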
The results from this analysis are listed in Output 67.4.1 through Output 67.4.4. First, the PROC MULTTEST specifications are displayed in Output 67.4.1.
The default contrasts for the Fisher test are displayed in Output 67.4.2. Note that each dose is compared with the control.
The "p-Values" table in Output 67.4.3 displays p-values for the Fisher exact tests and their permutation-based adjustments.
Output 67.4.3: p-Values
Variable | Contrast | Raw | Permutation |
---|---|---|---|
Liver | CTRL vs. 4PPM | 0.2828 | 0.9610 |
Liver | CTRL vs. 8PPM | 0.3069 | 0.9670 |
Liver | CTRL vs. 16PPM | 0.7102 | 1.0000 |
Liver | CTRL vs. 50PPM | 0.7718 | 1.0000 |
Lung | CTRL vs. 4PPM | 0.7818 | 1.0000 |
Lung | CTRL vs. 8PPM | 0.8858 | 1.0000 |
Lung | CTRL vs. 16PPM | 0.5469 | 0.9990 |
Lung | CTRL vs. 50PPM | 0.8498 | 1.0000 |
Lymph | CTRL vs. 4PPM | 0.2423 | 0.9280 |
Lymph | CTRL vs. 8PPM | 0.5898 | 1.0000 |
Lymph | CTRL vs. 16PPM | 0.0350 | 0.2680 |
Lymph | CTRL vs. 50PPM | 0.4161 | 0.9930 |
Cardio | CTRL vs. 4PPM | 0.3163 | 0.9710 |
Cardio | CTRL vs. 8PPM | 0.0525 | 0.3710 |
Cardio | CTRL vs. 16PPM | 0.4506 | 0.9960 |
Cardio | CTRL vs. 50PPM | 0.7576 | 1.0000 |
Pitui | CTRL vs. 4PPM | 0.1250 | 0.7540 |
Pitui | CTRL vs. 8PPM | 0.4948 | 0.9970 |
Pitui | CTRL vs. 16PPM | 0.2157 | 0.9080 |
Pitui | CTRL vs. 50PPM | 0.5051 | 0.9970 |
Ovary | CTRL vs. 4PPM | 0.9437 | 1.0000 |
Ovary | CTRL vs. 8PPM | 0.8126 | 1.0000 |
Ovary | CTRL vs. 16PPM | 0.7760 | 1.0000 |
Ovary | CTRL vs. 50PPM | 0.3689 | 0.9930 |
As noted by Brown and Fears, only one of the 24 tests is significant at the 5% level (Lymph, CTRL vs. 16PPM). Brown and Fears report a 12% chance of observing at least one significant raw p-value for 16PPM and a 9% chance of observing at least one significant raw p-value for Lymph (both at the 5% level). The adjusted p-values are much less prone to such false significances. For this example, none of the adjusted p-values is close to significant; the smallest, for Lymph at 16PPM, is 0.2680.
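The idea behind a permutation-based adjustment can be sketched in Python as a single-step minimum-p procedure: whole animals are reshuffled between groups, all tests are recomputed, and each adjusted p-value is the fraction of resamples whose smallest p-value is at or below the observed one. This sketch makes several simplifying assumptions: it uses only the CTRL and 16PPM groups (six tests rather than 24), an arbitrary seed, and Python's random number generator. Its adjusted values are therefore not expected to match Output 67.4.3, and all function names are hypothetical.

```python
import random
from math import comb

SITES = ["Liver", "Lung", "Lymph", "Cardio", "Pitui", "Ovary"]
# Tumor pattern (one digit per site, in SITES order) -> number of animals.
CTRL = {"100000": 8, "010000": 7, "001000": 6, "000100": 1, "000001": 2,
        "110000": 4, "101000": 1, "100001": 1, "011000": 1, "000000": 18}
PPM16 = {"100000": 4, "010000": 2, "001000": 9, "000010": 1, "000001": 1,
         "110000": 4, "101000": 1, "011000": 1, "010100": 1, "010001": 1,
         "001100": 1, "001010": 1, "111000": 2, "000000": 14}

def expand(summary):
    """Turn pattern -> frequency into one tumor vector per animal."""
    return [tuple(int(c) for c in pat)
            for pat, freq in summary.items() for _ in range(freq)]

def lower_cdf(m, n, k_total):
    """cdf[x] = P(X <= x) for the control tumor count under the hypergeometric null."""
    denom = comb(m + n, k_total)
    cdf, running = [], 0
    for x in range(m + 1):
        if 0 <= k_total - x <= n:
            running += comb(m, x) * comb(n, k_total - x)
        cdf.append(running / denom)
    return cdf

def raw_p_values(animals, m, cdfs):
    """Lower-tailed Fisher p-value per site; the first m animals are the controls."""
    return [cdfs[j][sum(a[j] for a in animals[:m])] for j in range(len(SITES))]

def minp_adjust(ctrl, treated, n_resamples=1000, seed=12345):
    """Single-step min-p adjustment (seed is arbitrary, not the SAS seed)."""
    animals = ctrl + treated
    m, n = len(ctrl), len(treated)
    # Site totals are fixed under permutation, so the CDF tables are reusable.
    cdfs = [lower_cdf(m, n, sum(a[j] for a in animals)) for j in range(len(SITES))]
    observed = raw_p_values(animals, m, cdfs)
    rng = random.Random(seed)
    hits = [0] * len(SITES)
    for _ in range(n_resamples):
        shuffled = animals[:]
        rng.shuffle(shuffled)  # reassign whole animals, keeping site correlations
        min_p = min(raw_p_values(shuffled, m, cdfs))
        for j, p in enumerate(observed):
            if min_p <= p:
                hits[j] += 1
    return observed, [h / n_resamples for h in hits]

raw, adjusted = minp_adjust(expand(CTRL), expand(PPM16))
for site, r, a in zip(SITES, raw, adjusted):
    print(f"{site:6s} raw={r:.5f} adjusted={a:.3f}")
```

Shuffling entire tumor vectors, rather than each site separately, is what lets the adjustment account for correlation among the sites.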
The OUT= data set is displayed in Output 67.4.4.
Output 67.4.4: OUT= Data Set
Obs | _test_ | _var_ | _contrast_ | _xval_ | _mval_ | _yval_ | _nval_ | raw_p | perm_p | sim_se |
---|---|---|---|---|---|---|---|---|---|---|
1 | FISHER | Liver | CTRL vs. 4PPM | 14 | 49 | 18 | 50 | 0.28282 | 0.961 | 0.006122 |
2 | FISHER | Liver | CTRL vs. 8PPM | 14 | 49 | 17 | 48 | 0.30688 | 0.967 | 0.005649 |
3 | FISHER | Liver | CTRL vs. 16PPM | 14 | 49 | 11 | 43 | 0.71022 | 1.000 | 0.000000 |
4 | FISHER | Liver | CTRL vs. 50PPM | 14 | 49 | 12 | 50 | 0.77175 | 1.000 | 0.000000 |
5 | FISHER | Lung | CTRL vs. 4PPM | 12 | 49 | 10 | 50 | 0.78180 | 1.000 | 0.000000 |
6 | FISHER | Lung | CTRL vs. 8PPM | 12 | 49 | 8 | 48 | 0.88581 | 1.000 | 0.000000 |
7 | FISHER | Lung | CTRL vs. 16PPM | 12 | 49 | 11 | 43 | 0.54685 | 0.999 | 0.000999 |
8 | FISHER | Lung | CTRL vs. 50PPM | 12 | 49 | 9 | 50 | 0.84978 | 1.000 | 0.000000 |
9 | FISHER | Lymph | CTRL vs. 4PPM | 8 | 49 | 12 | 50 | 0.24228 | 0.928 | 0.008174 |
10 | FISHER | Lymph | CTRL vs. 8PPM | 8 | 49 | 8 | 48 | 0.58977 | 1.000 | 0.000000 |
11 | FISHER | Lymph | CTRL vs. 16PPM | 8 | 49 | 15 | 43 | 0.03498 | 0.268 | 0.014006 |
12 | FISHER | Lymph | CTRL vs. 50PPM | 8 | 49 | 10 | 50 | 0.41607 | 0.993 | 0.002636 |
13 | FISHER | Cardio | CTRL vs. 4PPM | 1 | 49 | 3 | 50 | 0.31631 | 0.971 | 0.005307 |
14 | FISHER | Cardio | CTRL vs. 8PPM | 1 | 49 | 6 | 48 | 0.05254 | 0.371 | 0.015276 |
15 | FISHER | Cardio | CTRL vs. 16PPM | 1 | 49 | 2 | 43 | 0.45061 | 0.996 | 0.001996 |
16 | FISHER | Cardio | CTRL vs. 50PPM | 1 | 49 | 1 | 50 | 0.75758 | 1.000 | 0.000000 |
17 | FISHER | Pitui | CTRL vs. 4PPM | 0 | 49 | 3 | 50 | 0.12496 | 0.754 | 0.013619 |
18 | FISHER | Pitui | CTRL vs. 8PPM | 0 | 49 | 1 | 48 | 0.49485 | 0.997 | 0.001729 |
19 | FISHER | Pitui | CTRL vs. 16PPM | 0 | 49 | 2 | 43 | 0.21572 | 0.908 | 0.009140 |
20 | FISHER | Pitui | CTRL vs. 50PPM | 0 | 49 | 1 | 50 | 0.50505 | 0.997 | 0.001729 |
21 | FISHER | Ovary | CTRL vs. 4PPM | 3 | 49 | 1 | 50 | 0.94372 | 1.000 | 0.000000 |
22 | FISHER | Ovary | CTRL vs. 8PPM | 3 | 49 | 2 | 48 | 0.81260 | 1.000 | 0.000000 |
23 | FISHER | Ovary | CTRL vs. 16PPM | 3 | 49 | 2 | 43 | 0.77596 | 1.000 | 0.000000 |
24 | FISHER | Ovary | CTRL vs. 50PPM | 3 | 49 | 5 | 50 | 0.36889 | 0.993 | 0.002636 |
The _test_, _var_, and _contrast_ variables provide the TEST name, TEST variable, and CONTRAST label, respectively. The _xval_, _mval_, _yval_, and _nval_ variables contain the components used to compute the Fisher exact tests from the hypergeometric distribution: in this output, _xval_ and _yval_ are the tumor counts in the control and dose groups, and _mval_ and _nval_ are the corresponding group sizes (for example, 14 of 49 control animals and 18 of 50 animals at 4PPM have liver tumors). The raw_p variable contains the p-values from the Fisher exact tests, and the perm_p variable contains their permutation-based adjustments. The variable sim_se is the simulation standard error from the permutation resampling.
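The sim_se values in Output 67.4.4 are consistent with the usual binomial standard error of a proportion estimated from 1000 resamples. The following Python sketch checks this for a few rows; the function name `simulation_se` is hypothetical, and the formula is inferred from the displayed values rather than quoted from the procedure's documentation.

```python
from math import sqrt

def simulation_se(p_adjusted, n_resamples):
    """Binomial standard error of a proportion estimated from n_resamples draws."""
    return sqrt(p_adjusted * (1.0 - p_adjusted) / n_resamples)

# perm_p values taken from Output 67.4.4; NSAMPLE=1000 in the example.
for perm_p in (0.961, 0.268, 1.000):
    print(f"perm_p={perm_p:.3f}  sim_se={simulation_se(perm_p, 1000):.6f}")
```

An adjusted p-value of exactly 1 gives a standard error of 0, which matches the rows where every resample produced a minimum p-value at or below the observed one.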