The Sashelp.LeuTrain and Sashelp.LeuTest data sets provide microarray data from Zou and Hastie (2005). The Sashelp.LeuTrain data set consists of 7129 genes and 38 training samples, and the Sashelp.LeuTest data set consists of the same 7129 genes and 34 testing samples. Among the 38 training samples, 27 are type 1 leukemia (acute lymphoblastic leukemia, coded in the data as 1) and 11 are type 2 leukemia (acute myeloid leukemia, coded in the data as -1).
The following steps display information about the Sashelp.LeuTrain data set and create Figure B.11:
title 'Leukemia Training Data';
proc contents data=sashelp.LeuTrain varnum;
   ods select position;
run;

title 'The First Five Observations and Eleven Variables';
proc print data=sashelp.LeuTrain(obs=5);
   var y x1-x10;
run;

title 'Leukemia Type Variable';
proc freq data=sashelp.LeuTrain;
   tables y;
run;
Figure B.11: Leukemia Training Data
The First Five Observations and Eleven Variables
Obs | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | -1.46240 | -0.64514 | -0.83593 | -1.47040 | -0.91997 | -1.58430 | 0.71239 | -0.54229 | 1.05090 | 0.23649 |
2 | 1 | -0.66480 | 0.20615 | -0.36857 | 0.25822 | -0.47567 | -0.35497 | -1.11940 | -0.29251 | -0.37542 | -0.38760 |
3 | 1 | -0.20049 | 0.37994 | -2.38280 | 0.43960 | -1.22700 | -1.76220 | 0.10464 | -1.80750 | 0.49292 | -1.67000 |
4 | 1 | -0.25776 | 0.27994 | 1.83920 | -1.62950 | -1.28750 | -1.26510 | 0.76334 | -0.61645 | -0.31578 | -0.32193 |
5 | 1 | -0.56457 | -0.39588 | -0.98372 | -0.83741 | -0.41477 | 0.14834 | -0.03550 | -0.10022 | -0.75753 | 0.37068 |
Leukemia Type Variable
y | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
-1 | 11 | 28.95 | 11 | 28.95 |
1 | 27 | 71.05 | 38 | 100.00 |
The results of the PROC CONTENTS step are not displayed here. They show that the data set contains 7130 variables: the leukemia type variable y and the 7129 gene variables x1-x7129.
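Because the Position table from PROC CONTENTS has one row per variable, it can be convenient to confirm the variable count directly. The following step is a minimal sketch of one way to do so, assuming that the Sashelp library is available in your session; it queries the DICTIONARY.COLUMNS table in PROC SQL, and the column alias NumVars is chosen here only for illustration:

```sas
title 'Number of Variables in Sashelp.LeuTrain';
proc sql;
   /* DICTIONARY.COLUMNS stores library and member names in uppercase */
   select count(*) as NumVars
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'LEUTRAIN';
quit;
title;
```

If the data set matches the description above, the reported count should agree with the 7130 variables listed by PROC CONTENTS.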
The following steps display information about the Sashelp.LeuTest data set and create Figure B.12:
title 'Leukemia Test Data';
proc contents data=sashelp.LeuTest varnum;
   ods select position;
run;

title 'The First Five Observations and Eleven Variables';
proc print data=sashelp.LeuTest(obs=5);
   var y x1-x10;
run;

title 'Leukemia Type Variable';
proc freq data=sashelp.LeuTest;
   tables y;
run;
Figure B.12: Leukemia Test Data
The First Five Observations and Eleven Variables
Obs | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | -1.38240 | 0.06288 | 0.62252 | 1.61210 | 0.52179 | 0.11516 | -1.85270 | -0.39956 | 0.88007 | -0.86565 |
2 | 1 | 0.65192 | -0.35476 | 2.29630 | 1.64980 | 0.50211 | -0.37315 | 1.76820 | -1.74270 | 1.63080 | 0.60171 |
3 | 1 | 0.65409 | 1.41340 | 0.22593 | -0.06719 | 0.30015 | 0.76964 | -0.26212 | 0.94481 | -0.51884 | -0.60999 |
4 | 1 | 1.07220 | 0.01959 | 0.16875 | 0.84779 | 0.24533 | 0.79682 | 0.41442 | 0.35122 | -0.70177 | 1.85410 |
5 | 1 | 2.12480 | 1.66370 | -0.35986 | 1.15850 | 0.89379 | 0.56310 | -0.92476 | 0.56790 | -0.56039 | -2.12400 |
Leukemia Type Variable
y | Frequency | Percent | Cumulative Frequency | Cumulative Percent |
---|---|---|---|---|
-1 | 14 | 41.18 | 14 | 41.18 |
1 | 20 | 58.82 | 34 | 100.00 |
The results of the PROC CONTENTS step are not displayed here. They show that the data set contains 7130 variables: the leukemia type variable y and the 7129 gene variables x1-x7129.
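The two frequency tables show that the proportion of type 1 samples differs between the training data (71.05%) and the test data (58.82%). The following steps are a sketch of one way to view both distributions in a single crosstabulation. The names Work.LeuBoth, Source, and inTrain are hypothetical and are introduced here only for illustration:

```sas
/* Stack only the y variable from both data sets and record the origin */
data work.LeuBoth;
   set sashelp.LeuTrain(keep=y in=inTrain)
       sashelp.LeuTest(keep=y);
   length Source $5;
   Source = ifc(inTrain, 'Train', 'Test');
run;

title 'Leukemia Type by Data Set';
proc freq data=work.LeuBoth;
   /* Keep counts and row percentages; suppress column and overall percentages */
   tables Source*y / nocol nopercent;
run;
title;
```

The KEEP= data set option reads only y, so the 7129 gene variables are never copied into the Work library.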