The Sashelp.LeuTrain and Sashelp.LeuTest data sets provide leukemia microarray data (Golub et al., 1999; Zou and Hastie, 2005). The Sashelp.LeuTrain data set consists of 7,129 genes and 38 training samples, and the Sashelp.LeuTest data set consists of the same 7,129 genes and 34 testing samples. Among the 38 training samples, 27 are type 1 leukemia (acute lymphoblastic leukemia, coded in the data as 1) and 11 are type 2 leukemia (acute myeloid leukemia, coded in the data as -1).
The following steps display information about the Sashelp.LeuTrain data set and create Figure B.12:
title 'Leukemia Training Data';
proc contents data=sashelp.LeuTrain varnum;
   ods select position;
run;

title 'The First Five Observations and 11 Variables';
proc print data=sashelp.LeuTrain(obs=5);
   var y x1-x10;
run;

title 'Leukemia Type Variable';
proc freq data=sashelp.LeuTrain;
   tables y;
run;
Figure B.12: Leukemia Training Data
The First Five Observations and 11 Variables
Obs | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | -1.46240 | -0.64514 | -0.83593 | -1.47040 | -0.91997 | -1.58430 | 0.71239 | -0.54229 | 1.05090 | 0.23649 |
2 | 1 | -0.66480 | 0.20615 | -0.36857 | 0.25822 | -0.47567 | -0.35497 | -1.11940 | -0.29251 | -0.37542 | -0.38760 |
3 | 1 | -0.20049 | 0.37994 | -2.38280 | 0.43960 | -1.22700 | -1.76220 | 0.10464 | -1.80750 | 0.49292 | -1.67000 |
4 | 1 | -0.25776 | 0.27994 | 1.83920 | -1.62950 | -1.28750 | -1.26510 | 0.76334 | -0.61645 | -0.31578 | -0.32193 |
5 | 1 | -0.56457 | -0.39588 | -0.98372 | -0.83741 | -0.41477 | 0.14834 | -0.03550 | -0.10022 | -0.75753 | 0.37068 |
The results of the PROC CONTENTS step are not displayed. The results show that there are 7,130 variables: y and x1-x7129.
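Because the PROC CONTENTS listing runs to thousands of rows, it can be convenient to confirm the variable count directly. The following is a minimal sketch, not part of the original steps, that counts the rows for this data set in the DICTIONARY.COLUMNS table by using PROC SQL. The title text and the column alias NumVars are chosen here for illustration only:

```sas
/* Illustrative check: count the variables in Sashelp.LeuTrain.        */
/* DICTIONARY.COLUMNS stores library and member names in uppercase.    */
title 'Number of Variables in Sashelp.LeuTrain';
proc sql;
   select count(*) as NumVars   /* NumVars is a hypothetical alias */
      from dictionary.columns
      where libname='SASHELP' and memname='LEUTRAIN';
quit;
```

If the description above holds, the single value that is reported should match the 7,130 variables (y plus x1-x7129).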
The following steps display information about the Sashelp.LeuTest data set and create Figure B.13:
title 'Leukemia Test Data';
proc contents data=sashelp.LeuTest varnum;
   ods select position;
run;

title 'The First Five Observations and 11 Variables';
proc print data=sashelp.LeuTest(obs=5);
   var y x1-x10;
run;

title 'Leukemia Type Variable';
proc freq data=sashelp.LeuTest;
   tables y;
run;
Figure B.13: Leukemia Test Data
The First Five Observations and 11 Variables
Obs | y | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | -1.38240 | 0.06288 | 0.62252 | 1.61210 | 0.52179 | 0.11516 | -1.85270 | -0.39956 | 0.88007 | -0.86565 |
2 | 1 | 0.65192 | -0.35476 | 2.29630 | 1.64980 | 0.50211 | -0.37315 | 1.76820 | -1.74270 | 1.63080 | 0.60171 |
3 | 1 | 0.65409 | 1.41340 | 0.22593 | -0.06719 | 0.30015 | 0.76964 | -0.26212 | 0.94481 | -0.51884 | -0.60999 |
4 | 1 | 1.07220 | 0.01959 | 0.16875 | 0.84779 | 0.24533 | 0.79682 | 0.41442 | 0.35122 | -0.70177 | 1.85410 |
5 | 1 | 2.12480 | 1.66370 | -0.35986 | 1.15850 | 0.89379 | 0.56310 | -0.92476 | 0.56790 | -0.56039 | -2.12400 |
The results of the PROC CONTENTS step are not displayed. The results show that there are 7,130 variables: y and x1-x7129.
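The training and test data sets are described as containing the same 7,129 genes. One way to check that the variable names line up is sketched below. It assumes that matching names in DICTIONARY.COLUMNS are a sufficient check, and the title text is chosen here for illustration. The query lists any variable name that appears in Sashelp.LeuTrain but not in Sashelp.LeuTest:

```sas
/* Illustrative check: variable names present in LeuTrain but absent   */
/* from LeuTest. The comparison runs in one direction only; swap the   */
/* two member names to check the other direction.                      */
title 'Variables in LeuTrain That Are Not in LeuTest';
proc sql;
   select name
      from dictionary.columns
      where libname='SASHELP' and memname='LEUTRAIN'
   except
   select name
      from dictionary.columns
      where libname='SASHELP' and memname='LEUTEST';
quit;
```

An empty result in both directions would be consistent with the two data sets sharing the same set of variables. This check compares names only, not variable types or the order of the variables.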