TRANSPOSE Procedure
Example 6: Transposing Data for Statistical Analysis
Features: COPY statement, VAR statement
This example arranges
data to make it suitable for either a multivariate or a univariate
repeated-measures analysis.
The data is from Chapter
8, “Repeated-Measures Analysis of Variance,” in SAS
System for Linear Models, Third Edition.
Program 1
options nodate pageno=1 linesize=80 pagesize=40;
data weights;
input Program $ s1-s7;
datalines;
CONT 85 85 86 85 87 86 87
CONT 80 79 79 78 78 79 78
CONT 78 77 77 77 76 76 77
CONT 84 84 85 84 83 84 85
CONT 80 81 80 80 79 79 80
RI 79 79 79 80 80 78 80
RI 83 83 85 85 86 87 87
RI 81 83 82 82 83 83 82
RI 81 81 81 82 82 83 81
RI 80 81 82 82 82 84 86
WI 84 85 84 83 83 83 84
WI 74 75 75 76 75 76 76
WI 83 84 82 81 83 83 82
WI 86 87 87 87 87 87 86
WI 82 83 84 85 84 85 86
;
data split;
set weights;
array s{7} s1-s7;
Subject + 1;
do Time=1 to 7;
Strength=s{time};
output;
end;
drop s1-s7;
run;
proc print data=split(obs=15) noobs;
title 'SPLIT Data Set';
title2 'First 15 Observations Only';
run;
Program Description
Set the SAS system options. The
NODATE option suppresses the display of the date and time in the output.
PAGENO= specifies the starting page number. LINESIZE= specifies the
output line length, and PAGESIZE= specifies the number of lines on
an output page.
options nodate pageno=1 linesize=80 pagesize=40;
Create the WEIGHTS data set. The
data in WEIGHTS represents the results of an exercise therapy study
of three weight-lifting programs: CONT is a control group, RI is a
program in which the number of repetitions is increased, and WI is
a program in which the weight is increased.
data weights;
input Program $ s1-s7;
datalines;
CONT 85 85 86 85 87 86 87
CONT 80 79 79 78 78 79 78
CONT 78 77 77 77 76 76 77
CONT 84 84 85 84 83 84 85
CONT 80 81 80 80 79 79 80
RI 79 79 79 80 80 78 80
RI 83 83 85 85 86 87 87
RI 81 83 82 82 83 83 82
RI 81 81 81 82 82 83 81
RI 80 81 82 82 82 84 86
WI 84 85 84 83 83 83 84
WI 74 75 75 76 75 76 76
WI 83 84 82 81 83 83 82
WI 86 87 87 87 87 87 86
WI 82 83 84 85 84 85 86
;
Create the SPLIT data set. This
DATA step rearranges WEIGHTS to create the data set SPLIT. The DATA
step transposes the strength values and creates two new variables:
Time and Subject. SPLIT contains one observation for each repeated
measure. SPLIT can be used in a PROC GLM step for a univariate repeated-measures
analysis.
data split;
set weights;
array s{7} s1-s7;
Subject + 1;
do Time=1 to 7;
Strength=s{time};
output;
end;
drop s1-s7;
run;
Print the SPLIT data set. The
NOOBS option suppresses the printing of observation numbers. The
OBS= data set option limits the printing to the first 15 observations.
SPLIT has 105 observations.
proc print data=split(obs=15) noobs;
title 'SPLIT Data Set';
title2 'First 15 Observations Only';
run;
Output 1
Split Data Set
SPLIT Data Set 1
First 15 Observations Only
Program  Subject  Time  Strength
CONT           1     1        85
CONT           1     2        85
CONT           1     3        86
CONT           1     4        85
CONT           1     5        87
CONT           1     6        86
CONT           1     7        87
CONT           2     1        80
CONT           2     2        79
CONT           2     3        79
CONT           2     4        78
CONT           2     5        78
CONT           2     6        79
CONT           2     7        78
CONT           3     1        78
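The description of SPLIT states that it can be used in a PROC GLM step for a univariate repeated-measures analysis. The following is a minimal sketch of what such a step might look like, assuming the SPLIT data set created by Program 1. The model specification (a split-plot formulation with Subject nested within Program) is an assumption for illustration and is not part of this example.

```sas
/* Hedged sketch: univariate (split-plot) repeated-measures analysis. */
/* Assumes the SPLIT data set created by Program 1.                   */
proc glm data=split;
   /* All three classification variables are categorical here. */
   class program subject time;
   /* Subject is nested within Program because each subject     */
   /* follows exactly one program.                              */
   model strength = program subject(program) time program*time;
   /* Test the Program effect against the between-subject error. */
   test h=program e=subject(program);
run;
quit;
```

The TEST statement is included because, in this formulation, the default F test for Program would use the residual error rather than the subject-to-subject variation.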
Program 2
options nodate pageno=1 linesize=80 pagesize=40;
proc transpose data=split out=totsplit prefix=Str;
by program subject;
copy time strength;
var strength;
run;
proc print data=totsplit(obs=15) noobs;
title 'TOTSPLIT Data Set';
title2 'First 15 Observations Only';
run;
Program Description
Set the SAS system options.
options nodate pageno=1 linesize=80 pagesize=40;
Transpose the SPLIT data set. PROC
TRANSPOSE transposes SPLIT to create TOTSPLIT. The TOTSPLIT data set
contains the same variables as SPLIT and a variable for each strength
measurement (Str1-Str7). TOTSPLIT can be used for either a multivariate
repeated-measures analysis or a univariate repeated-measures analysis.
proc transpose data=split out=totsplit prefix=Str;
Organize the output data set into BY groups, and populate
each BY group with untransposed values. The
variables in the BY and COPY statements are not transposed. TOTSPLIT
contains the variables Program, Subject, Time, and Strength with the
same values that are in SPLIT. The BY statement creates the first
observation in each BY group, which contains the transposed values
of Strength. The COPY statement creates the other observations in
each BY group by copying the values of Time and Strength without transposing
them.
by program subject;
copy time strength;
Specify the variable to transpose. The VAR statement specifies the Strength variable
as the only variable to be transposed.
var strength;
run;
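To see what the COPY statement contributes, it can help to compare Program 2 with a step that omits it. The following is a minimal sketch, assuming the SPLIT data set created by Program 1. The output data set name WIDE is hypothetical. Without COPY, PROC TRANSPOSE writes only one observation per BY group, so WIDE should contain 15 observations (one per subject) with the variables Program, Subject, and Str1-Str7.

```sas
/* Hedged sketch: the same transposition without the COPY statement. */
/* WIDE is a hypothetical data set name used only for illustration.  */
proc transpose data=split out=wide(drop=_name_) prefix=Str;
   /* SPLIT is already in Program, Subject order, so no sort is needed. */
   by program subject;
   /* Strength is the only variable transposed into Str1-Str7. */
   var strength;
run;

/* Print the result; each row holds all seven measurements for a subject. */
proc print data=wide noobs;
   title 'WIDE Data Set (No COPY Statement)';
run;
```

A data set shaped like WIDE is enough for a multivariate analysis alone. The COPY statement in Program 2 is what keeps the Time and Strength values alongside it so that the same data set also serves the univariate analysis.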
Print the TOTSPLIT data set. The
NOOBS option suppresses the printing of observation numbers. The
OBS= data set option limits the printing to the first 15 observations.
TOTSPLIT has 105 observations.
proc print data=totsplit(obs=15) noobs;
title 'TOTSPLIT Data Set';
title2 'First 15 Observations Only';
run;
Output 2
In the following output,
the variables in TOTSPLIT with missing values are used only in a multivariate
repeated-measures analysis. The missing values do not preclude this
data set from being used in a repeated-measures analysis because the
MODEL statement in PROC GLM ignores observations with missing values.
TOTSPLIT Data Set
TOTSPLIT Data Set 1
First 15 Observations Only
Program  Subject  Time  Strength  _NAME_    Str1  Str2  Str3  Str4  Str5  Str6  Str7
CONT           1     1        85  Strength    85    85    86    85    87    86    87
CONT           1     2        85               .     .     .     .     .     .     .
CONT           1     3        86               .     .     .     .     .     .     .
CONT           1     4        85               .     .     .     .     .     .     .
CONT           1     5        87               .     .     .     .     .     .     .
CONT           1     6        86               .     .     .     .     .     .     .
CONT           1     7        87               .     .     .     .     .     .     .
CONT           2     1        80  Strength    80    79    79    78    78    79    78
CONT           2     2        79               .     .     .     .     .     .     .
CONT           2     3        79               .     .     .     .     .     .     .
CONT           2     4        78               .     .     .     .     .     .     .
CONT           2     5        78               .     .     .     .     .     .     .
CONT           2     6        79               .     .     .     .     .     .     .
CONT           2     7        78               .     .     .     .     .     .     .
CONT           3     1        78  Strength    78    77    77    77    76    76    77
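The note before this output says that Str1-Str7 are used only in a multivariate repeated-measures analysis and that PROC GLM ignores observations with missing values. The following is a minimal sketch of such a step, assuming the TOTSPLIT data set created by Program 2. The choice of statements and options is an assumption for illustration and is not part of this example.

```sas
/* Hedged sketch: multivariate repeated-measures analysis of TOTSPLIT. */
/* Assumes the TOTSPLIT data set created by Program 2.                 */
proc glm data=totsplit;
   class program;
   /* Str1-Str7 are the seven repeated measurements. Observations  */
   /* in which they are missing are excluded from the analysis, so */
   /* only the first observation in each BY group contributes.     */
   /* NOUNI suppresses the separate univariate analyses.           */
   model str1-str7 = program / nouni;
   /* Define a within-subject factor named Time with 7 levels. */
   repeated time 7;
run;
quit;
```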
Copyright © SAS Institute Inc. All rights reserved.