The TRANSFORM statement lists the variables to be analyzed (variables) and specifies the transformation (transform) to apply to each variable listed. You must specify a transformation for each variable list in the TRANSFORM statement. The variables are variables in the data set. The t-options are transformation options that provide details for the transformation; these depend on the transform chosen. The t-options are listed after a slash in the parentheses that enclose the variables.
For example, the following statements find a quadratic polynomial transformation of all variables in the data set:
proc prinqual;
   transform spline(_all_ / degree=2);
run;
Or, if N1 through N10 are nominal variables and M1 through M10 are ordinal variables, you can use the following statements:
proc prinqual;
   transform opscore(N1-N10) monotone(M1-M10);
run;
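For a run that can be tried end to end, a minimal sketch might look like the following. The data set name Ratings, its values, and the output data set name Results are hypothetical and chosen only for illustration; the example also assumes the PROC PRINQUAL statement options DATA=, OUT=, and N=, which are not described in this section.

```sas
/* Hypothetical data: N1 and N2 are treated as nominal, M1 and M2 as ordinal */
data Ratings;
   input N1 N2 M1 M2;
   datalines;
1 2 1 3
2 1 2 2
3 3 3 1
1 1 2 3
2 3 4 2
3 2 5 1
1 3 1 2
2 2 3 3
;

/* Optimal scoring for the nominal variables, monotone for the ordinal ones */
proc prinqual data=Ratings out=Results n=2;
   transform opscore(N1 N2) monotone(M1 M2);
run;

/* Inspect the output data set, which holds the transformed variables */
proc print data=Results;
run;
```

Each variable list is paired with exactly one transformation, as the TRANSFORM statement requires.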
The following sections describe the transformations available (specified with transform) and the options available for some of the transformations (specified with t-options).
There are three types of transformation families: nonoptimal, optimal, and other. The families are described as follows:
Nonoptimal transformations preprocess the specified variables, replacing each one with a single new nonoptimal, nonlinear transformation.
Optimal transformations replace the specified variables with new, iteratively derived optimal transformation variables that fit the specified model better than the original variable (except in contrived cases where the transformation fits the model exactly as well as the original variable).
Other transformations are the IDENTITY and SSPLINE transformations. These do not fit into either of the preceding categories.
Table 74.2 summarizes the transformations in each family.
Table 74.2: Transformation Families
Transformation | Description |
---|---|
Nonoptimal Transformations | |
ARSIN | Inverse trigonometric sine |
EXP | Exponential |
LOG | Logarithm |
LOGIT | Logit |
POWER | Raises variables to specified power |
RANK | Transforms to ranks |
Optimal Transformations | |
LINEAR | Linear |
MONOTONE | Monotonic, ties preserved |
MSPLINE | Monotonic B-spline |
OPSCORE | Optimal scoring |
SPLINE | B-spline |
UNTIE | Monotonic, ties not preserved |
Other Transformations | |
IDENTITY | Identity, no transformation |
SSPLINE | Iterative smoothing spline |
The transform is followed by a variable (or list of variables) enclosed in parentheses. Optionally, depending on the transform, the parentheses can also contain t-options, which follow the variables and a slash. For example, the following statement computes the LOG transformation of X and Y:
transform log(X Y);
A more complex example follows:
transform spline(Y / nknots=2) log(X1 X2 X3);
The preceding statement uses the SPLINE transformation of the variable Y and the LOG transformation of the variables X1, X2, and X3. In addition, it uses the NKNOTS= option with the SPLINE transformation and specifies two knots.
The rest of this section provides syntax details for members of the three families of transformations. The t-options are discussed in the section Transformation Options (t-options).
Nonoptimal transformations are computed before the iterative algorithm begins. Nonoptimal transformations create a single new transformed variable that replaces the original variable. The new variable is not transformed by the subsequent iterative algorithms (except for a possible linear transformation and missing value estimation).
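As an illustration of this preprocessing step, the following sketch applies two nonoptimal transformations alongside one optimal transformation. The data set Sample, its values, and the output data set name are hypothetical. The example assumes that the PARAMETER= t-option supplies the base of the logarithm for LOG and the exponent for POWER, and that the PROC statement accepts DATA= and OUT=.

```sas
/* Hypothetical data; X1 and X2 are positive so LOG and a fractional POWER are defined */
data Sample;
   input X1 X2 X3;
   datalines;
1 4 2
10 9 1
100 16 3
1000 25 5
10 36 4
100 49 6
1 64 8
1000 81 7
;

proc prinqual data=Sample out=Transformed;
   transform log(X1 / parameter=10)      /* assumed: PARAMETER= sets the base (base 10 here) */
             power(X2 / parameter=0.5)   /* assumed: PARAMETER= sets the power (square root here) */
             monotone(X3);               /* optimal transformation, derived iteratively */
run;
```

In this sketch, X1 and X2 are replaced once, before iteration begins, while X3 is re-estimated on each iteration.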
The following list provides syntax and details for nonoptimal variable transformations.
Optimal transformations are iteratively derived. Missing values for these types of variables can be optimally estimated (see the section Missing Values). For more information about the optimal transformations, see the sections "OPSCORE, MONOTONE, UNTIE, and LINEAR Transformations" and "SPLINE and MSPLINE Transformations" in Chapter 97: The TRANSREG Procedure.
The following list provides syntax and details for optimal transformations.
If you use a nonoptimal, optimal, or other transformation, you can use t-options, which specify additional details of the transformation. The t-options are specified within the parentheses that enclose variables and are listed after a slash. For example:
proc prinqual;
   transform spline(X Y / nknots=3);
run;
The preceding statements find an optimal variable transformation (SPLINE) of the variables X and Y and use a t-option to specify the number of knots (NKNOTS=). The following is a more complex example:
proc prinqual;
   transform spline(Y / nknots=3) spline(X1 X2 / nknots=6);
run;
These statements use the SPLINE transformation for all three variables, each with a t-option: the NKNOTS= option requests three knots for the spline of Y and six knots for the splines of X1 and X2.
The following sections discuss the t-options available for nonoptimal, optimal, and other transformations.
Table 74.3 summarizes the t-options.
Table 74.3: Transformation Options
Option | Description |
---|---|
Nonoptimal Transformation | |
ORIGINAL | Uses original mean and variance |
Parameter Specification | |
PARAMETER= | Specifies miscellaneous parameters |
SM= | Specifies smoothing parameter |
Spline | |
DEGREE= | Specifies the degree of the spline |
EVENLY | Spaces the knots evenly |
KNOTS= | Specifies the interior knots or break points |
NKNOTS= | Creates n knots |
Other t-options | |
NAME= | Renames variables |
REFLECT | Reflects the variable around the mean |
TSTANDARD= | Specifies transformation standardization |
The following t-options are available with the SPLINE and MSPLINE optimal transformations.
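As a hedged illustration of how the spline t-options summarized in Table 74.3 might be combined, consider the following sketch. The data set Curves and its values are hypothetical, as is the output data set name. The example assumes that KNOTS= accepts the n TO m BY p form and that EVENLY is used together with NKNOTS=.

```sas
/* Hypothetical data; X2 ranges from 1 to 8 so that knots at 2, 4, and 6 are interior */
data Curves;
   input X1 X2 X3;
   datalines;
1 1 3
2 3 1
3 2 4
4 5 2
5 4 6
6 7 5
7 6 8
8 8 7
9 2 9
10 5 11
11 3 10
12 7 12
;

proc prinqual data=Curves out=Splined;
   transform spline(X1 / degree=2 nknots=2)     /* quadratic spline with two knots */
             spline(X2 / knots=2 to 6 by 2)     /* interior knots at the values 2, 4, and 6 */
             mspline(X3 / nknots=3 evenly);     /* monotone spline, three evenly spaced knots */
run;
```

In this sketch, NKNOTS= lets the procedure choose the knot locations, whereas KNOTS= fixes them at the listed data values.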
The following t-options are available for all transformations.
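The sketch below shows how two of the general t-options from Table 74.3, REFLECT and TSTANDARD=, might be attached to individual transformations. The data set Scores and its values are hypothetical. The TSTANDARD= values Z and CENTER are assumed here to request standardization to mean zero and variance one, and centering to mean zero, respectively.

```sas
/* Hypothetical data set with three numeric variables */
data Scores;
   input X1 X2 X3;
   datalines;
1 7 2
2 5 4
3 6 3
4 3 6
5 4 5
6 2 8
7 1 7
8 3 9
;

proc prinqual data=Scores out=Standardized;
   transform monotone(X1 / reflect)               /* reflect the transformed variable around its mean */
             linear(X2 / tstandard=z)             /* assumed: mean 0, variance 1 */
             spline(X3 / nknots=1 tstandard=center);  /* assumed: mean 0, variance unchanged */
run;
```

Because these t-options sit inside the parentheses, each one applies only to the variables in its own list.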