You must also decide whether to drop, keep, or rename a variable
before it is read into the program data vector or as it is written
to the new SAS data set. If you use the DROP, KEEP, or RENAME
statement, the action always occurs as the variables are written to
the output data set. With SAS data set options, where you specify the
option determines when the action occurs. If the option is specified
on an input data set, the variable is dropped, kept, or renamed before
it is read into the program data vector. If it is specified on an
output data set, the option is applied as the variable is written to
the new SAS data set. (In the DATA step, an input data set is one
that is specified in a SET, MERGE, or UPDATE statement. An output
data set is one that is specified in the DATA statement.)

Consider the following facts when you make your decision:
- If variables are not written to the output data set and they do
  not require any processing, using an input data set option to
  exclude them from the DATA step is more efficient.
- If you want to rename a variable before processing it in a DATA
  step, you must use the RENAME= data set option in the input data
  set.
- If the action applies to output data sets, you can use either a
  statement or a data set option in the output data set.
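
The points above can be illustrated with a minimal sketch. The data set WORK.SCORES and its variables (ID, NAME, TEST1, TEST2, COMMENT) are hypothetical and exist only for this example; they are not part of the original text.

```sas
/* Hypothetical input data set, created only for illustration. */
data work.scores;
   input id name $ test1 test2 comment $;
   datalines;
1 Ann 88 92 ok
2 Bob 75 81 late
;
run;

/* Input data set options: COMMENT is dropped before it reaches the
   program data vector, and TEST1 is renamed to EXAM1 before any
   processing, so program statements must refer to EXAM1. */
data work.summary;
   set work.scores(drop=comment rename=(test1=exam1));
   total = exam1 + test2;
run;

/* Output data set option: every variable is read and remains
   available to program statements; KEEP= only limits what is
   written to WORK.TOTALS. */
data work.totals(keep=id total);
   set work.scores;
   total = test1 + test2;
run;
```

In the second step, referencing COMMENT or TEST1 in a program statement would not read the input values, because neither name exists in the program data vector under that name after the input options are applied.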

The following table summarizes the action of data set options and
statements when they are specified for input and output data sets.
The last column tells whether the variable is available for
processing in the DATA step and, if you rename a variable, which
name to use in program statements and in other data set options.

Status of Variables and Variable Names When Dropping, Keeping, and Renaming Variables

| Data Set Option or Statement | Purpose | Status of Variable or Variable Name |
|---|---|---|
| DROP=, KEEP= (input data set) | includes or excludes variables from processing | if excluded, variables are not available for use in the DATA step |
| RENAME= (input data set) | changes the name of a variable before processing | use the new name in program statements and output data set options; use the old name in other input data set options |
| DROP, KEEP statements | specify which variables are written to all output data sets | all variables are available for processing |
| RENAME statement | changes the name of variables in all output data sets | use the old name in program statements; use the new name in output data set options |
| DROP=, KEEP= (output data set) | specify which variables are written to individual output data sets | all variables are available for processing |
| RENAME= (output data set) | changes the name of variables in individual output data sets | use the old name in program statements and other output data set options |
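
The naming rules for output data sets in the table can be sketched as follows. The data set names (WORK.SCORES, WORK.RENAMED, WORK.HIGH, WORK.LOW) and variable names are hypothetical and chosen only for illustration.

```sas
/* Hypothetical input data set, created only for illustration. */
data work.scores;
   input id name $ test1 test2;
   datalines;
1 Ann 88 92
2 Bob 75 81
;
run;

/* RENAME statement: program statements use the old name (TEST1),
   while the output data set option KEEP= uses the new name (EXAM1). */
data work.renamed(keep=id exam1 passed);
   set work.scores;
   rename test1=exam1;
   passed = (test1 >= 80);
run;

/* RENAME= and KEEP= as output data set options: program statements
   and the KEEP= option on the same data set both use the old name
   (TEST1). Each output data set gets its own variable list. */
data work.high(keep=id test1 rename=(test1=exam1))
     work.low (keep=id test2);
   set work.scores;
   if test1 >= 80 then output work.high;
   else output work.low;
run;
```

Output data set options are useful here because WORK.HIGH and WORK.LOW need different variable lists; DROP, KEEP, and RENAME statements would apply to both output data sets at once.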