BY-group processing is a method of processing
observations from one or more SAS data sets that are grouped or ordered
by values of one or more common variables. The most common use of
BY-group processing in the DATA step is to combine two or more SAS
data sets by using the BY statement with a SET, MERGE, MODIFY, or
UPDATE statement.
A BY variable is a variable, named in a BY statement, by which the
data set is sorted or indexed. All data sets
must be ordered or indexed on the values of the BY variable if you
use a SET, MERGE, or UPDATE statement. If you use MODIFY, the data
does not need to be ordered, although your program might run more
efficiently with ordered data. All data sets that are being combined
must include one or more BY variables. The position of the BY variable
in the observations does not matter.
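The ordering requirement can be illustrated with a minimal sketch. The data set names (customers, orders) and the variables (cust_id, name, amount) are hypothetical and chosen only for illustration. Both inputs are sorted on the BY variable before the MERGE statement combines them.

```sas
/* Hypothetical input data, deliberately not in cust_id order */
data customers;
   input cust_id name $;
   datalines;
102 Baker
101 Adams
103 Clark
;
run;

data orders;
   input cust_id amount;
   datalines;
103 40
101 25
101 60
;
run;

/* MERGE with a BY statement expects each input to be ordered */
/* (or indexed) on the BY variable, so sort both data sets first */
proc sort data=customers;
   by cust_id;
run;

proc sort data=orders;
   by cust_id;
run;

/* Combine observations that share the same cust_id value */
data combined;
   merge customers orders;
   by cust_id;
run;

proc print data=combined;
run;
```

In this sketch, cust_id is the only BY variable. Its column position differs between the two inputs only in what follows it, and the merge matches on its values rather than on its position.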
A BY value is the value or formatted
value of the BY variable.
A BY group includes all observations
with the same BY value. If you use more than one variable in a BY
statement, a BY group is a group of observations with the same combination
of values for these variables. Each BY group has a unique combination
of values for the variables.
FIRST.variable and LAST.variable are temporary variables that
SAS creates for each BY variable. SAS sets FIRST.variable to 1 when
it is processing the first observation in a BY group, and sets
LAST.variable to 1 when it is processing the last observation in a
BY group. Otherwise, each has the value 0. These assignments
enable you to take different actions, based on whether processing
is starting for a new BY group or ending for a BY group. For more
information, see
How the DATA Step Identifies BY Groups.
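As a minimal sketch of how FIRST.variable and LAST.variable can drive different actions at the start and end of a BY group, the following DATA step totals a value within each group. The data set name (sales) and the variables (region, product, amount, total) are hypothetical. Two BY variables are used, so each BY group is one combination of region and product.

```sas
/* Hypothetical input data */
data sales;
   input region $ product $ amount;
   datalines;
East A 10
East A 15
East B 20
West A 5
West A 7
;
run;

/* Order the data on both BY variables */
proc sort data=sales;
   by region product;
run;

data totals;
   set sales;
   by region product;

   /* FIRST.product is 1 on the first observation of each */
   /* region/product combination, so reset the running total there */
   if first.product then total = 0;

   /* The sum statement retains total across observations */
   total + amount;

   /* LAST.product is 1 on the last observation of the group, */
   /* so write one summary observation per BY group */
   if last.product then output;

   drop amount;
run;

proc print data=totals;
run;
```

Because FIRST.product is also set when an earlier BY variable (region) changes, testing the last variable in the BY statement is enough to detect a new combination of values.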