When SAS writes a numeric variable
to a SAS data set, it writes the number in IBM double-precision floating-point
format (as described in
SAS Language Reference: Concepts). In
this format, 8 bytes are required for storing a number in a SAS data
set with full precision. However, you can use the LENGTH statement
in the DATA step to specify that you want to store a particular numeric
variable in fewer bytes.
Using the LENGTH statement
can greatly reduce the amount of space that is required for storing
your data. For example, if you were storing a series of test scores
whose values could range from 0 to 100, you could use numeric variables
with a length of 2 bytes. Doing so would save 6 bytes of storage
per variable for each observation in your data set.
However, you must use
the LENGTH statement cautiously in order to avoid losing significant
data. One byte is always used to store the exponent and the sign.
The remaining bytes are used for the mantissa. When you store a numeric
variable in fewer than 8 bytes, the least significant digits of the
mantissa are truncated. If the part of the mantissa that is truncated
contains any nonzero digits, then precision is lost.
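The truncation described above can be modeled outside of SAS. The following Python sketch is illustrative only and is not SAS code. It assumes the standard IBM hexadecimal floating-point layout (one sign bit, a 7-bit excess-64 base-16 exponent, and a 56-bit mantissa), and it assumes that storing a variable in fewer bytes simply drops the trailing mantissa bytes. The function names are hypothetical.

```python
from fractions import Fraction


def to_ibm_double(value):
    """Encode a number as 8 bytes: sign and exponent byte, then a 7-byte mantissa."""
    if value == 0:
        return bytes(8)
    sign = 0x80 if value < 0 else 0x00
    frac = Fraction(abs(value))  # exact value of the Python number
    exponent = 0
    # Normalize so that 1/16 <= frac < 1 (first hex digit is nonzero).
    while frac >= 1:
        frac /= 16
        exponent += 1
    while frac < Fraction(1, 16):
        frac *= 16
        exponent -= 1
    if not -64 <= exponent <= 63:
        raise OverflowError("value is outside the range of this format")
    mantissa = int(frac * 2**56)  # keep 56 bits, truncating the rest
    return bytes([sign | (exponent + 64)]) + mantissa.to_bytes(7, "big")


def from_ibm_bytes(data):
    """Decode 2 to 8 stored bytes, treating the missing mantissa bytes as zero."""
    padded = data + bytes(8 - len(data))
    sign = -1 if padded[0] & 0x80 else 1
    exponent = (padded[0] & 0x7F) - 64
    mantissa = int.from_bytes(padded[1:], "big")
    return sign * Fraction(mantissa, 2**56) * Fraction(16) ** exponent


def store_with_length(value, length):
    """Model storing a value in `length` bytes and reading it back (assumed behavior)."""
    if not 2 <= length <= 8:
        raise ValueError("length must be between 2 and 8 bytes")
    return from_ibm_bytes(to_ibm_double(value)[:length])


for value in (100, 257, 0.1):
    for length in (2, 3, 8):
        stored = store_with_length(value, length)
        print(value, length, float(stored), stored == Fraction(value))
```

In this model, 100 survives a length of 2 unchanged, 257 comes back as 256 at a length of 2 but is exact at a length of 3, and 0.1 comes back as 0.0999908447265625 at a length of 3.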
Use the LENGTH statement to shorten
only numeric variables whose values are always integers. Fractional numbers
lose precision if they are truncated. In addition, you must ensure
that every value of your variable can be represented exactly in
the number of bytes that you specify.
Use the following table
to determine the largest integer that can be stored in numeric variables
of various lengths:
Variable Length and Largest Exact Integer
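Length in Bytes | Significant Digits Retained | Largest Integer Represented Exactly
2               | 2                           | 256
3               | 4                           | 65,536
4               | 7                           | 16,777,216
5               | 9                           | 4,294,967,296
6               | 12                          | 1,099,511,627,776
7               | 14                          | 281,474,976,710,656
8               | 16                          | 72,057,594,037,927,936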
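The limits for each length follow from the layout described earlier: with one byte used for the sign and exponent, a variable of length n has n - 1 mantissa bytes, so every integer up to 256 to the power n - 1 is stored exactly. The following Python sketch (illustrative only, with hypothetical function names) computes these limits and picks the shortest safe length for a given maximum integer value. It applies only to the native z/OS representation.

```python
def largest_exact_integer(length):
    """Largest integer N such that every integer from 0 to N is stored exactly."""
    if not 2 <= length <= 8:
        raise ValueError("length must be between 2 and 8 bytes")
    return 256 ** (length - 1)


def min_length_for(max_abs_value):
    """Shortest length (in bytes) that stores all integers up to max_abs_value exactly."""
    for length in range(2, 9):
        if abs(max_abs_value) <= largest_exact_integer(length):
            return length
    raise ValueError("value is too large to store exactly, even in 8 bytes")


for length in range(2, 9):
    limit = largest_exact_integer(length)
    digits = len(str(limit)) - 1  # decimal digits that are always retained
    print(length, digits, f"{limit:,}")

print(min_length_for(100))  # test scores from 0 to 100 fit in 2 bytes
```

Because no warning is issued for truncated data, a check such as `min_length_for` is best run against the largest value that the variable can ever take, not just the largest value currently in the data.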
When you use the OUTREP
option of the LIBNAME statement to create a SAS data set that is written
in a data representation other than one that is native to SAS on
z/OS,
the information in the preceding table does not apply. The largest
integer that can be represented exactly is generally smaller.
Note: No warning is issued when
the length that you specify in the LENGTH statement results in truncated
data.