Compressing data reduces
I/O and disk space but increases CPU time. Whether data compression
is worthwhile for you therefore depends on the resource cost-allocation
policy in your data center. Often the decision comes down to which
resource is more valuable or more limited: DASD space or CPU time.
You can use the portable
SAS system option COMPRESS= to compress all data sets that are created
during a SAS session. Or, use the SAS data set option COMPRESS= to
compress an individual data set. Data sets that contain many long
character variables generally are excellent candidates for compression.
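The two forms of COMPRESS= described above can be sketched as follows. This is a minimal illustration, not a tuned example: the data set names WORK.TRACKING and WORK.TRACKING_NOCOMP, the variable names, and the sample values are all hypothetical.

```sas
/* Session-wide: compress every data set created after this statement. */
options compress=yes;

/* Hypothetical sample data: one long character variable that is mostly blanks. */
data work.tracking;
   length description $ 200;
   do id = 1 to 1000;
      description = 'Short problem report';
      output;
   end;
run;

/* Per data set: the data set option overrides the system option. */
data work.tracking_nocomp(compress=no);
   set work.tracking;
run;

/* PROC CONTENTS reports the Compressed attribute of each data set. */
proc contents data=work.tracking;
run;

proc contents data=work.tracking_nocomp;
run;
```

Comparing the two PROC CONTENTS listings is one way to check which setting actually took effect for a given data set.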
The following tables
illustrate the results of compressing SAS data sets under
z/OS. In both cases, PROC COPY was used to copy
data from an uncompressed source data set into uncompressed and compressed
result data sets, using the system option values COMPRESS=NO and COMPRESS=YES,
respectively.
Note: When you use PROC COPY to
compress a data set, you must include the NOCLONE option in your PROC
COPY statement. Otherwise, PROC COPY propagates all the attributes of the
source data set, including its compression status.
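As a rough sketch of the note above, the following step copies one data set into a second library and lets the current COMPRESS= system option, rather than the source data set's attributes, decide the compression status of the copy. The librefs SRCLIB and OUTLIB, the physical names, and the member name TRACKING are hypothetical, and both libraries are assumed to exist already.

```sas
/* Hypothetical library names; replace them with your own. */
libname srclib 'userid.source.saslib';
libname outlib 'userid.compress.saslib';

/* The copy is compressed because NOCLONE makes PROC COPY use  */
/* the current system option value instead of the source       */
/* data set's own compression status.                          */
options compress=yes;

proc copy in=srclib out=outlib noclone;
   select tracking;   /* copy only this member */
run;
```

Running the same step with OPTIONS COMPRESS=NO would produce the uncompressed copy used for comparison.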
In the following tables,
the CPU row shows how much time was used by an IBM 3090-400S to copy
the data, and the SPACE row shows how much storage (in megabytes)
was used.
For the first table,
the source data set was a problem-tracking data set. This data set
contained mostly long character values, many of which had
trailing blanks.
Compressed Data Comparison 1
For the preceding table,
the CPU cost per megabyte is 0.1 seconds.
For the next table,
the source data set contained mostly numeric data from an MICS performance
database. The results were again good, although not as good as when
mostly character data was compressed.
Compressed Data Comparison 2
For the preceding table,
the CPU cost per megabyte is 1 second.
For more information
about compressing SAS data, see SAS(R) Programming Tips:
A Guide to Efficient SAS(R) Processing.