Transformations in a SAS Data Integration Studio job can produce the following types of intermediate files:
- procedure utility files that are created by the SORT and SUMMARY procedures when these procedures are used in the transformation
- transformation temporary files that are created by the transformation as it is working
- transformation output tables that are created by the transformation when it produces its result; the output for a transformation becomes the input to the next transformation in the flow
By default, procedure utility files, transformation temporary files, and transformation output tables are created in the WORK library. You can use the -WORK invocation option to force all intermediate files to a specified location. You can use the -UTILLOC invocation option to force only utility files to a separate location.
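To confirm where intermediate files are going in a given session, you can write the current settings to the SAS log. The following is a minimal sketch, assuming it is submitted in the same SAS session that runs the job (for example, as pre-processing code); it only reports the settings and does not change them.

```sas
/* Report where intermediate files are written in this session. */
%put NOTE: WORK option    = %sysfunc(getoption(work));    /* value of the WORK system option       */
%put NOTE: WORK directory = %sysfunc(pathname(work));     /* physical path of the WORK library     */
%put NOTE: UTILLOC option = %sysfunc(getoption(utilloc)); /* location used for utility files       */
```

If the reported locations are not the ones you expect, check the invocation options or configuration file that the server uses to start the session.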
Knowledge of intermediate files helps you to perform the following tasks:
- View or analyze the output tables for a transformation and verify that the output is correct.
- Estimate the disk space that is needed for intermediate files.
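As a rough aid for both tasks, the following sketch lists the data sets that are currently in the WORK library, largest first. It assumes that the FILESIZE column of DICTIONARY.TABLES is populated for the WORK library engine at your site, and it probably does not account for procedure utility files, which are not library members.

```sas
/* List WORK data sets with row counts and approximate size on disk. */
proc sql;
   title "Data sets currently in the WORK library";
   select memname  label="Table",
          nobs     label="Rows",
          filesize label="Size on disk" format=sizekmg10.1
      from dictionary.tables
      where libname = "WORK" and
            memtype = "DATA"
      order by filesize desc;
quit;
title;   /* clear the title so that it does not carry over to later steps */
```

Running this after a job has executed interactively shows which output tables are still present and which ones take the most space.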
These intermediate files are usually deleted after they have served their purpose. However, it is possible that some intermediate files might be retained longer than desired in a particular process flow. For example, some user-written transformations might not delete the temporary files that they create.
Utility files are deleted by the SAS procedure that created them. Transformation temporary files are deleted by the transformation that created them. When a SAS Data Integration Studio job is executed in batch, transformation output tables are deleted when the process flow ends or the current server session ends.
When a job is executed interactively in SAS Data Integration Studio, transformation output tables are retained until the Job Editor window is closed or the current server session is ended in some other way (for example, by selecting Actions > Stop from the menu). For information about how transformation output tables can be used to debug the transformations in a job, see Reviewing Temporary Output Tables. As long as you keep the job open in the Job Editor window, the output tables remain in the WORK library on the SAS Workspace Server that executed the job. If this is not what you want, you can manually delete the output tables, or you can close the Job Editor window and open it again, which deletes all intermediate files.
Here is a post-processing macro that can be incorporated into a process flow. It uses the DATASETS procedure to delete all data sets in the Work library, including any intermediate files that have been saved to the Work library.
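%macro clear_work;
   %local work_members;
   /* Collect the names of all data sets in WORK as a comma-separated list. */
   proc sql noprint;
      select memname
         into :work_members separated by ","
         from dictionary.tables
         where
            libname = "WORK" and
            memtype = "DATA";
   quit;
   /* Split the list into MEMBER1, MEMBER2, ... and store the count. */
   data _null_;
      /* Without an explicit length, SYMGET returns only 200 characters. */
      length work_members $32767 this_member $32;
      work_members = symget("work_members");
      num_members = input(symget("sqlobs"), best.);
      do n = 1 to num_members;
         this_member = scan(work_members, n, ",");
         call symput("member"||trim(left(put(n,best.))),trim(this_member));
      end;
      call symput("num_members", trim(left(put(num_members,best.))));
   run;
   /* Delete each data set, but only if at least one was found. */
   %if &num_members gt 0 %then %do;
      proc datasets library = work nolist;
         %do n=1 %to &num_members;
            delete &&member&n;
         %end;
      quit;
   %end;
%mend clear_work;
%clear_work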
Note: The previous macro deletes all data sets in the Work library.
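If you do not need the list of member names, the same cleanup can probably be written more compactly with the KILL option of the DATASETS procedure. This is an alternative sketch, not the macro shown above; it assumes that restricting the deletion to MEMTYPE=DATA matches what you want to remove.

```sas
/* Deletes every data set in WORK as soon as the PROC statement executes. */
/* There is no confirmation step, so place it only where nothing in WORK  */
/* is still needed by later transformations.                              */
proc datasets library=work kill memtype=data nolist;
quit;
```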
The transformation output tables for a process flow remain until the SAS session that is associated with the flow is terminated. Analyze the process flow and determine whether there are output tables that are not being used (especially if these tables are large). If so, you can add transformations to the flow that delete these output tables and free up valuable disk space and memory. For example, you can add a generated transformation that deletes output tables at a certain point in the flow. For details about generated transformations, see Creating and Using a Generated Transformation.
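The code behind such a transformation could look like the following sketch. The macro name %drop_work_tables and the table names w_example1 and w_example2 are hypothetical placeholders, not names that SAS Data Integration Studio generates; substitute the names of the output tables that your flow no longer needs.

```sas
/* Hypothetical helper: delete the named WORK tables if they exist. */
/* TABLES is a space-separated list of member names.                */
%macro drop_work_tables(tables);
   %local i table;
   %let i = 1;
   %let table = %scan(&tables, &i, %str( ));
   %do %while (%length(&table) > 0);
      %if %sysfunc(exist(work.&table)) %then %do;
         proc datasets library=work nolist;
            delete &table;
         quit;
      %end;
      %else %put NOTE: WORK.&table does not exist, nothing to delete.;
      %let i = %eval(&i + 1);
      %let table = %scan(&tables, &i, %str( ));
   %end;
%mend drop_work_tables;

/* Example call with placeholder table names. */
%drop_work_tables(w_example1 w_example2)
```

Unlike a macro that clears the whole library, this approach removes only the tables that you name, so later transformations can still read the output tables that they depend on.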