Pipeline parallelism occurs when Task A and Task B are
interdependent.
For example, a SAS DATA step might be followed by a PROC SORT of
the data set that is created by the DATA step. PROC SORT is dependent
on the execution of the DATA step, because the output of the DATA
step is the input needed by PROC SORT. However, the execution of
the two steps can be overlapped, and the DATA step can pipe its output
into PROC SORT. The piping feature of MP CONNECT provides pipeline
parallelism.

Piping enables you to overlap the execution of SAS DATA steps and
some SAS procedures. This
is accomplished by starting one SAS session to run one DATA step or
SAS procedure and piping its output through a TCP/IP socket as input
into another SAS session that is running another DATA step or SAS
procedure. The pipeline can be extended to include multiple steps
and can span different physical computers. Piping
improves performance not only because it enables overlapped task execution,
but also because intermediate I/O is directed to a TCP/IP pipe instead
of being written to disk by one task and then read from disk by the next
task.
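
The idea of streaming one step's output through a TCP/IP socket into
the next step can be sketched outside SAS. The Python example below is
only an analogy under stated assumptions: it is not SAS syntax and does
not show how MP CONNECT is implemented. The function names (produce,
consume_and_sort, run_pipeline) and the comma-separated row format are
hypothetical, chosen for illustration. A producer thread stands in for
the DATA step, and a consumer stands in for PROC SORT. Rows travel
through a local socket and are never written to an intermediate file.

```python
import socket
import threading


def produce(port, rows):
    """Stand-in for the first step: send each row as soon as it exists."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        for key, value in rows:
            # Assumed wire format: one "key,value" row per line.
            conn.sendall(f"{key},{value}\n".encode("utf-8"))
    # Leaving the with-block closes the socket, which tells the
    # reader that no more rows are coming.


def consume_and_sort(server):
    """Stand-in for the second step: read rows from the socket, then sort."""
    conn, _ = server.accept()
    rows = []
    with conn, conn.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            # Rows are received while the producer is still sending.
            key, value = line.rstrip("\n").split(",")
            rows.append((int(key), value))
    # Sorting needs the complete input, so it runs after the last row.
    return sorted(rows)


def run_pipeline(rows):
    """Run both steps concurrently, connected by a local TCP socket."""
    # Port 0 asks the operating system for any free port.
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]
        producer = threading.Thread(target=produce, args=(port, rows))
        producer.start()
        result = consume_and_sort(server)
        producer.join()
    return result


if __name__ == "__main__":
    print(run_pipeline([(3, "c"), (1, "a"), (2, "b")]))
    # prints [(1, 'a'), (2, 'b'), (3, 'c')]
```

In this sketch the overlap is in the data transfer: the consumer
receives rows while the producer is still generating them, and only
the final sort waits for the end of the input. Replacing the loopback
address with another host name would place the two steps on different
computers, which mirrors the extension across physical computers
described above.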