Because most work done
by the processes in a server's SAS session involves I/O activity,
those processes can spend a significant amount of time waiting for
I/O activity to complete. (This time includes moving the head of a
disk drive to the correct position, waiting for the disk to spin around
to the position of the requested data, and transferring the data from
the disk to the computer's working storage.) In the current release
of
SAS/SHARE software, while
a process in a server's session waits for an I/O operation to complete, the other processes in the server's session do not perform work that uses a different resource (CPU, memory, or messages).
It might seem that this waiting would become a bottleneck for a server, and in a few situations it does. In practice, however, most of a server's memory is used for I/O buffers, and the processes in a server's session usually satisfy most requests for data from buffers that are already in memory.
A server usually allocates
memory for one page of a file each time the file is opened, up to
the number of pages in the file. For example, if the application being
executed by a user opens a file twice, enough of the server's memory
to contain two pages of the file is allocated; if ten users run the
application, space for 20 pages of the file is allocated in the server's
memory. The number of buffers allocated for a file will not exceed
the number of pages in the file.
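The allocation rule described in this paragraph can be summarized roughly as follows (a paraphrase for clarity, not a formula taken from the SAS/SHARE documentation):

   buffers allocated for a file = MIN(number of times the file is currently open, number of pages in the file)

In the example above, ten users each opening the file twice produce 20 opens, so MIN(20, number of pages in the file) buffers are allocated.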
Of course, the pages of the file that are maintained in memory are not the same set of pages all the time. As users request pages of the file that are not in memory, in-memory pages are replaced: a page that has been modified is written back to the file on disk before its buffer is reused, while the buffer of an unmodified page is simply used to read the new page.
A larger page size can
reduce the number of I/O operations required to process a SAS data
file. But it takes longer to read a large page than it takes to read
a small one, so unless most of the observations in a large page are
likely to be accessed by users, large page sizes can increase the
amount of time required to perform I/O activity in the server's SAS
session.
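Because the page size of a SAS data file is fixed when the file is created, changing it means rebuilding the file. The sketch below is a minimal illustration only: the data file SHRLIB.SALES is hypothetical, and the BUFSIZE= value shown is an example rather than a recommendation. See "Choose Page Size Wisely" for guidance on selecting a value.

   /* Rebuild the file, requesting an explicit page size.  */
   /* 6144 bytes is an example value only.                 */
   data shrlib.sales(bufsize=6144);
      set shrlib.sales;
   run;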
There are two patterns
in which data is read from or written to SAS files:
When an application
processes a SAS file in sequential order, no page of the file is read
into or written from the server's memory more than once each time
the file is read or written. Also, observations are transmitted to
and from users' sessions in groups, which conserves the messages resource.
In many applications
that are used with concurrently accessed files, data is accessed in
random order, that is, a user reads the 250th observation, then the
10,000th observation, then the 5th observation, and so forth. When
a file is processed in random order, it is much more difficult to
predict how many times each page of the file will be read into or
written from the memory of a server's SAS session. In addition, only
one observation is transmitted on each message between server and
user, which does not conserve the messages resource.
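The difference between the two patterns can be seen in a minimal SCL-style sketch, shown below under some assumptions: the data set name MYLIB.A is hypothetical, and the open modes and observation numbers are only illustrative (sequential open modes such as 'IS' are discussed in "Specify Sequential Access When an SCL Program Doesn't Need Random Access").

   /* Sequential pattern: read the observations in order.          */
   dsid = open('mylib.a', 'IS');      /* request sequential-only access */
   do while (fetch(dsid) = 0);        /* 0 means an observation was read */
      /* process the current observation */
   end;
   rc = close(dsid);

   /* Random pattern: read individual observations by number. */
   dsid = open('mylib.a', 'I');       /* default mode permits random access */
   rc = fetchobs(dsid, 250);
   rc = fetchobs(dsid, 10000);
   rc = fetchobs(dsid, 5);
   rc = close(dsid);

In the sequential sketch the server can transmit observations to the user's session in groups, while in the random sketch each FETCHOBS call results in its own message between the user's session and the server.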
The "Programming Techniques"
section of this paper offers ideas for reducing the I/O load of a
server under the following topics:
- “Clean Up Your Data Files”
- “Choose the Appropriate Subsetting Strategy”
- “Choose Page Size Wisely”
- “Specify Sequential Access When an SCL Program Doesn't Need Random Access”