SAS Programming Considerations

DATA Step Processing

John and Maria have concurrent access to the SAS data set FUEL in their respective sessions. While Maria is editing the data set DATALIB.FUEL in an FSEDIT window, John can use a SET, a MERGE, or an UPDATE statement in a DATA step to read DATALIB.FUEL. Although John cannot create a new version of DATALIB.FUEL, he can create other data sets or written reports.
The following program shows the effect of implicit locking when two clients access the same SAS data set at the same time:
data _null_;
   set datalib.fuel;
   file report ps=24 n=ps;
   ...
run;

data composit;
   merge datalib.fuel fuel96;
run;
If John uses a SET statement to read DATALIB.FUEL, he cannot specify the KEY= or POINT= option unless he overrides the member-level control. By default, member-level control is required when either of these options are included in a SET statement. Here's an example.
data pressure;
   set fuel (keep=fuel maxpress);
   set datalib.fuel (cntllev=rec) key=fuel;
   ...
run;
If John uses an UPDATE statement or a SET or a MERGE statement with a BY statement to read DATALIB.FUEL, he should consider specifying member-level control to ensure that the data set remains correctly ordered while his DATA step runs. Here's an example.
data composit;
   merge datalib.fuel (cntllev=mem) fuel96;
   by grade;
run;
John cannot create a new version of DATALIB.FUEL, but he can use a MODIFY statement in a DATA step to update the shared data set. Here's an example.
data datalib.fuel;
   modify datalib.fuel;
   if (grade='03N') then
       do;
           grade='3Np';
           revised=today();
           replace datalib.fuel;
       end;
run;
When John uses the preceding DATA step to update an observation that Maria is editing in her FSEDIT window, the replace operation for that observation fails because Maria has the observation locked. This failure causes the DATA step to terminate as soon as the locked observation is encountered. However, any observations that are updated before the termination retain their updated values.
For applications that update shared data by using a MODIFY statement, it is very important to include error-checking statements to prevent failure in the updating process and premature termination. The automatic variable _IORC_ includes the return codes from the read operation (performed by the MODIFY statement) and the update operations (performed by the REPLACE, OUTPUT, and REMOVE statements). The preceding DATA step would be more effective if it was written as follows:
data datalib.fuel;
   modify datalib.fuel;
   if (grade='03N') then
       if (_iorc_ = 0) then 
         /* read with lock for successful update */
           do;
             grade='3Np';
             revised=today();
             replace datalib.fuel;
           end;
       else
             put 'Observation' _n_ 
                '(fuel' fuel ') was not replaced.';
run;
The preceding DATA step checks the value of _IORC_ to determine whether a warning or an error condition occurred while reading an observation in DATALIB.FUEL. If the observation was read successfully, it can be replaced. If the observation could not be read, a message is written to the SAS log to record the failure and to identify the observation that was not updated.
To check for specific values of _IORC_, use the SYSRC macro. For example:
data datalib.fuel;
   modify datalib.fuel;
   if (grade='03N') then
       if (_iorc_ = 0) then 
          /* read with lock for successful update */
           do;
               grade='3Np';
               revised=today();
               replace datalib.fuel;
           end;
       else if (_iorc_ = %sysrc(_SWNOUPD)) then
                put 'Observation' _n_ 
                   '(fuel' fuel ') was not replaced.';
       else
                put 'Observation' _n_ 
                   '(fuel' fuel ') read with rc' _iorc_;
run;
For complete information about the MODIFY statement, see SAS Statements: Reference. For information about the SYSRC macro and _IORC_ return code checking, see SAS Macro Language: Reference.

Using Ordered Data in a Shared Environment

Many applications that use SAS data sets require the data to be stored in sorted order according to the value (or values) of one or more variables. Beginning in SAS 6, indexes can be defined for one or more variables in a SAS data file to help SAS applications maintain the order of the observations in SAS data sets. This prevents the application from having to sort the entire data set for each use. Because SAS detects if indexes are used in its processing, indexes must be carefully defined to avoid inadvertently causing less efficient SAS performance. For more information about defining indexes, see SAS Language Reference: Concepts.
Shared SAS data sets are frequently ordered according to one or more variables. Programmers who develop SAS applications that use shared, ordered data should be aware of the following ways in which shared data can be used:
  • concurrent-update applications
  • reporting applications
Concurrent-update applications usually involve several users who repeat the following type of cycle: select an observation, update data; select another observation, update that data; and so on. If these users specify a WHERE clause to move to the next observation and the variable (or variables) in the WHERE clause are indexed, indexing can improve the server's performance by minimizing the server's effort to search for each observation. Because the users' access pattern when using the concurrent-update applications is often random instead of sequential, processing with an index does not usually increase the amount of physical I/O that is performed by the server for each user.
Reporting applications frequently read the data of one or more shared data sets, capturing the data as it is at that moment, and develop a report from that data. If the application uses a BY statement to return the data in sorted order, the server's performance can vary greatly while the data is being read. The server's performance is based on multiple factors, such as whether the BY variable is indexed, and whether options are added to the BY statement that result in the index not being used.
To optimize the server's performance, the server needs to read the data in its physical, unsorted order, and then sort the data in the SAS process that is used to produce the report. You can do this by using the SORT procedure to read the data in physical order through the server, and produce a sorted data file in your library WORK. Here's an example.
proc sort data=datalib.fuel out=fuel;
   by area;
run;
Alternatively, you can use the SQL procedure to create a temporary SAS data file and sort it by using an ORDER BY clause. Here's an example.
proc sql;
create table fuel as
   select * from datalib.fuel
   order by area;
Defining more indexes than are necessary on shared SAS data sets can increase the amount of memory that a server needs. Avoid defining indexes that will not be used by your applications when they access shared data sets through a server.

Using Non-interactive SAS Applications in a Shared Environment

Shared data is sometimes maintained by SAS applications that use the batch or a non-interactive method of processing. As in interactive applications, these non-interactive applications update SAS files through a server. Non-interactive applications can be written as one or more SCL programs or as a combination of DATA steps and procedures.
Usually, it is important that no other users access any of the shared SAS files while a non-interactive application runs. To ensure uninterrupted access, use the LOCK statement or the SCL LOCK function (for SCL programs) at the beginning of your program to get exclusive access to the SAS files that your application uses. After your program has completed, be sure to release your exclusive access to these SAS files so that other users can access them.
Here is a two-step SAS program that includes a LOCK statement that opens a shared SAS data set and copies to another data set all data that has not been updated for one month. Then the program deletes the data from the original data set. The following example program gives exclusive access to a specific SAS file and clears the exclusive lock after the program has completed processing.
%libdef(datalib);

/* Try to get exclusive access to the SAS data set. */
lock datalib.fuel;

/* Did we succeed? If not, stop here. */
data _null_;
   put "SYSLCKRC=&SYSLCKRC";
   if "&SYSLCKRC" ^= '0' then
      abort return;
run;

/* Copy any observations that have not been updated in */
/* 30 days to a different, locally-accessed library.  */

data permlib.a;
   drop now;
   retain now;
   if (_N_=1) then now=today();
   set datalib.fuel;
   if (accdate<(now− 30)) then output permlib.a;
run;

/* Now delete those observations from the master file. */

proc sql;
delete from datalib.fuel where (accdate<(today()− 30));
quit;

/* Tasks completed. Release the lock on the master file. */

lock datalib.fuel clear;
z/OS Specifics: SAS does not support concurrent host sorts under the z/OS operating environment. Attempting to invoke a host sort while one is already running will cause SAS to revert to the internal sort, which might have an adverse effect on performance. Attempts to run concurrent sorts usually occur in a server environment, but running sorts in a server environment is not recommended.