Reading Records from a KSDS

Three Ways to Read Records from a KSDS

You can read KSDS records with sequential access, direct access, and a combination of both sequential and direct access. (See Access Types for KSDS Operations for more information.) The type of KSDS Read operation is specified with appropriate options in the SAS INFILE statement. Also, you must specify either the VSAMREAD or the VSAMUPDATE global SAS system option in order to read VSAM data sets.
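As a minimal sketch of the setup that the examples in this section assume, the following code enables read access and assigns the MYKSDS fileref before a DATA step reads the data set. The data set name is hypothetical, and the sketch assumes that your site permits setting VSAMREAD in an OPTIONS statement.

```sas
/* Permit SAS to read VSAM data sets in this session. */
options vsamread;

/* Hypothetical data set name; replace it with your own KSDS cluster. */
filename myksds 'your.ksds.cluster' disp=shr;

data _null_;
   infile myksds vsam;   /* the VSAM option requests VSAM processing */
   input;                /* read the next record into the input buffer */
   put _infile_;         /* write the record to the SAS log */
run;
```

Use VSAMUPDATE instead of VSAMREAD if the same session also updates VSAM data sets.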

Reading a KSDS with Sequential Access

By default, KSDS records are read in key order with sequential access. That is, they are read from the beginning to the end of the collating sequence of the key field contents. The following example shows the DATA step that you can use to read a KSDS sequentially:
data one;
   infile myksds vsam ;
   input;
   …more SAS statements…
If you specify the BACKWARD option, the data set is read backward, from the highest key to the lowest.
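A backward read differs from the sequential example only in the INFILE statement. The following sketch assumes the same MYKSDS fileref and writes each record to the SAS log:

```sas
data _null_;
   infile myksds vsam backward;   /* read from the highest key to the lowest */
   input;
   put _infile_;                  /* write the record to the SAS log */
run;
```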

Reading a KSDS with Direct Access

Introduction to Reading a KSDS with Direct Access

A KSDS is read directly using the following methods:
  • keyed direct access by key, approximate key, or generic key
  • addressed direct access by RBA
  • skip sequential access, which is a combination of direct and sequential access
You cannot use both keyed direct and addressed direct access for the same data set in one DATA step.

Keyed Direct Access to a KSDS

To read a KSDS with keyed direct access, specify the key of the record that you want SAS to read. The key can be one of the following:
  • the exact key of the record
  • an approximate key that is less than or equal to the actual key of the record
  • a generic key that specifies the leading portion of the key in the records that you want
Several of the INFILE statement options that are described in SAS Options and Variables for VSAM Processing are used to retrieve KSDS records. These options are GENKEY, KEY=, KEYGE, KEYLEN=, and KEYPOS=.

Access Types for KSDS Operations

| Operation | Read (INFILE/INPUT Statements) | Write (FILE/PUT Statements) |
|---|---|---|
| Read | Sequential; direct by key with KEY= option, by generic key with GENKEY and KEYLEN= options, by alternate key, or by RBA with RBA= option; skip sequential with SKIP and KEY= options | Does not apply |
| Add¹ | Sequential; direct with KEY= option | Direct: specify a unique key in the PUT statement |
| Update | Sequential; direct with KEY= option or RBA= option | Direct: prime key in the PUT statement must match the key of record read to update |
| Erase | Sequential; direct with KEY= option or RBA= option | Direct: the record that is read is the record that is erased |
| Load | Does not apply | Sequential: in prime key order |

¹ The INPUT statement is not required.

KEY= Option

The direct access option KEY= defines a SAS variable whose value is the key of the record that you want to read with an INPUT statement. The following is a simple example of the use of the KEY= option:
data two;
   id= '293652329';
   keyvar= id;
   infile myksds vsam key=keyvar;
   input;
   …more SAS statements…
In the example, VSAM retrieves the record with the ID value of 293652329 from the MYKSDS data set.
The KEY= option can specify a list of variables to create a key up to 256 characters in length. The key that is passed to VSAM is constructed by concatenating the variables specified; blanks are not trimmed.
Unless it is used with the GENKEY option, the key value that is passed to VSAM is either padded with blanks or truncated, as necessary, to equal the key length that was defined when the KSDS was created. (For example, if the KSDS had a key length of 5 instead of 9 characters, the key in the preceding example would be truncated to 29365, and only records that match that value would be retrieved.) With the GENKEY option, SAS treats the value of the KEY= variable as a partial key, so length is not an issue.
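The list form of KEY= can be sketched as follows. The variables REGION and ACCT and their lengths are hypothetical, and the sketch assumes that MYKSDS has a 9-byte key. Because blanks are not trimmed, the declared length of each variable determines how many bytes it contributes to the key.

```sas
data _null_;
   /* Hypothetical key parts: 3 + 6 = 9 bytes, the assumed key length. */
   length region $3 acct $6;
   region = '293';
   acct   = '652329';

   /* SAS concatenates REGION and ACCT, untrimmed, to form the key 293652329. */
   infile myksds vsam key=(region acct);
   input;
   put _infile_;   /* write the retrieved record to the SAS log */
   stop;           /* one direct read only; without STOP the step would repeat it */
run;
```

If either variable were declared longer than its value, the trailing blanks would become part of the key and the request would probably not match the intended record.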

KEYGE Option

Use the KEYGE option to retrieve the first record whose key is equal to or greater than the key specified by the KEY= variable. This approximate key retrieval is useful when the exact key is not known. The KEYGE option applies to all records read from the data set in that DATA step. That is, you cannot turn KEYGE on and off.
The following example retrieves the first record whose key is equal to or greater than the key given, which in this case is 600000000:
data three;
   id= '600000000' ;
   keyvar= id;
   infile myksds vsam key=keyvar keyge;
   input;
  …more SAS statements…
If necessary, the value of KEYVAR is padded with blanks or truncated to equal the key length that was defined when the KSDS was created.

GENKEY Option

The GENKEY option specifies generic key processing. With the GENKEY option, SAS programs treat the value given by the KEY= variable as a partial key (the leading portion) of the record that is to be read. SAS reads only the first record that contains the matching partial key (unless you also specify skip sequential processing). Changing the value of the KEY= variable indicates another generic key retrieval request. The GENKEY option applies to all records read from the data set in that DATA step. That is, you cannot turn GENKEY on and off.
The following example retrieves the first record with a key matching the first part of the key specified by the KEY= variable, KEYVAR:
data four;
   id='578';
   keyvar=id;
   infile myksds vsam key=keyvar genkey;
   input;
   …more SAS statements…
The record that is read is the first record whose key begins with 578.
When you specify both the GENKEY and the SKIP options, SAS retrieves the first record that contains the matching partial key and then reads the following records sequentially. Access is sequential after the first record until you change the value of the KEY= variable, which indicates another direct-access, generic-key retrieval request. See Reading a KSDS with Skip Sequential Access for more information and an example of how to use both the GENKEY and SKIP options.

KEYLEN= Option

Use the KEYLEN= option with the GENKEY option to change the generic key length from one request to the next. KEYLEN= defines a SAS variable that specifies the length of the key to be compared to the keys in the data set. The variable's value is the number of generic key characters passed to VSAM. If you specify GENKEY without the KEYLEN= option, the generic key length is the KEY= variable length (or the sum of the KEY= variable lengths, if a list is specified) that is defined in the KSDS. The following example retrieves the first record that matches the first character of KEYVAR's value, which is 5:
data five;
   id='578';
   keyvar=id;
   klvar=1;
   infile myksds vsam key=keyvar genkey keylen=klvar;
   input;
   …more SAS statements…
The KEYLEN= option has another use. It can also give information about the key field length to the application program. Before the DATA step executes, SAS sets the variable that is specified by KEYLEN= to the actual (maximum) key length that is defined in the KSDS data set. This option enables KSDS keys to be read without knowing the key length in advance. Assign the initial value of the KEYLEN= variable to a different variable if you also intend to set the KEYLEN= variable for generic key processing or if you need to know and use the key-length value later in the DATA step. You might need to name the variable in a RETAIN statement if you need this initial value after the first execution of the DATA step:
data six;
   id='578';
   keyvar=id;
   infile myksds vsam key=keyvar genkey keylen=klvar;
   retain lenkey;
   if _n_=1 then lenkey=klvar;
   put lenkey=;
   klvar=1;
   input;
   …more SAS statements…
In the example, the first two statements assign the key value of the records that are wanted to KEYVAR. On the first iteration of the DATA step, the assignment to LENKEY captures the initial value of the KEYLEN= variable, KLVAR, and the RETAIN statement preserves that value across later iterations. KLVAR is then set to 1 for generic key processing.
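To show the generic key length changing from one request to the next, the following sketch drives the reads from a small request list. The REQUESTS data set and the REQLEN variable are hypothetical. The requested length is kept in REQLEN rather than in KLVAR because the KEYLEN= variable should not be present in a SAS data set that is used as input to the DATA step.

```sas
/* Hypothetical request list: a partial key and the number of  */
/* leading characters to compare for that request.             */
data requests;
   length keyvar $9;
   input keyvar $ reqlen;
   datalines;
578 3
57 2
5 1
;

data _null_;
   set requests;
   infile myksds vsam key=keyvar genkey keylen=klvar;
   klvar = reqlen;   /* number of generic key characters passed to VSAM */
   input;            /* read the first record that matches the partial key */
   put keyvar= klvar= / _infile_;
run;
```

The DATA step ends when the SET statement reaches the end of REQUESTS, so no STOP statement is needed.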

KEYPOS= Option

The KEYPOS= option specifies a numeric SAS variable that VSAM sets to the position of the key in KSDS records before the DATA step executes. The variable is set to the column number, not the offset, which is the column number minus 1. This option enables you to read KSDS keys without knowing their positions in advance.
data seven;
      length keyvar $9;
      infile myksds vsam keypos=kpvar;
      retain kpvar;
      input @kpvar keyvar;
      …more SAS statements…
In the example, VSAM retrieves each record of the KSDS and stores the record key position in the variable KPVAR. Each record's key value is read from the input buffer into the character variable KEYVAR by using the key position value.
It is possible to read KSDS keys without knowing either the key position or length in advance by using the KEYLEN= and the KEYPOS= options together. The SAS variables that you specify with the KEYLEN= and KEYPOS= options should not be present in any SAS data set that is used as input to the DATA step. Use an INPUT statement of the following form, where KPVAR is the KEYPOS= variable, KLVAR is the variable specified by the KEYLEN= option, and KEYVAR is a variable that contains the key. This example reads keys whose lengths are less than or equal to 2000.
infile myksds vsam key=keyvar keypos=kpvar keylen=klvar;
retain kpvar klvar;
input @kpvar keyvar $vary2000. klvar ...

Packed Decimal Data and Key Variables

You can use packed decimal data (date and time values) in a key variable if you request it in the same internal format as the VSAM data set. For a variable key, use the PUT function to produce the key in character format. For example, the following code writes the value 293652329 to the character variable KEYVAR using the packed decimal format PD5.
data dsname;
   id=293652329;
   keyvar=put(id,pd5.);
   infile myksds vsam key=keyvar;
   …more SAS statements…
For a single, known key or the leading portion of the key, use a hexadecimal value in your request as follows:
data dsname;
   keyvar='5789'x;
   infile myksds vsam key=keyvar keyge;
   …more SAS statements…
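To confirm which bytes the PUT function produces before using them as a key, you can display the character value in hexadecimal. This sketch does not access VSAM and reuses the ID value from the earlier example:

```sas
data _null_;
   id = 293652329;
   keyvar = put(id, pd5.);   /* 5-byte packed decimal value in a character variable */
   put keyvar= $hex10.;      /* show the 5 bytes as 10 hexadecimal digits */
run;
```

The log should show keyvar=293652329C: nine digits followed by the sign nibble C, which marks a positive value.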

Keyed Direct Access by Alternate Index

If there is an alternate key index for a KSDS, you can use keyed direct access by alternate keys. The advantage of an alternate index is that you can effectively rearrange records in the data set instead of keeping copies organized in separate ways for different applications. See Keyed Direct Access with an Alternate Index for an introduction to the alternate index concept and a list of references for the topic.
The main difference between the prime key and the alternate key is that there can be many alternate keys, and they can be defined as nonunique. This means that an alternate key can point to more than one record in the base cluster. (For example, if an alternate index by course number is defined over a STUDENT data set that is organized by student ID, several students could have the same course number.) Each alternate index entry would point to several prime key records in the base cluster.
See Using Alternate Indexes for VSAM Data Sets for examples of the control language that defines an alternate index over a KSDS.

Addressed Direct Access by RBA

A KSDS can be read with addressed direct access, which means that a record is retrieved directly by its address. A record's address is relative to the beginning of the data set (relative-byte address or RBA).
To indicate addressed access to KSDS records, use the RBA= option in the INFILE statement to specify the RBA of the record that you want. The RBA= option defines a SAS variable that you set to the RBA of the logical record or control interval that is to be retrieved by an INPUT statement. The address that you specify must correspond to the beginning of a data record (logical record or control interval). Otherwise, the request causes a VSAM logical error. The RBA= variable is not added to the output data set:
data rbas;
  infile myksds vsam;
  input;
  rbanum=_RBA_;
  keep rbanum;
run;

data eight;
  set rbas;
  infile myksds vsam rba=rbanum;
  input;
  …more SAS statements…

Reading a KSDS with Skip Sequential Access

With skip sequential access, the initial record of a series is located with keyed direct access. (VSAM does not permit skip sequential addressed access.) After the first record is obtained, subsequent records are retrieved sequentially. Skip sequential processing improves performance because sequential retrieval requires less overhead and is faster than direct retrieval. Skip sequential access is also useful when you know the key of the first record that you want but do not know (or do not want to specify) the key of subsequent records.
Use the SKIP option in the INFILE statement to specify skip sequential processing. Retrieve the first record directly by specifying the key of the record that you want with the KEY= option in the INFILE statement. When you use the SKIP option, leaving the value of the KEY= variable unchanged turns off direct access and indicates that subsequent records are to be retrieved with sequential access. If you need to know the key of subsequent KSDS records, you can read it from the record itself, because the key is part of the record.
The following sample program illustrates skip sequential retrieval and generic key processing. The program reads in the generic portion of the key, reads all of the records in the KSDS data set with that generic key, and then writes them to the procedure output file. Note that the SKIP option retrieves only the first record with a key matching the KEY= variable. You must supply statements to read additional records.
When processing skip sequentially, remember that you must end the DATA step with a SET or a STOP statement. In the following example program, end-of-file sets the feedback code to 4, and the IF SASRC=4 clause stops the DATA step. If there is no record with the generic key specified, the FEEDBACK= variable is set to 16, a message is printed, and the next observation is processed.
data keys;
   length keyvar keyword1 $1;
   input keyvar $;
   cards;
1
5
8
;

data process;
   set keys;
   file print;
   if _n_=1 then do;
      put 'The KSDS records selected by GENKEY and SKIP are: ';
      put;
   end;

   /* Read all the records with the value of KEYVAR in the key. */
   /* Set KEY= variable for generic skip sequential processing. */
   infile myksds vsam key=keyvar genkey skip feedback=sasrc keypos=kp;
   input @;

   /* Stop if end-of-file. */
   if sasrc=4 | sasrc=16 then do;
      _error_=0;
      if sasrc=4 then stop;

      /* If there is no record with this generic key, print a        */
      /* message to the procedure output file, and go on to the next */
      /* observation.                                                */
      else do;
         sasrc=0;
         put 'There is no record with this generic key: ' keyvar;
         return;
      end;
   end;

   /* Retain the value of KEYVAR to compare the first word of the  */
   /* key of records read with sequential access. Initialize the   */
   /* value of KEYWORD1 to the KEYVAR value to start the loop.     */
   input @ kp keyword1 $;

   /* Sequentially read while the first word of the key matches */
   /* the value of KEYVAR. Write the records to the SAS print   */
   /* file.                                                     */
   do while (keyword1 eq keyvar);
      put _infile_;
      input @;

      /* Stop if end-of-file. */
      if sasrc=4 | sasrc=16 then do;
         _error_=0;
         if sasrc=4 then stop;

         /* If there is no record with this generic key, print a        */
         /* message to the procedure output file, and go on to the next */
         /* observation.                                                */
         else do;
            sasrc=0;
            put 'There is no record with this generic key: ' keyvar;
            return;
         end;
      end;
      input @ kp keyword1 $;
   end;
run;