CALL PRXPOSN Routine

Returns the start position and length for a capture buffer.

Category: Character String Matching
Restriction: Use with the PRXPARSE function.

Syntax

CALL PRXPOSN(regular-expression-id, capture-buffer, start <, length>);

Required Arguments

regular-expression-id

specifies a numeric variable whose value is the pattern identifier that is returned by the PRXPARSE function.

capture-buffer

is a numeric constant, variable, or expression with a value that identifies the capture buffer from which to retrieve the start position and length:

  • If the value of capture-buffer is zero, CALL PRXPOSN returns the start position and length of the entire match.
  • If the value of capture-buffer is between 1 and the number of open parentheses, CALL PRXPOSN returns the start position and length for that capture buffer.
  • If the value of capture-buffer is greater than the number of open parentheses, CALL PRXPOSN returns missing values for the start position and length.

start

is a numeric variable with a returned value that is the position at which the capture buffer is found:

  • If the capture buffer is not found, CALL PRXPOSN returns a zero value for the start position.
  • If the value of capture-buffer is greater than the number of open parentheses in the pattern, CALL PRXPOSN returns a missing value for the start position.

Optional Argument

length

is a numeric variable with a returned value that is the length of the capture buffer in the previous pattern match:

  • If the pattern match is not found, CALL PRXPOSN returns a zero value for the length.
  • If the value of capture-buffer is greater than the number of open parentheses in the pattern, CALL PRXPOSN returns a missing value for length.
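The argument rules above can be illustrated with a short sketch. The pattern, the sample text, and the variable names pos and len are assumptions chosen for illustration. The values noted in the comments are the ones that the argument descriptions lead you to expect, not captured log output.

```sas
data _null_;
   /* Three capture buffers; the third, (am|pm), is optional. */
   patternID = prxparse('/(\d\d):(\d\d)(am|pm)?/');
   text = 'Meet at 09:56 today.';

   if prxmatch(patternID, text) then do;
      /* Buffer 0 is the entire match, here the text 09:56. */
      call prxposn(patternID, 0, pos, len);
      put 'Buffer 0: ' pos= len=;

      /* Buffer 1 is the first parenthesized group, the hour. */
      call prxposn(patternID, 1, pos, len);
      put 'Buffer 1: ' pos= len=;

      /* Buffer 3 exists in the pattern but did not take part in   */
      /* this match, so zero is expected for both values.          */
      call prxposn(patternID, 3, pos, len);
      put 'Buffer 3: ' pos= len=;

      /* Buffer 4 exceeds the number of open parentheses, so       */
      /* missing values are expected for both values.              */
      call prxposn(patternID, 4, pos, len);
      put 'Buffer 4: ' pos= len=;
   end;
run;
```

Testing start for zero or a missing value before calling SUBSTR avoids passing an invalid position to that function.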

Details

The CALL PRXPOSN routine uses the results of PRXMATCH, PRXSUBSTR, PRXCHANGE, or PRXNEXT to return the start position and length of a capture buffer. One of these functions or CALL routines must find a match before CALL PRXPOSN can return meaningful information.
A capture buffer is part of a match, enclosed in parentheses, that is specified in a regular expression. CALL PRXPOSN does not return the text for the capture buffer directly. To retrieve the text, pass the returned start position and length to the SUBSTR function.
For more information about pattern matching, see Pattern Matching Using Perl Regular Expressions (PRX).
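As a minimal sketch of using CALL PRXPOSN after a routine other than PRXMATCH, the following step walks through every match in a string with CALL PRXNEXT and retrieves one capture buffer from each match. The sample text and the variable names matchpos, matchlen, bufpos, and buflen are assumptions made for illustration.

```sas
data _null_;
   patternID = prxparse('/(\d\d):(\d\d)/');
   text = 'Trains leave at 09:56, 11:05, and 14:30.';
   start = 1;
   stop = length(text);

   /* CALL PRXNEXT finds the next match and advances start past it. */
   call prxnext(patternID, start, stop, text, matchpos, matchlen);
   do while (matchpos > 0);
      /* Buffer 2 holds the minutes for the match just found. */
      call prxposn(patternID, 2, bufpos, buflen);
      minutes = substr(text, bufpos, buflen);
      put minutes=;
      call prxnext(patternID, start, stop, text, matchpos, matchlen);
   end;
run;
```

The loop ends when CALL PRXNEXT returns a position of zero, so CALL PRXPOSN is called only after a successful match.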

Comparisons

The CALL PRXPOSN routine is similar to the PRXPOSN function, except that CALL PRXPOSN returns the position and length of the capture buffer rather than the capture buffer itself.
The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in SAS Functions and CALL Routines by Category.
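The difference between the CALL routine and the function can be sketched side by side. This step reuses the pattern and text from Example 1. The variable names ampm_call and ampm_func are assumptions made for illustration, and both variables are expected to hold the same text.

```sas
data _null_;
   patternID = prxparse('/(\d\d):(\d\d)(am|pm)/');
   text = 'The time is 09:56am.';

   if prxmatch(patternID, text) then do;
      /* CALL routine: returns where the buffer is; SUBSTR gets the text. */
      call prxposn(patternID, 3, position, length);
      ampm_call = substr(text, position, length);

      /* Function: takes the source string and returns the buffer text. */
      ampm_func = prxposn(patternID, 3, text);

      put ampm_call= ampm_func=;
   end;
run;
```

The CALL routine is the better fit when the position itself is needed, for example to split the source string around the capture buffer.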

Examples

Example 1: Finding Submatches within a Match

The following example searches a text string for a match to a regular expression and calls the CALL PRXPOSN routine to find the position and length of three submatches.
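data _null_;
   /* Compile the pattern: hour, minute, and am/pm are capture buffers 1-3. */
   patternID = prxparse('/(\d\d):(\d\d)(am|pm)/');
   text = 'The time is 09:56am.';
   /* PRXMATCH must find a match before CALL PRXPOSN returns useful values. */
   if prxmatch(patternID, text) then do;
      /* Get each buffer's position and length, then extract its text. */
      call prxposn(patternID, 1, position, length);
      hour = substr(text, position, length);
      call prxposn(patternID, 2, position, length);
      minute = substr(text, position, length);
      call prxposn(patternID, 3, position, length);
      ampm = substr(text, position, length);
      put hour= minute= ampm=;
      put text=;
   end;
run;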
SAS writes the following lines to the log:
hour=09 minute=56 ampm=am
text=The time is 09:56am.

Example 2: Parsing Time Data

The following example parses time data and writes the results to the SAS log.
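data _null_;
   /* Compile the pattern once and retain its identifier across iterations. */
   if _N_ = 1 then
   do;
      retain patternID;
      /* The third capture buffer, after the period, is optional. */
      pattern = "/(\d+):(\d\d)(?:\.(\d+))?/";
      patternID = prxparse(pattern);
   end;

   array match[3] $ 8;
   input minsec $80.;
   position = prxmatch(patternID, minsec);
   if position ^= 0 then
   do;
      /* PRXPAREN returns the last capture buffer that found a match. */
      do i = 1 to prxparen(patternID);
         call prxposn(patternID, i, start, length);
         /* A start of 0 means that the buffer was not found. */
         if start ^= 0 then
            match[i] = substr(minsec, start, length);
      end;
      put match[1] "minutes, " match[2] "seconds" @;
      if ^missing(match[3]) then
         put ", " match[3] "milliseconds";
   end;
   datalines;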
14:56.456
45:32
;
SAS writes the following lines to the log:
   14 minutes, 56 seconds, 456 milliseconds
   45 minutes, 32 seconds