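A substitution pattern has the form s/search-string/replacement-string/. For example:
prxchange('s/(\w+), (\w+)/$2 $1/', -1, 'Jones, Fred');
The Perl regular expression is s/(\w+), (\w+)/$2 $1/. It captures two words that are separated by a comma and a space, and writes them back in reverse order.
The source string is 'Jones, Fred'. The number of times to search for a match is -1.
The value -1 specifies that matching patterns continue to be replaced until the end of the source is reached.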
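As a minimal sketch, the PRXCHANGE call described above can be run inside a DATA step. The variable name swapped is an assumption made for illustration, and the pattern is written with its closing slash.

```sas
data _null_;
   length swapped $ 40;
   /* Swap "Last, First" into "First Last"; -1 replaces every match. */
   swapped = prxchange('s/(\w+), (\w+)/$2 $1/', -1, 'Jones, Fred');
   /* Writes swapped=Fred Jones to the SAS log. */
   put swapped=;
run;
```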
The following example validates phone numbers of the form (XXX) XXX-XXXX or XXX-XXX-XXXX.
data _null_;                                                 /* 1 */
   if _N_ = 1 then
      do;
         paren = "\([2-9]\d\d\) ?[2-9]\d\d-\d\d\d\d";        /* 2 */
         dash = "[2-9]\d\d-[2-9]\d\d-\d\d\d\d";              /* 3 */
         expression = "/(" || paren || ")|(" || dash || ")/"; /* 4 */
         retain re;
         re = prxparse(expression);                          /* 5 */
         if missing(re) then                                 /* 6 */
            do;
               putlog "ERROR: Invalid expression " expression; /* 7 */
               stop;
            end;
      end;

   length first last home business $ 16;
   input first last home business;

   if ^prxmatch(re, home) then                               /* 8 */
      putlog "NOTE: Invalid home phone number for " first last home;

   if ^prxmatch(re, business) then                           /* 9 */
      putlog "NOTE: Invalid business phone number for " first last business;

   datalines;
Jerome Johnson (919)319-1677 (919)846-2198
Romeo Montague 800-899-2164 360-973-6201
Imani Rashid (508)852-2146 (508)366-9821
Palinor Kent . 919-782-3199
Ruby Archuleta . .
Takei Ito 7042982145 .
Tom Joad 209/963/2764 2099-66-8474
;
run;
1 | Create a DATA step.
2 | Build a Perl regular expression that matches a phone number of the form (XXX)XXX-XXXX, and assign it to the variable PAREN. The pattern uses the following syntax elements: \( matches the opening parenthesis of the area code; [2-9] matches a digit from 2 through 9, which is the first digit of the area code; \d matches any digit; \) matches the closing parenthesis; a space followed by ? matches an optional space; the second [2-9]\d\d matches the three-digit exchange; - matches the hyphen; and \d\d\d\d matches the last four digits.
3 | Build a Perl regular expression that matches a phone number of the form XXX-XXX-XXXX, and assign it to the variable DASH.
4 | Concatenate the regular expressions for (XXX)XXX-XXXX and XXX-XXX-XXXX into one Perl regular expression, so that a single pattern matches either phone number format. The PAREN and DASH expressions are each placed within parentheses. The bar metacharacter (|) between them instructs the compiler to match either pattern. The slashes around the entire pattern mark where the regular expression starts and ends.
5 | Pass the Perl regular expression to PRXPARSE to compile it. PRXPARSE returns an identifier for the compiled pattern. Other Perl regular expression functions and CALL routines use this identifier to work with the compiled pattern.
6 | Use the MISSING function to check whether the regular expression was successfully compiled.
7 | Use the PUTLOG statement to write an error message to the SAS log if the regular expression did not compile.
8 | Search for a valid home phone number. PRXMATCH takes the identifier from PRXPARSE and the search text, and returns the position at which the pattern was found in the search text, or 0 if there is no match. If the home phone number does not match, the PUTLOG statement writes a note to the SAS log.
9 | Search for a valid business phone number in the same way. If the business phone number does not match, the PUTLOG statement writes a note to the SAS log.
The SAS log contains the following notes:
NOTE: Invalid home phone number for Palinor Kent
NOTE: Invalid home phone number for Ruby Archuleta
NOTE: Invalid business phone number for Ruby Archuleta
NOTE: Invalid home phone number for Takei Ito 7042982145
NOTE: Invalid business phone number for Takei Ito
NOTE: Invalid home phone number for Tom Joad 209/963/2764
NOTE: Invalid business phone number for Tom Joad 2099-66-8474
The following example replaces every occurrence of < with &lt;, a common substitution when converting text to HTML.
data _null_;                                          /* 1 */
   input;                                             /* 2 */
   _infile_ = prxchange('s/</&lt;/', -1, _infile_);   /* 3 */
   put _infile_;                                      /* 4 */
   /* 5 */
   datalines;
x + y < 15
x < 10 < y
y < 11
;
run;
1 | Create a DATA step.
2 | Bring an input data record into the input buffer without creating any SAS variables.
3 | Call the PRXCHANGE function to perform the pattern exchange. The format for the regular expression is s/regular-expression/replacement-text/. The s before the regular expression signifies that this is a substitution regular expression. The -1 is a special value that is passed to PRXCHANGE and indicates that all possible replacements should be made.
4 | Write the modified input buffer to the log by using the _INFILE_ option with the PUT statement.
5 | Mark the start of the in-stream input data.
The following PROC SQL query reads the table text_lines, applies the same substitution to the column line, and places the results in a column named html_line:
proc sql;
   select prxchange('s/</&lt;/', -1, line) as html_line
      from text_lines;
quit;
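To try the query, the input table has to exist first. The sketch below builds a small text_lines table with a single column named line and then runs the substitution with &lt; as the replacement text. The table contents, the column length, and the TRUNCOVER option are assumptions made for illustration.

```sas
/* Hypothetical input table: three lines of text in a column named line. */
data text_lines;
   length line $ 40;
   infile datalines truncover;
   input line $char40.;
   datalines;
x + y < 15
x < 10 < y
y < 11
;
run;

/* Replace every < with the HTML entity &lt; and name the result html_line. */
proc sql;
   select prxchange('s/</&lt;/', -1, line) as html_line
      from text_lines;
quit;
```

The pattern is in single quotation marks, so the macro processor does not try to resolve &lt as a macro variable reference.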
The following example validates phone numbers and then extracts the area code from each valid business number to check whether it is a North Carolina area code.
data _null_;                                                 /* 1 */
   if _N_ = 1 then
      do;
         paren = "\(([2-9]\d\d)\) ?[2-9]\d\d-\d\d\d\d";      /* 2 */
         dash = "([2-9]\d\d)-[2-9]\d\d-\d\d\d\d";            /* 3 */
         regexp = "/(" || paren || ")|(" || dash || ")/";    /* 4 */
         retain re;
         re = prxparse(regexp);                              /* 5 */
         if missing(re) then                                 /* 6 */
            do;
               putlog "ERROR: Invalid regexp " regexp;       /* 7 */
               stop;
            end;

         retain areacode_re;
         areacode_re = prxparse("/828|336|704|910|919|252/"); /* 8 */
         if missing(areacode_re) then
            do;
               putlog "ERROR: Invalid area code regexp";
               stop;
            end;
      end;

   length first last home business $ 25;
   length areacode $ 3;
   input first last home business;

   if ^prxmatch(re, home) then
      putlog "NOTE: Invalid home phone number for " first last home;

   if prxmatch(re, business) then                            /* 9 */
      do;
         which_format = prxparen(re);                        /* 10 */
         call prxposn(re, which_format, pos, len);           /* 11 */
         areacode = substr(business, pos, len);
         if prxmatch(areacode_re, areacode) then             /* 12 */
            put "In North Carolina: " first last business;
      end;
   else
      putlog "NOTE: Invalid business phone number for " first last business;

   datalines;
Jerome Johnson (919)319-1677 (919)846-2198
Romeo Montague 800-899-2164 360-973-6201
Imani Rashid (508)852-2146 (508)366-9821
Palinor Kent 704-782-4673 704-782-3199
Ruby Archuleta 905-384-2839 905-328-3892
Takei Ito 704-298-2145 704-298-4738
Tom Joad 515-372-4829 515-389-2838
;
run;
1 | Create a DATA step.
2 | Build a Perl regular expression that matches a phone number of the form (XXX)XXX-XXXX, and assign it to the variable PAREN. The pattern uses the following syntax elements: \( and \) match the literal parentheses around the area code; the unescaped parentheses around [2-9]\d\d capture the area code as a submatch; [2-9] matches a digit from 2 through 9; \d matches any digit; a space followed by ? matches an optional space; and - matches the hyphen.
3 | Build a Perl regular expression that matches a phone number of the form XXX-XXX-XXXX, and assign it to the variable DASH. The parentheses around the first [2-9]\d\d capture the area code as a submatch.
4 | Concatenate the regular expressions for (XXX)XXX-XXXX and XXX-XXX-XXXX into one Perl regular expression, so that a single pattern matches either phone number format. The PAREN and DASH expressions are each placed within parentheses. The bar metacharacter (|) between them instructs the compiler to match either pattern. The slashes around the entire pattern mark where the regular expression starts and ends.
5 | Pass the Perl regular expression to PRXPARSE to compile it. PRXPARSE returns an identifier for the compiled pattern. Other Perl regular expression functions and CALL routines use this identifier to work with the compiled pattern.
6 | Use the MISSING function to check whether the Perl regular expression compiled without error.
7 | Use the PUTLOG statement to write an error message to the SAS log if the regular expression did not compile.
8 | Compile a Perl regular expression that searches a string for a valid North Carolina area code.
9 | Search for a valid business phone number.
10 | Use the PRXPAREN function to determine which submatch to use. PRXPAREN returns the number of the last submatch that was matched. If the phone number matches the form (XXX)XXX-XXXX, PRXPAREN returns the value 2. If the phone number matches the form XXX-XXX-XXXX, PRXPAREN returns the value 4.
11 | Call the PRXPOSN routine to retrieve the position and length of the submatch.
12 | Use the PRXMATCH function to determine whether the area code is a valid North Carolina area code, and write the observation to the log.
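The interaction between PRXPAREN and CALL PRXPOSN can be sketched in isolation. The following DATA step is an illustration only: it writes the combined pattern out as one literal, and the two test values and the PUT statement are assumptions chosen to exercise both phone number formats.

```sas
data _null_;
   length phone $ 16 areacode $ 3;
   /* Groups 1 and 2 belong to the (XXX)XXX-XXXX form; groups 3 and 4
      belong to the XXX-XXX-XXXX form. Groups 2 and 4 hold the area code. */
   re = prxparse('/(\(([2-9]\d\d)\) ?[2-9]\d\d-\d\d\d\d)|(([2-9]\d\d)-[2-9]\d\d-\d\d\d\d)/');
   do phone = '(919)846-2198', '704-782-3199';
      if prxmatch(re, phone) then
         do;
            /* Number of the last submatch that took part in the match. */
            which_format = prxparen(re);
            /* Position and length of that submatch within phone. */
            call prxposn(re, which_format, pos, len);
            areacode = substr(phone, pos, len);
            put phone= which_format= areacode=;
         end;
   end;
run;
```

Following the description of PRXPAREN above, the first value should report submatch 2 and the second should report submatch 4, with the area code extracted in both cases.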
The following example uses the PRXPOSN function to extract the area code from a phone number and reports numbers that are not in North Carolina.
data _null_;                                                 /* 1 */
   length first last phone $ 16;
   retain re;
   if _N_ = 1 then
      do;                                                    /* 2 */
         re = prxparse("/\(([2-9]\d\d)\) ?[2-9]\d\d-\d\d\d\d/"); /* 3 */
      end;

   input first last phone & $16.;

   if prxmatch(re, phone) then
      do;                                                    /* 4 */
         area_code = prxposn(re, 1, phone);                  /* 5 */
         if area_code ^in ("828" "336" "704" "910" "919" "252") then
            putlog "NOTE: Not in North Carolina: " first last phone; /* 6 */
      end;

   /* 7 */
   datalines;
Thomas Archer (919)319-1677
Lucy Mallory (800)899-2164
Tom Joad (508)852-2146
Laurie Jorgensen (252)352-7583
;
run;
1 | Create a DATA step.
2 | On the first iteration of the DATA step only, compile the pattern and store its identifier in re.
3 | Build a Perl regular expression for pattern matching. The pattern uses the following syntax elements: the slashes mark the start and end of the regular expression; \( and \) match the literal parentheses around the area code; the unescaped parentheses around [2-9]\d\d capture the area code as submatch 1; [2-9] matches a digit from 2 through 9; \d matches any digit; a space followed by ? matches an optional space; and - matches the hyphen.
4 | Use PRXMATCH to test whether the phone number matches the pattern. PRXMATCH returns the position at which the match begins, or 0 if there is no match.
5 | Use the PRXPOSN function to return the text of submatch 1, which is the area code.
6 | Search for the area code in the list. If the area code is not valid for North Carolina, use the PUTLOG statement to write a note to the SAS log.
7 | Mark the start of the in-stream input data.
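PRXPOSN exists both as a function and as a CALL routine, and the two return different things. The sketch below contrasts them on one phone number; the variable names area_a and area_b and the test value are assumptions made for illustration.

```sas
data _null_;
   length phone $ 16 area_a area_b $ 3;
   re = prxparse('/\(([2-9]\d\d)\) ?[2-9]\d\d-\d\d\d\d/');
   phone = '(252)352-7583';
   if prxmatch(re, phone) then
      do;
         /* Function form: returns the text of submatch 1. */
         area_a = prxposn(re, 1, phone);
         /* CALL routine form: returns the position and length of submatch 1. */
         call prxposn(re, 1, pos, len);
         area_b = substr(phone, pos, len);
         /* Writes area_a=252 area_b=252 pos=2 len=3 to the SAS log. */
         put area_a= area_b= pos= len=;
      end;
run;
```

The function form is shorter when only the text is needed; the CALL routine form is useful when the position of the submatch matters.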