You can concatenate USS files or directories with the
following methods:
-
associating a fileref with multiple
explicit pathnames enclosed in parentheses
-
specifying a combination of explicit
pathnames and pathname patterns enclosed in parentheses
-
using a single pathname pattern
A pathname pattern is formed by including one or more
UNIX wildcards in a partial pathname.
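As a minimal sketch, the following FILENAME statements show one possible form of each of the three methods. The filerefs (concat1, concat2, concat3) and the pathnames under /u/userid/data are illustrative assumptions.

```sas
/* Method 1: multiple explicit pathnames enclosed in parentheses */
filename concat1 ('/u/userid/data/file1' '/u/userid/data/file2');

/* Method 2: an explicit pathname followed by a pathname pattern */
filename concat2 ('/u/userid/data/file1' '/u/userid/data/test*');

/* Method 3: a single pathname pattern */
filename concat3 '/u/userid/data/test*';
```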
The parenthesis method
is specified in the FILENAME statement. You can use the wildcard method
in the FILENAME, INFILE, and %INCLUDE statements and in the INCLUDE
command. The wildcard method is for input only; you cannot use wildcards
in the FILE statement. The parenthesis method supports input and output.
However, for output, data is written to the first file in the concatenation.
That first file cannot be the result of resolving a wildcard. Requiring
an explicit, complete pathname for the first file greatly reduces the
possibility of accidentally writing to the wrong file.
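The output rule can be sketched as follows, assuming two hypothetical files named first.dat and second.dat and a hypothetical fileref outcat. Because the fileref is used in a FILE statement, the record would be written to first.dat, the first file in the concatenation.

```sas
/* Parenthesis method: the first pathname is explicit, as output requires */
filename outcat ('/u/userid/data/first.dat' '/u/userid/data/second.dat');

data _null_;
   file outcat;          /* output goes to the first file in the concatenation */
   put 'This record is written to first.dat';
run;
```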
The supported
wildcard characters are the asterisk (*), the question mark (?), the
square brackets ([]), and the backslash (\).
The asterisk wildcard
matches zero or more contiguous characters in the corresponding
position of the pathname. It does not match a period (.) at the
beginning of the filename of a hidden file.
Here are some examples
that use the asterisk as a wildcard:
-
In the following FILENAME statement,
the stand-alone asterisk concatenates all of the files (in the specified
directory) except for the hidden UNIX files.
filename test '/u/userid/data/*';
-
In the following INCLUDE statement,
the leading asterisk includes all of the files (in the specified directory)
that end with test.dat.
include '/u/userid/data/*test.dat';
-
In the following INCLUDE statement,
the trailing asterisk includes all of the files (in the specified
directory) that begin with test.
include '/u/userid/data/test*';
-
In the following INCLUDE statement,
the period with a trailing asterisk selects all of the hidden UNIX
files in the specified directory.
include '/u/userid/data/.*';
-
In the following INFILE statement,
the embedded asterisk inputs all of the files (in the specified directory)
that begin with test and end with file.
infile '/u/userid/data/test*file';
The question mark wildcard
matches any single character found in the same relative
position in the pathname. Use one or more question marks instead
of an asterisk to control the length of the matching strings.
Here are some examples
that use the question mark as a wildcard:
-
In the following FILENAME statement,
the stand-alone question mark concatenates all of the files (in the
specified directory) that have a one-character filename.
filename test '/u/userid/data/?';
-
In the following INCLUDE statement,
the leading question mark includes all of the files (in the specified
directory) that have filenames that are nine characters long and end
with test.dat.
include '/u/userid/data/?test.dat';
-
In the following INCLUDE statement,
the trailing question mark includes all of the files (in the specified
directory) that have filenames that are five characters long and begin
with test.
include '/u/userid/data/test?';
-
In the following INFILE statement,
the embedded question mark inputs all of the files (in the specified
directory) with filenames that are ten characters long, begin with
test, and end with file.
infile '/u/userid/data/test??file';
Square brackets match
any single character from the enclosed list that appears in the
corresponding relative position in the pathname. The list can be
specified as a string of characters or as a range. A range is defined
by a starting character and an ending character separated by a
hyphen (-).
The interpretation of
what is included between the starting and ending characters is controlled
by the value of the LC_COLLATE variable of the locale that is being
used by UNIX System Services. Attempting to include both uppercase
and lowercase characters, or both alphabetic characters and digits
in a range, increases the risk of unexpected results. The risk can
be minimized by creating a list with multiple ranges and limiting
each range to one of the following sets:
-
a set of lowercase characters
-
a set of uppercase characters
-
a set of digits
Here are some examples
of using square brackets as wildcard characters:
-
In the following FILENAME statement,
the bracketed list sets up a fileref that concatenates any files (in
the specified directory) that are named a, b, or c and that exist.
filename test '/u/userid/data/[abc]';
-
In the following INCLUDE statement,
the leading bracketed list includes all of the files (in the specified
directory) that have filenames that are nine characters long, start
with m, n, o, p, or z, and end with
test.dat.
include '/u/userid/data/[m-pz]test.dat';
-
In the following INCLUDE statement,
the trailing bracketed list includes all files (in the specified
directory) with filenames that are five characters long, begin with
test, and end with a decimal digit.
include '/u/userid/data/test[0-9]';
-
In the following INFILE statement,
the embedded bracketed list inputs all files (in the specified directory)
with filenames that are nine characters long, begin with test,
continue with an uppercase or lowercase a, b, or c, and end with file.
infile '/u/userid/data/test[a-cA-C]file';
The backslash is used
as an escape character. It indicates that the character that it precedes
should not be used as a wildcard.
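For illustration, assume a file whose name literally contains an asterisk, such as sales*.dat (a hypothetical name). Preceding the asterisk with a backslash would make the pathname refer to that one file, not to every file that begins with sales and ends with .dat.

```sas
/* \* is treated as a literal asterisk, not as a wildcard */
filename test '/u/userid/data/sales\*.dat';
```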
All of the pathnames
in a concatenation must be for USS files or directories. If your program
reads data from different types of files in the same DATA step, then
you can use the EOF= option in each INFILE statement to direct program
control to a new INFILE statement after each file has been read. For
more information about the EOF= option of the INFILE statement, see
SAS Statements: Reference. A wildcard pattern that resolves to a list of mixed
file types results in an error.
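A minimal sketch of the EOF= technique follows. The pathname pattern, the MVS data set name USERID.SALES.DATA, the variables id and amount, and the label nextfile are all assumptions made for illustration. The first INFILE statement reads the concatenated USS files. When an INPUT statement finds no more records there, control moves to the label, where a second INFILE statement reads a file of a different type.

```sas
data combined;
   /* first source: USS files selected by a pathname pattern */
   infile '/u/userid/data/test*' eof=nextfile;
   input id $ amount;
   return;                        /* write the observation, begin next iteration */
nextfile:
   /* control arrives here when the USS files have no more records */
   infile 'USERID.SALES.DATA';    /* second source: a hypothetical MVS data set */
   input id $ amount;
run;
```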
Wildcards that you use
when you pipe data from SAS to USS commands are not expanded within
the SAS session, but they are passed directly to the USS commands
for interpretation.
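As a sketch, the following statements use the PIPE access method of the FILENAME statement to run the ls command and read its output. The fileref oecmd and the pattern test* are illustrative assumptions. SAS does not expand the pattern. It is handed to the USS shell unchanged, so the shell's own expansion rules apply.

```sas
/* The wildcard is interpreted by the USS shell, not by SAS */
filename oecmd pipe 'ls -l /u/userid/data/test*';

data _null_;
   infile oecmd;
   input;
   put _infile_;         /* echo each line of command output to the SAS log */
run;
```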