FORMAT Procedure
INVALUE Statement
Creates an informat for reading and converting raw
data values.
Syntax
INVALUE <$>name <(informat-option(s))>
<value-range-set(s)>;
Summary of Optional Arguments
Control the attributes of the informat
- DEFAULT=length
- specifies the default length of the informat.
- JUST
- left-justifies all input strings before they are
compared to ranges.
- MAX=length
- specifies a maximum length for the informat.
- MIN=length
- specifies a minimum length for the informat.
- NOTSORTED
- stores values or ranges in the order in which you
define them.
- UPCASE
- converts all input strings to uppercase before they are
compared to ranges.
Control the input template
- value-range-set(s)
- specifies the variable template for reading data.
Required Argument
- name
- names the informat that you are creating.
Restriction: A user-defined informat name cannot be the same as an
informat name that is supplied by SAS.
Requirement: The name must be a valid SAS name. A numeric informat
name can be up to 31 characters in length; a character informat name
can be up to 30 characters in length and cannot end in a number.
If you are creating a character informat, then use a dollar sign ($)
as the first character. Adding the dollar sign to the name is why
a character informat is limited to 30 characters.
Tips: Refer to the informat later by using the name followed
by a period. However, do not use a period after the informat name
in the INVALUE statement.
When SAS prints messages that refer to a user-written
informat, the name is prefixed by an at sign (@). When the informat
is stored, the at sign is prefixed to the name that you specify for
the informat. The addition of the at sign to the name is why the name
is limited to 31 or 30 characters. You need to use the at sign only when
you are using the name in an EXCLUDE or SELECT statement; do not prefix
the name with an at sign when you are associating the informat with
a variable.
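The naming rules above can be sketched as follows. This is a minimal illustration that assumes two hypothetical informats, LEVEL and $DEPT, and a hypothetical data set WORK.STAFF, none of which appear in the original text. The names have no period in the INVALUE statements, take a period in the INPUT statement, and take the at sign prefix only in the SELECT statement:

```sas
proc format;
   /* No period after the informat name in the INVALUE statement. */
   invalue level 'LOW'=1 'HIGH'=2;
   /* A character informat name starts with a dollar sign. */
   invalue $dept 'A'='Admin' 'S'='Sales';
run;

data work.staff;
   length dept $5;
   /* Refer to each informat by its name followed by a period. */
   input rating : level. dept : $dept.;
   datalines;
LOW A
HIGH S
;
run;

proc format fmtlib;
   /* The at sign prefix is needed only in SELECT and EXCLUDE. */
   select @level @$dept;
run;
```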
Optional Arguments
- DEFAULT=length
- specifies the default length of the informat. The value for DEFAULT=
becomes the length of the informat if you do not give a specific length
when you associate the informat with a variable.
The default length
of an informat depends on whether the informat is character or numeric.
The default length of character informats is the length of the longest
informatted value. The default length of a numeric informat is 12 if you
have numeric data to the left of the equal sign. If you have a quoted
string to the left of the equal sign, then the default length is the
length of the longest string.
Tip: As a best practice, if you specify an existing informat
in a value-range set, always specify the DEFAULT= option.
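As an illustration of DEFAULT=, the following sketch assumes a hypothetical PCT informat and a hypothetical data set WORK.RATES. Because the INPUT statement references the informat without a width, the DEFAULT= value supplies the length:

```sas
proc format;
   /* DEFAULT=8 applies whenever the informat is used without a width. */
   invalue pct (default=8)
      'NONE' = 0
      'ALL'  = 100
      other  = _same_;
run;

data work.rates;
   /* pct. carries no explicit width, so the default length of 8 is used. */
   input rate pct.;
   datalines;
NONE
ALL
42.5
;
run;
```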
- JUST
- left-justifies all input strings before they are
compared to ranges.
- MAX=length
- specifies a maximum length for the informat. When you associate
the informat with a variable, you cannot specify a width greater
than the MAX= value.
- MIN=length
- specifies a minimum length for the informat.
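A short sketch of MIN= and MAX= together, assuming a hypothetical SIZE informat and a hypothetical data set WORK.ORDERS. The width used in the INPUT statement stays inside the declared bounds; per the MAX= description, a reference such as size8. would not be allowed here:

```sas
proc format;
   /* Widths from 1 through 6 are permitted; 2 is used when none is given. */
   invalue size (min=1 max=6 default=2)
      'S'  = 1
      'M'  = 2
      'L'  = 3
      'XL' = 4;
run;

data work.orders;
   /* size2. is within the MIN= and MAX= bounds declared above. */
   input sz : size2.;
   datalines;
S
XL
;
run;
```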
- NOTSORTED
- stores values or ranges
for informats or formats in the order in which you define them. If
you do not specify NOTSORTED, then values or ranges are stored in
sorted order by default, and SAS uses a binary searching algorithm
to locate the range that a particular value falls into. If you specify
NOTSORTED, then SAS searches each range in the order in which you
define them until a match is found.
Use NOTSORTED if one of the following is true:
- You know the likelihood of certain ranges occurring, and you want
your informat or format to search those ranges first to save
processing time.
- You want to preserve the order in which you define ranges when you
print a description of the informat or format using the FMTLIB option.
- You want to preserve the order in which you define ranges when you
use the ORDER=DATA option and the PRELOADFMT option to analyze class
variables in PROC MEANS, PROC SUMMARY, or PROC TABULATE.
Do not use NOTSORTED
if the distribution of values is uniform or unknown, or if the number
of values is relatively small. The binary searching algorithm that
SAS uses when NOTSORTED is not specified optimizes the performance
of the search under these conditions.
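The first use case can be sketched as follows. The BAND informat, its ranges, and the data set WORK.SCORES are hypothetical; the sketch assumes that most raw values fall in the range 50-100, so that range is defined first and is therefore searched first:

```sas
proc format;
   /* Ranges are searched in the order listed because of NOTSORTED. */
   invalue band (notsorted)
      50-100  = 3     /* assumed to be the most frequent range */
      10-<50  = 2
      low-<10 = 1
      other   = .;
run;

data work.scores;
   input raw : band.;
   datalines;
72
8
31
;
run;
```

Without NOTSORTED, the same ranges would be stored in sorted order and located with the binary search described above.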
SAS automatically sets
the NOTSORTED option when you use the CPORT and the CIMPORT procedures
to transport informats or formats between operating environments with
different standard collating sequences. This automatic setting of
NOTSORTED can occur when you transport informats or formats between
ASCII and EBCDIC operating environments. If this situation is undesirable,
then do the following:
- Use the CNTLOUT= option in the PROC FORMAT statement to create an
output control data set.
- Use the CPORT procedure to create a transport file for the control
data set.
- Use the CIMPORT procedure in the target operating environment to
import the transport file.
- In the target operating environment, use PROC FORMAT with the
CNTLIN= option to build the formats and informats from the imported
control data set.
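The four steps above can be sketched as follows. The control data set name WORK.FMTCTL and the transport file name 'fmtctl.xpt' are placeholders chosen for illustration, and the sketch assumes that the informats are stored in the default WORK.FORMATS catalog:

```sas
/* Source operating environment */

/* Step 1: write the catalog contents to an output control data set. */
proc format library=work cntlout=work.fmtctl;
run;

/* Step 2: create a transport file for the control data set. */
proc cport data=work.fmtctl file='fmtctl.xpt';
run;

/* Target operating environment, after the transport file is moved */

/* Step 3: import the transport file. */
proc cimport data=work.fmtctl infile='fmtctl.xpt';
run;

/* Step 4: rebuild the formats and informats from the control data set. */
proc format library=work cntlin=work.fmtctl;
run;
```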
- UPCASE
- converts all raw data
values to uppercase before they are compared to the possible ranges.
If you use UPCASE, then make sure the values or ranges that you specify
are in uppercase.
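JUST and UPCASE are often combined. The following sketch assumes a hypothetical ANSWER informat and a hypothetical data set WORK.SURVEY. The ranges are written in uppercase, as the UPCASE description requires, and formatted input is used so that leading blanks in the raw data reach the informat:

```sas
proc format;
   /* JUST removes leading blanks; UPCASE converts the input to uppercase. */
   invalue answer (just upcase)
      'YES', 'Y' = 1
      'NO',  'N' = 0
      other      = .;
run;

data work.survey;
   /* answer5. reads columns 1-5, including any leading blanks. */
   input reply answer5.;
   datalines;
yes
  No
 y
;
run;
```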
- value-range-set(s)
- specifies raw data
and values that the raw data will become. The value-range-set(s) can
be one or more of the following:
value-or-range-1<..., value-or-range-n>=informatted-value |
[existing-informat]
The informat converts
the raw data to the values of
informatted-value on
the right side of the equal sign.
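- value-or-range
- is the raw data value, or range of values, on the left side of the
equal sign that the informat converts.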
- informatted-value
-
is the value that you
want the raw data in value-or-range to
become. Use one of the following forms for informatted-value:
- 'character-string'
-
is a character string
up to 32,767 characters long. Typically, character-string becomes
the value of a character variable when you use the informat to convert
raw data. Use character-string for informatted-value only
when you are creating a character informat. If you omit the single
or double quotation marks around character-string,
then the INVALUE statement assumes that the quotation marks are there.
For hexadecimal literals,
you can use up to 32,767 typed characters, or up to 16,382 represented
characters at two hexadecimal characters per represented character.
- number
-
is a number that becomes
the informatted value. Typically, number becomes
the value of a numeric variable when you use the informat to convert
raw data. Use number for informatted-value when
you are creating a numeric informat. The maximum for number depends
on the host operating environment.
- _ERROR_
-
treats data values
in the designated range as invalid data. SAS assigns a missing value
to the variable, prints the data line in the SAS log, and issues a
warning message.
- _SAME_
-
prevents the informat
from converting the raw data as any other value. For example, the
following GROUP. informat converts values 01 through 20 and assigns
the numbers 1 through 20 as the result. All other values are assigned
a missing value.
invalue group 01-20= _same_
other= .;
- existing-informat
-
is an informat that
is supplied by SAS or an existing user-defined informat. The informat
that you are creating uses the existing informat to convert the raw
data that match value-or-range on
the left side of the equal sign. If you use an existing informat,
then enclose the informat name in square brackets (for example, [date9.])
or with parentheses and vertical bars (for example, (|date9.|)). Do
not enclose the name of the existing informat in single quotation
marks.
Tip: As a best practice, if you specify an existing informat
in a value-range-set, always specify a default value by using the
DEFAULT= option.
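A minimal sketch of an existing informat in a value-range set, assuming a hypothetical VISIT informat and a hypothetical data set WORK.VISITS. The string UNKNOWN becomes a missing value, every other value is passed to the SAS-supplied DATE9. informat, and DEFAULT= is specified as the tip recommends:

```sas
proc format;
   invalue visit (default=9)
      'UNKNOWN' = .
      other     = [date9.];   /* square brackets, no quotation marks */
run;

data work.visits;
   input seen : visit.;
   format seen date9.;
   datalines;
15JAN2024
UNKNOWN
;
run;
```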
Examples
Example 1: Create a Character Informat for Raw Data Values
The $GENDER. character informat converts the raw data values F and M
to character values '1' and '2':
invalue $gender 'F'='1'
                'M'='2';
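To show the informat in use, the following sketch wraps the INVALUE statement in a PROC FORMAT step and reads two records. The data set WORK.PEOPLE and the variable names are assumptions made for illustration:

```sas
proc format;
   invalue $gender 'F'='1'
                   'M'='2';
run;

data work.people;
   length code $1;
   /* The informat is referenced with a period after its name. */
   input name $ code : $gender1.;
   datalines;
Ann F
Bob M
;
run;
```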
The dollar sign prefix
indicates that the informat converts character data.
Example 2: Create Character and Numeric Values or a Range of Values
When you create numeric informats, you can specify character strings
or numbers for value-or-range. For example, the TRIAL. informat
converts any character string that sorts between A and M to the
number 1 and any character string that sorts between N and Z to the
number 2. The informat treats the unquoted range 1-3000 as a numeric
range, which includes all numeric values between 1 and 3000, and
converts those values to the number 3:
invalue trial 'A'-'M'=1
              'N'-'Z'=2
              1-3000=3;
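A brief usage sketch, assuming a hypothetical data set WORK.TRIALS. Each record holds either a single letter or a number, so that one record falls into each of the three ranges:

```sas
proc format;
   invalue trial 'A'-'M'=1
                 'N'-'Z'=2
                 1-3000=3;
run;

data work.trials;
   input grp : trial.;
   datalines;
C
Q
250
;
run;
```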
Example 3: Create an Informat Using _ERROR_ and _SAME_
The CHECK. informat
uses _ERROR_ and _SAME_ to convert values of 1 through 4 and 99.
All other values are invalid:
invalue check 1-4=_same_
              99=.
              other=_error_;
If you use a numeric
informat to convert character strings that do not correspond to any
values or ranges, then you receive an error message.
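The behavior can be tried with a sketch such as the following, which assumes a hypothetical data set WORK.RESPONSES. The first record falls in the _SAME_ range, the second maps to a missing value, and the third matches only OTHER, so it is treated as invalid data:

```sas
proc format;
   invalue check 1-4   = _same_
                 99    = .
                 other = _error_;
run;

data work.responses;
   input answer : check.;
   datalines;
3
99
7
;
run;
```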
Copyright © SAS Institute Inc. All rights reserved.