Elements for Columns
Define the variables for the SAS data set.
Syntax
COLUMN name="name"
retain="NO | YES" class="ORDINAL | FILENAME | FILEPATH"
TYPE
DATATYPE
DEFAULT
ENUM
FORMAT width="w"
ndec="d"
INFORMAT width="w"
ndec="d"
DESCRIPTION
LENGTH
PATH syntax="type"
INCREMENT-PATH syntax="type"
beginend="BEGIN | END"
RESET-PATH syntax="type"
beginend="BEGIN | END"
DECREMENT-PATH syntax="type"
beginend="BEGIN | END"
Elements
- COLUMN name="name"
retain="NO | YES" class="ORDINAL | FILENAME | FILEPATH"
-
is an element that
contains a variable definition. For example, <COLUMN name="Title">.
- name="name"
-
specifies the name
for the variable. The name must be a valid SAS name, which can be
up to 32 characters.
Requirement: The name= attribute is required.
- retain="NO | YES"
-
is an optional attribute
that determines the contents of the input buffer at the beginning
of each observation.
- NO
-
sets the value for
the beginning of each observation either to MISSING or to the value
of the DEFAULT element if specified.
- YES
-
keeps the current value
until it is replaced by a new, nonmissing value. Specifying YES is
much like the RETAIN statement in DATA step processing. It forces
the retention of processed values after an observation is written
to the output SAS data set.
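As a minimal sketch of the retain= attribute, the following column definition would keep the most recent nonmissing value across observations. The column name and length are illustrative assumptions, and the location path reuses the NHL markup from the PATH examples in this section.

```xml
<!-- Hypothetical column: the name and length are illustrative only. -->
<COLUMN name="CONFERENCE" retain="YES">
  <PATH syntax="XPath"> /NHL/CONFERENCE </PATH>
  <TYPE> character </TYPE>
  <DATATYPE> string </DATATYPE>
  <LENGTH> 10 </LENGTH>
</COLUMN>
```

With retain="NO", the same column would instead begin each observation as MISSING, or as the value of the DEFAULT element if one were specified.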
- class="ORDINAL | FILENAME | FILEPATH" (XMLV2 only)
-
is an optional attribute
that determines the type of variable.
- ORDINAL
-
specifies that the
variable is a numeric counter variable that keeps track of the number
of times the location path, which is specified by the INCREMENT-PATH
element or the DECREMENT-PATH element, is encountered. (This is similar
to the _N_ automatic variable in DATA step processing.) The counter
variable increments or decrements its count by 1 each time the location
path is encountered. Counter variables can be useful for identifying
individual occurrences of like-named data elements or for counting
observations.
Restriction: When exporting an XML document, variables with class="ORDINAL"
are not included in the output XML document.
Requirements: You must use the INCREMENT-PATH element or the DECREMENT-PATH
element. The PATH element is not allowed.
The TYPE element must specify the SAS data type as numeric,
and the DATATYPE element must specify the type of data as integer.
- FILENAME
-
generates a character
variable that contains the filename and extension of the input document. This
functionality can be useful when you assign a libref for the XML engine
that is associated with a physical location of a SAS library to determine
which file contains a particular value.
Requirement: The TYPE element must specify the SAS data type as character,
and the DATATYPE element must specify the type of data as string.
- FILEPATH
-
generates a character
variable that contains the pathname, filename, and extension of the
input document. This functionality can be useful when you assign a
libref for the XML engine that is associated with a physical location
of a SAS library to determine which file contains a particular observation.
Requirement: The TYPE element must specify the SAS data type as character,
and the DATATYPE element must specify the type of data as string.
Requirement: At least one COLUMN element is required.
Interaction: COLUMN can contain one or more of the following
elements that describe the variable attributes: DATATYPE, DEFAULT,
ENUM, FORMAT, INFORMAT, DESCRIPTION, LENGTH, TYPE, PATH, INCREMENT-PATH,
DECREMENT-PATH, and RESET-PATH.
- TYPE
-
specifies the SAS data
type (character or numeric) for the variable, which is how SAS stores
the data. For example, <TYPE> numeric </TYPE> specifies
that the SAS data type for the variable is numeric.
Requirement: The TYPE element is required.
Tips: To assign a floating-point type, use
<DATATYPE> double </DATATYPE> <TYPE> numeric </TYPE>.
To apply output formatting in SAS, use the FORMAT element.
To control data type conversion on input, use the INFORMAT
element. For example, <INFORMAT> datetime </INFORMAT>.
- DATATYPE
-
specifies the type
of data being read from the XML document for the variable. For example,
<DATATYPE> string </DATATYPE> specifies that the data contains
alphanumeric characters.
The type of data specification can be one of the following:
- string
-
specifies that the
data contains alphanumeric characters and does not contain numbers
used for calculations.
- integer
-
specifies that the
data contains whole numbers used for calculations.
- double
-
specifies that the
data contains floating-point numbers.
- datetime
-
specifies that the
input represents a valid datetime value, which is either
-
in the form of the XML specification
ISO 8601 format. The default form is:
yyyy-mm-ddThh:mm:ss.ffffff.
-
in a form for which a SAS informat
(either supplied by SAS or user-written) properly translates the input
into a valid SAS datetime value. See also the
INFORMAT element.
- date
-
specifies that the
input represents a valid date value, which is either
-
in the form of the XML specification
ISO 8601 format. The default form is:
yyyy-mm-dd.
-
in a form for which a SAS informat
(either supplied by SAS or user-written) properly translates the input
into a valid SAS date value.
See also the INFORMAT element.
- time
-
specifies that the
input represents a valid time value, which is either
-
in the form of the XML specification
ISO 8601 format. The default form is:
hh:mm:ss.ffffff.
-
in a form for which a SAS informat
(either supplied by SAS or user-written) properly translates the input
into a valid SAS time value.
See also the INFORMAT element.
Restriction: The values for previous versions of XMLMap syntax are
not accepted by versions 1.9 and 2.1.
Requirement: The DATATYPE element is required.
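As a sketch of how TYPE and DATATYPE pair up, the following hypothetical columns read an integer and an ISO 8601 date. The column names and the Landfall element are assumptions made for illustration; only the element names and type keywords come from this section.

```xml
<!-- Hypothetical columns: names and paths are illustrative only. -->
<COLUMN name="Month">
  <PATH syntax="XPath"> /Table/Hurricane/Month </PATH>
  <TYPE> numeric </TYPE>          <!-- how SAS stores the value -->
  <DATATYPE> integer </DATATYPE>  <!-- what the XML data contains -->
</COLUMN>

<COLUMN name="Landfall">
  <PATH syntax="XPath"> /Table/Hurricane/Landfall </PATH>
  <TYPE> numeric </TYPE>          <!-- SAS dates are stored as numbers -->
  <DATATYPE> date </DATATYPE>     <!-- input expected as yyyy-mm-dd -->
  <FORMAT> E8601DA </FORMAT>      <!-- display the stored date in ISO form -->
</COLUMN>
```

Input that is not in the ISO 8601 form would presumably also need an INFORMAT element that can translate it, as described for the date value above.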
- DEFAULT
-
is an optional element
that specifies a default value for a missing value for the variable.
Use the DEFAULT element to assign a nonmissing value to missing data.
For example, <DEFAULT> single </DEFAULT> assigns
the value single when a missing value occurs.
Default: By default, the XML engine sets a missing value to
MISSING.
- ENUM
-
is an optional element
that contains a list of valid values for the variable. The ENUM element
can contain one or more VALUE elements to list the values. By using
ENUM, values in the XML document are verified against the list of
values. If a value is not valid, it is either set to MISSING (by default)
or set to the value specified by the DEFAULT element. Note that a
value specified for DEFAULT must be one of the ENUM values in order
to be valid.
<COLUMN name="filing_status">
.
.
.
<DEFAULT> single </DEFAULT>
.
.
.
<ENUM>
<VALUE> single </VALUE>
<VALUE> married filing joint return </VALUE>
<VALUE> married filing separate return </VALUE>
<VALUE> head of household </VALUE>
<VALUE> qualifying widow(er) </VALUE>
</ENUM>
</COLUMN>
- FORMAT width="w"
ndec="d"
-
is an optional element
that specifies a SAS format for the variable. A format name can be
up to 31 characters for a character format and 32 characters for a
numeric format. A SAS format is an instruction that SAS uses to write
values. You use formats to control the written appearance of values.
Do not include a period (.) as part of the format name. Specify the
width and the decimal value as attributes (width= and ndec=), not as
part of the format name.
- width="w"
-
is an optional attribute
that specifies a format width, which for most formats is the number
of columns in the output data.
- ndec="d"
-
is an optional attribute
that specifies a decimal scaling factor for numeric formats.
Here is an example:
<FORMAT> E8601DA </FORMAT>
<FORMAT width="8"> best </FORMAT>
<FORMAT width="8" ndec="2"> dollar </FORMAT>
- INFORMAT width="w"
ndec="d"
-
is an optional element
that specifies a SAS informat for the variable. An informat name can
be up to 30 characters for a character informat and 31 characters
for a numeric informat. A SAS informat is an instruction that SAS
uses to read values into a variable (that is, to store the values).
Do not include a period (.) as part of the informat name. Specify
the width and the decimal value as attributes (width= and ndec=), not
as part of the informat name.
Here is an example:
<INFORMAT> E8601DA </INFORMAT>
<INFORMAT width="8"> best </INFORMAT>
<INFORMAT width="8" ndec="2"> dollar </INFORMAT>
- width="w"
-
is an optional attribute
that specifies an informat width, which for most informats is the
number of columns in the input data.
- ndec="d"
-
is an optional attribute
that specifies a decimal scaling factor for numeric informats. SAS
divides the input data by 10 to the power of this value.
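As a rough worked illustration of the ndec= scaling described above, assume a hypothetical input field that contains the digits 12345 with no decimal point. The input value is made up for the example.

```xml
<!-- Sketch under stated assumptions: with ndec="2", an input of 12345
     would be divided by 10 to the power of 2 (that is, by 100), so the
     stored value would be 123.45. -->
<INFORMAT width="8" ndec="2"> best </INFORMAT>
```

In general SAS informat behavior, this kind of scaling applies only when the input has no explicit decimal point. Treat that as an assumption to verify for the informat that you use.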
- DESCRIPTION
-
is an optional element
that specifies a description for the variable, which can be up to
256 characters. The following example shows that the description is
assigned as the variable label.
<DESCRIPTION> Story link </DESCRIPTION>
- LENGTH
-
is the maximum field
storage length from the XML data for a character variable. The value
refers to the number of bytes used to store each of the variable's
values in the SAS data set. The value can be 1 to 32,767. During the
input process, a maximum length of characters is read from the XML
document and transferred to the observation buffer. For example,
<LENGTH> 200 </LENGTH>.
Restriction: LENGTH is not valid for numeric data.
Requirement: For data that is defined as a STRING data type, the LENGTH
element is required.
Tip: You can use LENGTH to truncate a long field.
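The following sketch shows LENGTH alongside the other elements that a character column needs. The column name, location path, and description text are hypothetical.

```xml
<!-- Hypothetical character column; the name, path, and description
     are illustrative only. -->
<COLUMN name="Title">
  <PATH syntax="XPath"> /Catalog/Story/Title </PATH>
  <TYPE> character </TYPE>
  <DATATYPE> string </DATATYPE>
  <LENGTH> 200 </LENGTH>  <!-- required because DATATYPE is string -->
  <DESCRIPTION> Story title </DESCRIPTION>
</COLUMN>
```

Under this definition, at most 200 bytes of each title would be read, which is the truncation behavior that the tip above describes.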
- PATH syntax="type"
-
specifies a location
path that tells the XML engine where in the XML document to locate
and access a specific tag for the current variable. In addition, the
location path tells the XML engine to perform a function, which is
determined by the location path form, to retrieve the value for the
variable. The XPath forms that are supported allow elements and attributes
to be individually included in the generated SAS data set.
- syntax="type"
-
is an attribute that
specifies the type of syntax used in the location path. The syntax
is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes
to be individually included in the generated SAS data set.
Default: XPath
Requirements: The value must be XPath or XPathENR.
If an XML namespace is defined with the NAMESPACES element,
you must specify the type of syntax as XPathENR (XPath with Embedded
Namespace Reference). This is because the syntax is different from
the XPath specification. For example, syntax="XPathENR".
To specify the PATH location path, use one of the following forms:
CAUTION: These forms are the only XPath forms that the XML engine supports.
If you use any other valid W3C form, the results will be unpredictable.
- element-form
-
selects PCDATA (parsed
character data) from a named element. The following element forms
enable you to select from a named element, conditionally select from
a named element based on a specific attribute value, or conditionally
select from a named element based on a specific occurrence of the
element using the position function:
<PATH> /LEVEL/ITEM </PATH>
<PATH> /LEVEL/ITEM[@attr="value"] </PATH>
<PATH> /LEVEL/ITEM[position()=n]|[n] </PATH>
-
The following location path tells
the XML engine to scan the XML markup until it finds the CONFERENCE
element. The XML engine retrieves the value between the <CONFERENCE>
start tag and the </CONFERENCE> end tag.
<PATH> /NHL/CONFERENCE </PATH>
-
The following location path tells
the XML engine to scan the XML markup until it finds the TEAM element
where the value of the founded= attribute is 1993. The XML engine
retrieves the value between the <TEAM> start tag and the </TEAM>
end tag.
<PATH> /NHL/CONFERENCE/DIVISION/TEAM[@founded="1993"] </PATH>
-
The following location path uses
the position function to tell the XML engine to scan the XML markup
until it finds the fifth occurrence of the TEAM element. The XML engine
retrieves the value between the <TEAM> start tag and the </TEAM>
end tag.
<PATH> /NHL/CONFERENCE/DIVISION/TEAM[position()=5] </PATH>
You can use the following
shorter version for the position function:
<PATH> /NHL/CONFERENCE/DIVISION/TEAM[5] </PATH>
- attribute-form
-
selects values from
an attribute. The following attribute forms enable you to select from
a specific attribute or conditionally select from a specific attribute
based on the value of another attribute:
<PATH> /LEVEL/ITEM/@attr </PATH>
<PATH> /LEVEL/ITEM/@attr[@attr2="value"] </PATH>
-
The following location path tells
the XML engine to scan the XML markup until it finds the TEAM element.
The XML engine retrieves the value from the abbrev= attribute.
<PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION/TEAM/@abbrev </PATH>
-
The following location path tells
the XML engine to scan the XML markup until it finds the TEAM element.
The XML engine retrieves the value from the founded= attribute where
the value of the abbrev= attribute is ATL. The two attributes must
be for the same element.
<PATH> /NHL/CONFERENCE/DIVISION/TEAM/@founded[@abbrev="ATL"] </PATH>
Requirements: Whether the PATH element is required or allowed is determined
by the class="ORDINAL" attribute for the COLUMN element. If the class="ORDINAL"
attribute is not specified, which is the default, PATH is required
and INCREMENT-PATH, DECREMENT-PATH, and RESET-PATH are not allowed.
If the class="ORDINAL" attribute is specified, PATH is not allowed,
INCREMENT-PATH or DECREMENT-PATH is required, and RESET-PATH is optional.
If an XML namespace is defined with the NAMESPACES element,
you must include the identification number in the location path preceding
the element that is being defined. The identification number is enclosed
in braces. For example, <PATH syntax="XPathENR">/Table/Hurricane/{1}Month</PATH>.
See Including Namespace Elements in an XMLMap.
The XPath construction is a formal specification that
puts a path description similar to UNIX on each element of the XML
structure. XPath syntax is case sensitive. For example, if an element
tag name is uppercase, it must be uppercase in the location path.
If it is lowercase, it must be lowercase in the location path. All
location paths must begin with the root-enclosing element (denoted
by a slash '/'), or with the "any parent" variant (denoted by double
slashes '//'). Other W3C documented forms are not currently supported.
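To tie the supported forms together, here is a sketch of three column definitions against the NHL markup used in the examples above. The column names and lengths are assumptions made for illustration, not values from an actual XMLMap.

```xml
<!-- Hypothetical column definitions; names and lengths are illustrative. -->

<!-- Element form: selects the PCDATA between <TEAM> and </TEAM>. -->
<COLUMN name="TEAM">
  <PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION/TEAM </PATH>
  <TYPE> character </TYPE>
  <DATATYPE> string </DATATYPE>
  <LENGTH> 30 </LENGTH>
</COLUMN>

<!-- Attribute form: selects the value of the abbrev= attribute on TEAM. -->
<COLUMN name="ABBREV">
  <PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION/TEAM/@abbrev </PATH>
  <TYPE> character </TYPE>
  <DATATYPE> string </DATATYPE>
  <LENGTH> 3 </LENGTH>
</COLUMN>

<!-- Attribute form: selects the value of the founded= attribute on TEAM. -->
<COLUMN name="FOUNDED">
  <PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION/TEAM/@founded </PATH>
  <TYPE> numeric </TYPE>
  <DATATYPE> integer </DATATYPE>
</COLUMN>
```

Because XPath syntax is case sensitive, a path such as /nhl/conference/division/team would not match the uppercase tags that these examples assume.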
- INCREMENT-PATH syntax="type"
beginend="BEGIN | END"
-
specifies a location
path for a counter variable, which is established by specifying the
COLUMN element attribute class="ORDINAL". The location path tells
the XML engine where in the input data to increment the accumulated
value for the counter variable by 1.
- syntax="type"
-
is an optional attribute
that specifies the type of syntax in the location path. The syntax
is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes
to be individually included in the generated SAS data set. For example, syntax="XPath".
Default: XPath
Requirements: The value must be XPath or XPathENR.
If an XML namespace is defined with the NAMESPACES element,
you must specify the type of syntax as XPathENR (XPath with Embedded
Namespace Reference). This is because the syntax is different from
the XPath specification. For example, syntax="XPathENR".
- beginend="BEGIN | END"
-
is an optional attribute
that specifies whether processing stops when the element start tag
is encountered (BEGIN) or when the element end tag is encountered (END).
Requirements: If an XML namespace is defined with the NAMESPACES element,
you must include the identification number in the location path preceding
the element that is being defined. The identification number is enclosed
in braces. For example, <INCREMENT-PATH syntax="XPathENR">/Table/Hurricane/{1}Month</INCREMENT-PATH>.
The XPath construction is a formal specification that
puts a path description similar to UNIX on each element of the XML
structure. Note that XPath syntax is case sensitive. For example,
if an element tag name is uppercase, it must be uppercase in the location
path. If it is lowercase, it must be lowercase. All location paths
must begin with the root-enclosing element (denoted by a slash '/')
or with the "any parent" variant (denoted by double slashes '//').
Other W3C documented forms are not currently supported.
If the variable is not a counter variable, PATH is required
and INCREMENT-PATH, DECREMENT-PATH, and RESET-PATH are not allowed.
If the variable is a counter variable, PATH is not allowed and either
INCREMENT-PATH or DECREMENT-PATH is required.
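A counter variable that follows the requirements above might be sketched as follows. The column name is hypothetical, and the location path reuses the NHL markup from the PATH examples.

```xml
<!-- Hypothetical counter column: the count goes up by 1 each time a
     TEAM start tag is encountered. -->
<COLUMN name="TEAM_NUM" class="ORDINAL">
  <INCREMENT-PATH syntax="XPath" beginend="BEGIN"> /NHL/CONFERENCE/DIVISION/TEAM </INCREMENT-PATH>
  <TYPE> numeric </TYPE>          <!-- required for class="ORDINAL" -->
  <DATATYPE> integer </DATATYPE>  <!-- required for class="ORDINAL" -->
</COLUMN>
```

Note that the definition contains no PATH element, because PATH is not allowed for a counter variable.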
- RESET-PATH syntax="type"
beginend="BEGIN | END"
-
specifies a location
path for a counter variable, which is established by specifying the
COLUMN element attribute class="ORDINAL". The location path tells
the XML engine where in the XML document to reset the accumulated
value for the counter variable to zero.
- syntax="type"
-
is an optional attribute
that specifies the type of syntax in the location path. The syntax
is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes
to be individually included in the generated SAS data set. For example, syntax="XPath".
Default: XPath
Requirements: The value must be XPath or XPathENR.
If an XML namespace is defined with the NAMESPACES element,
you must specify the type of syntax as XPathENR (XPath with Embedded
Namespace Reference). This is because the syntax is different from
the XPath specification. For example, syntax="XPathENR".
- beginend="BEGIN | END"
-
is an optional attribute
that specifies whether processing stops when the element start tag
is encountered (BEGIN) or when the element end tag is encountered (END).
Requirements: If the variable is not a counter variable, RESET-PATH
is not allowed. If the variable is a counter variable, RESET-PATH
is optional.
If an XML namespace is defined with the NAMESPACES element,
you must include the identification number in the location path preceding
the element that is being defined. The identification number is enclosed
in braces. For example, <RESET-PATH syntax="XPathENR">/Table/Hurricane/{1}Month</RESET-PATH>.
The XPath construction is a formal specification that
puts a path description similar to UNIX on each element of the XML
structure. Note that XPath syntax is case sensitive. For example,
if an element tag name is uppercase, it must be uppercase in the location
path. If it is lowercase, it must be lowercase. All location paths
must begin with the root-enclosing element (denoted by a slash '/')
or with the "any parent" variant (denoted by double slashes '//').
Other W3C documented forms are not currently supported.
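As a sketch of how RESET-PATH could combine with INCREMENT-PATH, the following hypothetical counter is intended to number teams within each division. The column name is an assumption, and the paths reuse the NHL markup from the PATH examples.

```xml
<!-- Hypothetical counter column: increments at each TEAM and is reset
     to zero each time a DIVISION element is encountered. -->
<COLUMN name="TEAM_IN_DIV" class="ORDINAL">
  <INCREMENT-PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION/TEAM </INCREMENT-PATH>
  <RESET-PATH syntax="XPath"> /NHL/CONFERENCE/DIVISION </RESET-PATH>
  <TYPE> numeric </TYPE>
  <DATATYPE> integer </DATATYPE>
</COLUMN>
```

Without the RESET-PATH element, the same counter would keep accumulating across divisions instead of restarting.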
- DECREMENT-PATH syntax="type"
beginend="BEGIN | END"
-
specifies a location
path for a counter variable, which is established by specifying the
COLUMN element attribute class="ORDINAL". The location path tells
the XML engine where in the input data to decrement the accumulated
value for the counter variable by 1.
- syntax="type"
-
is an optional attribute
that specifies the type of syntax in the location path. The syntax
is valid XPath construction in compliance with the W3C specifications.
The XPath form supported by the XML engine allows elements and attributes
to be individually included in the generated SAS data set. For example, syntax="XPath".
Default: XPath
Requirements: The value must be XPath or XPathENR.
If an XML namespace is defined with the NAMESPACES element,
you must specify the type of syntax as XPathENR (XPath with Embedded
Namespace Reference). This is because the syntax is different from
the XPath specification. For example, syntax="XPathENR".
- beginend="BEGIN | END"
-
is an optional attribute
that specifies whether processing stops when the element start tag
is encountered (BEGIN) or when the element end tag is encountered (END).
Requirements: If the variable is not a counter variable, DECREMENT-PATH
is not allowed. If the variable is a counter variable, either DECREMENT-PATH
or INCREMENT-PATH is required.
If an XML namespace is defined with the NAMESPACES element,
you must include the identification number in the location path preceding
the element that is being defined. The identification number is enclosed
in braces. For example, <DECREMENT-PATH syntax="XPathENR">/Table/Hurricane/{1}Month</DECREMENT-PATH>.
The XPath construction is a formal specification that
puts a path description similar to UNIX on each element of the XML
structure. XPath syntax is case sensitive. For example, if an element
tag name is uppercase, it must be uppercase in the location path.
If it is lowercase, it must be lowercase in the location path. All
location paths must begin with the root-enclosing element (denoted
by a slash '/'), or with the "any parent" variant (denoted by double
slashes '//'). Other W3C documented forms are not currently supported.
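A decrementing counter might be sketched as follows. The column name is hypothetical, the path reuses the NHL markup from the PATH examples, and the starting value of the counter is not stated in this section, so the resulting values are not shown.

```xml
<!-- Hypothetical counter column: the count goes down by 1 each time a
     TEAM end tag is encountered. -->
<COLUMN name="DEC_COUNT" class="ORDINAL">
  <DECREMENT-PATH syntax="XPath" beginend="END"> /NHL/CONFERENCE/DIVISION/TEAM </DECREMENT-PATH>
  <TYPE> numeric </TYPE>
  <DATATYPE> integer </DATATYPE>
</COLUMN>
```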