The X12 Procedure

Special Data Sets

The X12 procedure can read the MDLINFOIN= data set and write the MDLINFOOUT= data set. Both data sets have the same structure. The difference is that when the MDLINFOIN= data set is read, only information relevant to specifying a model is processed, whereas the MDLINFOOUT= data set contains the results of estimating a model. The X12 procedure can also read data sets that contain event definition data. The structure of these data sets is the same as in the SAS® High Performance Forecasting system.

MDLINFOIN= and MDLINFOOUT= Data Sets

The MDLINFOIN= and MDLINFOOUT= data sets can contain the following variables:

BY variables

enable the model information to be specified by BY groups. BY variables can be included in this data set that match the BY variables used to process the series. If no BY variables are included, then the models specified by _NAME_ in the MDLINFOIN= data set apply to all BY groups in the DATA= data set.

_NAME_

should contain the variable name of the time series to which a particular model is to be applied. Omit the _NAME_ variable if you are specifying the same model for all series in a BY group.

_MODELTYPE_

specifies whether the observation contains regression or ARIMA information. The value of _MODELTYPE_ should be either REG to supply regression information or ARIMA to supply ARIMA model information. If valid regression information exists in the MDLINFOIN= data set for a BY group and series being processed, then the REGRESSION, INPUT, and EVENT statements are ignored for that BY group and series. Likewise, if valid ARIMA model information exists in the data set, then the AUTOMDL, ARIMA, and TRANSFORM statements are ignored. Valid values for the other variables in the data set depend on the value of the _MODELTYPE_ variable. Although other values of _MODELTYPE_ might be permitted in other SAS procedures, PROC X12 recognizes only REG and ARIMA.

_MODELPART_

further qualifies the regression information in the observation. For _MODELTYPE_=REG, valid values of _MODELPART_ are INPUT, EVENT, and PREDEFINED. A value of INPUT indicates that this observation refers to the user-defined variable whose name is given in _DSVAR_. Likewise, a value of EVENT indicates that the observation refers to the SAS or user-defined event whose name is given in _DSVAR_. PREDEFINED indicates that the name given in _DSVAR_ is a predefined U.S. Census Bureau variable. If only ARIMA model information is included in the data set (that is, all observations have _MODELTYPE_=ARIMA), then the _MODELPART_ variable can be omitted. For observations where _MODELTYPE_=ARIMA, valid values for _MODELPART_ are FORECAST, ".", or blank.

_COMPONENT_

further qualifies the regression or ARIMA information in the observation. For _MODELTYPE_=REG, the only valid value of _COMPONENT_ is SCALE. For _MODELTYPE_= ARIMA, the valid values of _COMPONENT_ are TRANSFORM, CONSTANT, NONSEASONAL, and SEASONAL. TRANSFORM indicates that the observation contains the information that would be supplied in the TRANSFORM statement. CONSTANT is specified to control the constant term in the model. NONSEASONAL and SEASONAL refer to the AR, MA, and differencing terms in the ARIMA model.

_PARMTYPE_

further qualifies the regression or ARIMA information in the observation. For _MODELTYPE_=REG, the value of _PARMTYPE_ is the same as the value of the USERTYPE= option in the REGRESSION statement. Since the USERTYPE= option applies only to user-defined events and variables, the value of _PARMTYPE_ does not alter processing in observations where _MODELPART_=PREDEFINED. However, it is consistent to use a value for _PARMTYPE_ that matches the U.S. Census Bureau predefined variable. For the constant term in the model information, _PARMTYPE_ should be SCALE. For transformation information, the value of _PARMTYPE_ should be NONE, LOG, LOGIT, SQRT, or BOXCOX. For _MODELTYPE_=ARIMA, valid values of _PARMTYPE_ are AR, MA, and DIF.

_DSVAR_

specifies the variable name associated with the current observation. For _MODELTYPE_=REG, the value of _DSVAR_ is the name of the user-defined variable, the event, or the U.S. Census Bureau predefined variable. For _MODELTYPE_=ARIMA, _DSVAR_ should match the name of the series being processed. If the ARIMA model information applies to more than one series, then _DSVAR_ can be blank or ".", equivalently.

_VALUE_

contains a numerical value that is used as a parameter for certain types of information. For example, the PREDEFINED=EASTER(6) option in the REGRESSION statement is implemented in the MDLINFOIN= data set by using _DSVAR_=EASTER and _VALUE_=6. For a BOXCOX transformation, _VALUE_ is set equal to the $\lambda$ parameter value. For _COMPONENT_=SEASONAL, if _VALUE_ is nonmissing, then _VALUE_ is used as the seasonal period. If _VALUE_ is missing for _COMPONENT_=SEASONAL, then the seasonal period is determined by the interval of the series.

_FACTOR_

applies only to the AR and MA portions of the ARIMA model. The value of _FACTOR_ identifies the factor of the given AR or MA term. Therefore, the value of _FACTOR_ is the same for all observations that are related to the same factor.

_LAG_

identifies the degree for differencing and AR and MA lags. If _COMPONENT_=SEASONAL, then the value in _LAG_ is multiplied by the seasonal period indicated by the value of _VALUE_.

_SHIFT_

contains the shift value for transfer functions. This value is not processed by PROC X12, but it might be processed by other procedures in which transfer functions can be specified.

_NOEST_

indicates whether a parameter associated with the observation is to be estimated. For example, the NOINT option is indicated by _COMPONENT_=CONSTANT with _NOEST_=1 and _EST_=0. _NOEST_=1 indicates that the value in _EST_ is a fixed value. _NOEST_ pertains to the constant term, to AR and MA parameters, and to regression parameters.

_EST_

contains an initial or fixed value for the parameter associated with the observation. _NOEST_=1 indicates that the value in _EST_ is a fixed value. _EST_ pertains to the constant term, to AR and MA parameters, and to regression parameters.

_STDERR_

contains output information about estimated parameters. PROC X12 does not process the variable _STDERR_ when it reads the MDLINFOIN= data set. In the MDLINFOOUT= data set, _STDERR_ contains the standard error that pertains to the estimated parameter in the variable _EST_.

_TVALUE_

contains output information about estimated parameters. PROC X12 does not process the variable _TVALUE_ when it reads the MDLINFOIN= data set. In the MDLINFOOUT= data set, _TVALUE_ contains the t value that pertains to the estimated parameter in the variable _EST_.

_PVALUE_

contains output information about estimated parameters. PROC X12 does not process the variable _PVALUE_ when it reads the MDLINFOIN= data set. In the MDLINFOOUT= data set, _PVALUE_ contains the p-value that pertains to the estimated parameter in the variable _EST_.
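The following Python sketch illustrates how some of the variables above fit together. It models a few MDLINFOIN= observations as dictionaries and applies two rules stated in this section: the _LAG_ value of a SEASONAL observation is multiplied by the seasonal period, and _NOEST_=1 marks the value in _EST_ as fixed. The helper names (effective_lag, est_role), the use of None for a missing value, and the default_period argument are assumptions made for illustration only. The rows are simplified and show only the variables needed here, and the sketch does not describe how PROC X12 is implemented.

```python
def effective_lag(row, default_period):
    """Return the lag, in observations, implied by one ARIMA observation.

    default_period is a hypothetical stand-in for the seasonal period
    that would be determined by the interval of the series.
    """
    lag = row["_LAG_"]
    if row["_COMPONENT_"] == "SEASONAL":
        # A nonmissing _VALUE_ is used as the seasonal period.
        period = row["_VALUE_"] if row.get("_VALUE_") is not None else default_period
        return lag * period
    return lag


def est_role(row):
    """Classify the value in _EST_ as 'fixed' (_NOEST_=1) or 'initial'."""
    return "fixed" if row.get("_NOEST_") == 1 else "initial"


rows = [
    # Counterpart of PREDEFINED=EASTER(6) in the REGRESSION statement.
    {"_MODELTYPE_": "REG", "_MODELPART_": "PREDEFINED",
     "_DSVAR_": "EASTER", "_VALUE_": 6},
    # Nonseasonal MA term at lag 1.
    {"_MODELTYPE_": "ARIMA", "_COMPONENT_": "NONSEASONAL",
     "_PARMTYPE_": "MA", "_LAG_": 1, "_VALUE_": None},
    # Seasonal MA term at lag 1; _VALUE_ is missing, so the period
    # comes from the interval of the series.
    {"_MODELTYPE_": "ARIMA", "_COMPONENT_": "SEASONAL",
     "_PARMTYPE_": "MA", "_LAG_": 1, "_VALUE_": None},
    # Counterpart of the NOINT option: constant term fixed at 0.
    {"_MODELTYPE_": "ARIMA", "_COMPONENT_": "CONSTANT",
     "_PARMTYPE_": "SCALE", "_NOEST_": 1, "_EST_": 0},
]

for row in rows:
    if row["_MODELTYPE_"] == "ARIMA" and "_LAG_" in row:
        # Assume a monthly series, so the default seasonal period is 12.
        print(row["_COMPONENT_"], row["_PARMTYPE_"], effective_lag(row, 12))
```

For a quarterly series, the same seasonal observation with default_period=4 would refer to lag 4 rather than lag 12.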

INEVENT= Data Set

The INEVENT= data set can contain the following variables. When a variable is omitted from the data set, that variable is assumed to have the default value for all observations. The default values are specified in the list.

_NAME_

specifies the event variable name. _NAME_ is displayed with the case preserved. Since _NAME_ is a SAS variable name, the event can be referenced by using any case. The _NAME_ variable is required; there is no default.

_CLASS_

specifies the class of event: SIMPLE, COMBINATION, or PREDEFINED. The default for _CLASS_ is SIMPLE.

_KEYNAME_

contains either a date keyword (SIMPLE EVENT), a predefined event variable name (PREDEFINED EVENT), or an event name (COMBINATION EVENT). All _KEYNAME_ values are displayed in upper case. However, if the _KEYNAME_ value refers to an event name, then the actual name can be of mixed case. The default for _KEYNAME_ is no keyname, designated by ".".

_STARTDATE_

contains either the date timing value or the first date timing value to use in a do-list. The default for _STARTDATE_ is no date, designated by a missing value.

_ENDDATE_

contains the last date timing value to use in a do-list. The default for _ENDDATE_ is no date, designated by a missing value.

_DATEINTRVL_

contains the interval for the date do-list. The default for _DATEINTRVL_ is no interval, designated by ".".

_STARTDT_

contains either the datetime timing value or the first datetime timing value to use in a do-list. The default for _STARTDT_ is no datetime, designated by a missing value.

_ENDDT_

contains the last datetime timing value to use in a do-list. The default for _ENDDT_ is no datetime, designated by a missing value.

_DTINTRVL_

contains the interval for the datetime do-list. The default for _DTINTRVL_ is no interval, designated by ".".

_STARTOBS_

contains either the observation number timing value or the first observation number timing value to use in a do-list. The default for _STARTOBS_ is no observation number, designated by a missing value.

_ENDOBS_

contains the last observation number timing value to use in a do-list. The default for _ENDOBS_ is no observation number, designated by a missing value.

_OBSINTRVL_

contains the interval length of the observation number do-list. The default for _OBSINTRVL_ is no interval, designated by ".".

_TYPE_

specifies the type of event. The valid values of _TYPE_ are POINT, LS, RAMP, TR, TEMPRAMP, TC, LIN, LINEAR, QUAD, CUBIC, INV, INVERSE, LOG, and LOGARITHMIC. The default for _TYPE_ is POINT.

_VALUE_

specifies the value for nonzero observations. The default for _VALUE_ is 1.0.

_PULSE_

specifies the interval that defines the units for the duration values. The default for _PULSE_ is no interval, designated by ".".

_DUR_BEFORE_

specifies the number of durations before the timing value. The default for _DUR_BEFORE_ is 0.

_DUR_AFTER_

specifies the number of durations after the timing value. The default for _DUR_AFTER_ is 0.

_SLOPE_BEF_

determines whether the curve is GROWTH or DECAY before the timing value for _TYPE_=RAMP, _TYPE_=TEMPRAMP, and _TYPE_=TC. Valid values are GROWTH and DECAY. The default for _SLOPE_BEF_ is GROWTH.

_SLOPE_AFT_

determines whether the curve is GROWTH or DECAY after the timing value for _TYPE_=RAMP, _TYPE_=TEMPRAMP, and _TYPE_=TC. Valid values are GROWTH and DECAY. The default for _SLOPE_AFT_ is GROWTH unless _TYPE_=TC; then the default is DECAY.

_SHIFT_

specifies the number of _PULSE_ intervals to shift the timing value. The shift can be positive (forward in time) or negative (backward in time). If _PULSE_ is not specified, then the shift is in observations. The default for _SHIFT_ is 0.

_TCPARM_

specifies the parameter for events of _TYPE_=TC. The default for _TCPARM_ is 0.5.

_RULE_

specifies the rule to use when combining events or when timing values of an event overlap. The valid values of _RULE_ are ADD, MAX, MIN, MINNZ, MINMAG, and MULT. The default for _RULE_ is ADD.

_PERIOD_

specifies the frequency interval at which the event should be repeated. If this value is missing, then the event is not periodic. The default for _PERIOD_ is no interval, designated by ".".

_LABEL_

specifies the label or description for the event. If a label is not specified, then the default label value is displayed as ".". For events that produce dummy variables, either the user-supplied label or the default label is used. For COMPLEX events, the _LABEL_ value is merely a description of the group of events.
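The defaults listed above can be collected in one place, as in the following Python sketch. It fills in an event definition record with the documented default values, including the rule that _SLOPE_AFT_ defaults to DECAY when _TYPE_=TC. The names MISSING, INEVENT_DEFAULTS, and with_defaults are hypothetical, None stands in for a SAS missing value, and the event name in the usage example is invented. The sketch applies defaults to a single record for simplicity, whereas the text above describes defaults for variables that are omitted from the data set.

```python
MISSING = None  # stands in for a SAS missing value

# Default values as listed in this section. _NAME_ is required and has no default.
INEVENT_DEFAULTS = {
    "_CLASS_": "SIMPLE",
    "_KEYNAME_": ".",
    "_STARTDATE_": MISSING,
    "_ENDDATE_": MISSING,
    "_DATEINTRVL_": ".",
    "_STARTDT_": MISSING,
    "_ENDDT_": MISSING,
    "_DTINTRVL_": ".",
    "_STARTOBS_": MISSING,
    "_ENDOBS_": MISSING,
    "_OBSINTRVL_": ".",
    "_TYPE_": "POINT",
    "_VALUE_": 1.0,
    "_PULSE_": ".",
    "_DUR_BEFORE_": 0,
    "_DUR_AFTER_": 0,
    "_SLOPE_BEF_": "GROWTH",
    "_SHIFT_": 0,
    "_TCPARM_": 0.5,
    "_RULE_": "ADD",
    "_PERIOD_": ".",
    "_LABEL_": ".",
}


def with_defaults(event):
    """Return a copy of an event record with omitted variables set to defaults."""
    if "_NAME_" not in event:
        raise ValueError("_NAME_ is required; there is no default")
    full = {**INEVENT_DEFAULTS, **event}
    if "_SLOPE_AFT_" not in event:
        # GROWTH unless _TYPE_=TC, in which case the default is DECAY.
        full["_SLOPE_AFT_"] = "DECAY" if full["_TYPE_"] == "TC" else "GROWTH"
    return full


# Hypothetical event: only the name and type are supplied.
promo = with_defaults({"_NAME_": "Promo", "_TYPE_": "TC"})
print(promo["_SLOPE_AFT_"], promo["_TCPARM_"], promo["_RULE_"])
```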

OUTSTAT= Data Set

The OUTSTAT= data set can contain the following variables:

BY variables

sorts the statistics into BY groups. BY variables are included in this data set that match the BY variables used to process the series.

NAME

specifies the variable name of the time series to which the statistics apply.

STAT

describes the statistic that is stored in VALUE or CVALUE. STAT takes on the following values:

Period

the period of the series, 4 or 12.

Mode

the mode of the seasonal adjustment from the X11 statement. Possible values are ADD, MULT, LOGADD, and PSEUDOADD.

Start

the beginning of the model span expressed as monyyyy for monthly series or yyyyQq for quarterly series.

End

the end of the model span expressed as monyyyy for monthly series or yyyyQq for quarterly series.

NbFcst

the number of forecast observations.

SigmaLimLower

the lower sigma limit in standard deviation units.

SigmaLimUpper

the upper sigma limit in standard deviation units.

pLBQ_24

the Ljung-Box Q statistic of the residuals at lag 24, for monthly series. Note that lag 12 (pLBQ_12) and lag 16 (pLBQ_16) are included in the data set for quarterly series.

D8Fs

the stable seasonality F test value from Table D8.

D8Fm

the moving seasonality F test value from Table D8.

ISRatio

the final irregular-to-seasonal ratio from Table F 2.H.

SMA_ALL

the final seasonal moving average filter for all periods.

RSF

the residual seasonality F test value for Table D11 for the entire series.

RSF3

the residual seasonality F test value for Table D11 for the last three years.

RSFA

the residual seasonality F test value for Table D11.A for the entire series.

RSF3A

the residual seasonality F test value for Table D11.A for the last three years.

RSFR

the residual seasonality F test value for Table D11.R for the entire series.

RSF3R

the residual seasonality F test value for Table D11.R for the last three years.

TMA

the Henderson trend moving average filter selected.

ICRatio

the final irregular-to-trend cycle ratio from Table F 2.H.

E5sd

the standard deviation from Table E5.

E6sd

the standard deviation from Table E6.

E6Asd

the standard deviation from Table E6.A.

MCD

months of cyclical dominance.

Q

the overall level Q from Table F3.

Q2

the overall level Q without M2 from Table F3.

FMT

indicates whether the format is numeric or character. FMT="NUM" if the value is numeric and stored in the VALUE variable. FMT="CHAR" if the value is a string and stored in the CVALUE variable.

VALUE

contains the numerical value of the statistic or missing if the statistic is of type character.

CVALUE

contains the character value of the text statistic or blank if the statistic is of type numeric.
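Because each OUTSTAT= observation stores its statistic in either VALUE or CVALUE, a program that reads the data set has to consult FMT first. The following Python sketch shows one way to do that, assuming the observations have already been loaded as dictionaries. The function name stat_value, the series name, and the statistic values are invented for illustration and are not output from PROC X12.

```python
def stat_value(row):
    """Return the statistic from VALUE or CVALUE, as indicated by FMT."""
    fmt = row["FMT"]
    if fmt == "NUM":
        return row["VALUE"]
    if fmt == "CHAR":
        return row["CVALUE"]
    raise ValueError(f"unexpected FMT value: {fmt!r}")


# Hypothetical OUTSTAT= observations for one series.
outstat_rows = [
    {"NAME": "sales", "STAT": "Period", "FMT": "NUM", "VALUE": 12, "CVALUE": ""},
    {"NAME": "sales", "STAT": "Mode", "FMT": "CHAR", "VALUE": None, "CVALUE": "MULT"},
]

# Index the statistics by series name and statistic name.
stats = {(r["NAME"], r["STAT"]): stat_value(r) for r in outstat_rows}
print(stats[("sales", "Mode")])
```

Keying the result on both NAME and STAT keeps statistics for different series separate. If BY variables are present, they would need to be part of the key as well.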