The X12 Procedure

Special Data Sets

The X12 procedure can read an MDLINFOIN= data set and write an MDLINFOOUT= data set. Both data sets have the same structure. The difference is that when the MDLINFOIN= data set is read, only information relevant to specifying a model is processed, whereas the MDLINFOOUT= data set contains the results of estimating a model. The X12 procedure can also read data sets that contain event definition data. The structure of these data sets is the same as in the SAS® High-Performance Forecasting system.
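The following sketch illustrates one way the two data sets might be used together: a first run estimates a model and saves it in an MDLINFOOUT= data set, and a second run reads that data set through MDLINFOIN=. The data set names WORK.SALES, WORK.MDL, and WORK.MDL2 and the series name SALES are assumptions made for this illustration.

```sas
/* Hypothetical monthly series derived from SASHELP.AIR */
data work.sales;
   set sashelp.air;
   sales = air;
run;

/* Run 1: specify and estimate a model, then save the model */
/* information and the estimation results                  */
proc x12 data=work.sales date=date mdlinfoout=work.mdl;
   var sales;
   transform function=log;
   arima model=((0,1,1)(0,1,1));
   estimate;
run;

/* Run 2: read the saved model specification instead of     */
/* repeating the TRANSFORM and ARIMA statements             */
proc x12 data=work.sales date=date mdlinfoin=work.mdl
         mdlinfoout=work.mdl2;
   var sales;
   estimate;
   x11;
run;
```

In the second run, only the model specification in WORK.MDL is processed; output-only columns such as the standard errors are not read.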

MDLINFOIN= and MDLINFOOUT= Data Sets

The MDLINFOIN= and MDLINFOOUT= data sets can contain the following variables:

BY variables: enable the model information to be specified by BY groups. The data set can include BY variables that match the BY variables used to process the series. If no BY variables are included, then the models specified by _NAME_ in the MDLINFOIN= data set apply to all BY groups in the DATA= data set.
_NAME_: should contain the variable name of the time series to which a particular model is to be applied. Omit the _NAME_ variable if you are specifying the same model for all series in a BY group.
_MODELTYPE_: specifies whether the observation contains regression or ARIMA information. The value of _MODELTYPE_ should be either REG to supply regression information or ARIMA to supply model information. If valid regression information exists in the MDLINFOIN= data set for a BY group and series being processed, then the REGRESSION, INPUT, and EVENT statements are ignored for that BY group and series. Likewise, if valid ARIMA model information exists in the data set, then the AUTOMDL, ARIMA, and TRANSFORM statements are ignored. Valid values for the other variables in the data set depend on the value of the _MODELTYPE_ variable. Although other values of _MODELTYPE_ might be permitted in other SAS procedures, PROC X12 recognizes only REG and ARIMA.
_MODELPART_: further qualifies the regression information in the observation. For _MODELTYPE_=REG, valid values of _MODELPART_ are INPUT, EVENT, and PREDEFINED. A value of INPUT indicates that this observation refers to the user-defined variable whose name is given in _DSVAR_. Likewise, a value of EVENT indicates that the observation refers to the SAS or user-defined event whose name is given in _DSVAR_. PREDEFINED indicates that the name given in _DSVAR_ is a predefined U.S. Census Bureau variable. If only ARIMA model information is included in the data set (that is, all observations have _MODELTYPE_=ARIMA), then the _MODELPART_ variable can be omitted. For observations where _MODELTYPE_=ARIMA, valid values for _MODELPART_ are FORECAST, “.”, or blank.
_COMPONENT_: further qualifies the regression or ARIMA information in the observation. For _MODELTYPE_=REG, the only valid value of _COMPONENT_ is SCALE. For _MODELTYPE_=ARIMA, the valid values of _COMPONENT_ are TRANSFORM, CONSTANT, NONSEASONAL, and SEASONAL. TRANSFORM indicates that the observation contains the information that would be supplied in the TRANSFORM statement. CONSTANT is specified to control the constant term in the model. NONSEASONAL and SEASONAL refer to the AR, MA, and differencing terms in the ARIMA model.
_PARMTYPE_: further qualifies the regression or ARIMA information in the observation. For _MODELTYPE_=REG, the value of _PARMTYPE_ is the same as the value of the USERTYPE= option in the REGRESSION statement. Since the USERTYPE= option applies only to user-defined events and variables, the value of _PARMTYPE_ does not alter processing in observations where _MODELPART_=PREDEFINED. However, it is consistent to use a value for _PARMTYPE_ that matches the U.S. Census Bureau predefined variable. For the constant term in the model information, _PARMTYPE_ should be SCALE. For transformation information, the value of _PARMTYPE_ should be NONE, LOG, LOGIT, SQRT, or BOXCOX. For _MODELTYPE_=ARIMA, valid values of _PARMTYPE_ are AR, MA, and DIF.
_DSVAR_: specifies the variable name associated with the current observation. For _MODELTYPE_=REG, the value of _DSVAR_ is the name of the user-defined variable, the event, or the U.S. Census Bureau predefined variable. For _MODELTYPE_=ARIMA, _DSVAR_ should match the name of the series being processed. If the ARIMA model information applies to more than one series, then _DSVAR_ can be blank or “.”, equivalently.
_VALUE_: contains a numerical value that is used as a parameter for certain types of information. For example, the PREDEFINED=EASTER(6) option in the REGRESSION statement is implemented in the MDLINFOIN= data set by using _DSVAR_=EASTER and _VALUE_=6. For a BOXCOX transformation, _VALUE_ is set equal to the $\lambda$ parameter value. For _COMPONENT_=SEASONAL, if _VALUE_ is nonmissing, then _VALUE_ is used as the seasonal period. If _VALUE_ is missing for _COMPONENT_=SEASONAL, then the seasonal period is determined by the interval of the series.
_FACTOR_: applies only to the AR and MA portions of the ARIMA model. The value of _FACTOR_ identifies the factor of the given AR or MA term. Therefore, the value of _FACTOR_ is the same for all observations that are related to the same factor.
_LAG_: identifies the degree for differencing and AR and MA lags. If _COMPONENT_=SEASONAL, then the value in _LAG_ is multiplied by the seasonal period indicated by the value of _VALUE_.
_SHIFT_: contains the shift value for transfer functions. This value is not processed by PROC X12, but it might be processed by other procedures in which transfer functions can be specified.
_NOEST_: indicates whether a parameter associated with the observation is to be estimated. For example, the NOINT option is indicated by _COMPONENT_=CONSTANT with _NOEST_=1 and _EST_=0. _NOEST_=1 indicates that the value in _EST_ is a fixed value. _NOEST_ pertains to the constant term, to AR and MA parameters, and to regression parameters.
_EST_: contains an initial or fixed value for a parameter associated with the observation that is to be estimated. _NOEST_=1 indicates the value in _EST_ is a fixed value. _EST_ pertains to the constant term, to AR and MA parameters, and to regression parameters.
_STDERR_: contains output information about estimated parameters. PROC X12 does not process the variable _STDERR_ when it reads the MDLINFOIN= data set. In the MDLINFOOUT= data set, _STDERR_ contains the standard error that pertains to the estimated parameter in the variable _EST_.
_TVALUE_: contains output information about estimated parameters. PROC X12 does not process the variable _TVALUE_ when it reads the MDLINFOIN= data set. In the MDLINFOOUT= data set, _TVALUE_ contains the t value that pertains to the estimated parameter in the variable _EST_.
_PVALUE_: contains output information about estimated parameters. PROC X12 does not process the variable _PVALUE_ when it reads the MDLINFOIN= data set. In the MDLINFOOUT= data set, _PVALUE_ contains the p-value that pertains to the estimated parameter in the variable _EST_.
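As an illustration of how the variables above fit together, the following sketch builds an MDLINFOIN= data set by hand. It is an assumption-laden example rather than a definitive layout: the series name SALES and the data set names are hypothetical, and the rows are meant to describe a log transformation, an (0,1,1)(0,1,1) ARIMA model, and the predefined EASTER(6) regressor mentioned in the _VALUE_ description.

```sas
/* Hypothetical monthly series derived from SASHELP.AIR */
data work.sales;
   set sashelp.air;
   sales = air;
run;

/* Hand-built model information; the row layout is an assumption */
data work.mdlin;
   length _NAME_ _MODELTYPE_ _MODELPART_ _COMPONENT_
          _PARMTYPE_ _DSVAR_ $32;
   input _NAME_ $ _MODELTYPE_ $ _MODELPART_ $ _COMPONENT_ $
         _PARMTYPE_ $ _DSVAR_ $
         _VALUE_ _FACTOR_ _LAG_ _NOEST_ _EST_;
   datalines;
sales REG   PREDEFINED SCALE       EASTER EASTER 6 . . 0 .
sales ARIMA FORECAST   TRANSFORM   LOG    sales  . . . . .
sales ARIMA FORECAST   NONSEASONAL DIF    sales  . . 1 . .
sales ARIMA FORECAST   SEASONAL    DIF    sales  . . 1 . .
sales ARIMA FORECAST   NONSEASONAL MA     sales  . 1 1 0 .
sales ARIMA FORECAST   SEASONAL    MA     sales  . 2 1 0 .
;
run;

/* The REGRESSION, TRANSFORM, and ARIMA statements are omitted */
/* because the model comes from the MDLINFOIN= data set        */
proc x12 data=work.sales date=date mdlinfoin=work.mdlin;
   var sales;
   estimate;
run;
```

Here _VALUE_ is left missing on the SEASONAL rows, so the seasonal period would be taken from the interval of the series, and _NOEST_=0 on the regression and MA rows requests that those parameters be estimated rather than fixed.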

INEVENT= Data Set

The INEVENT= data set can contain the following variables. When a variable is omitted from the data set, that variable is assumed to have the default value for all observations. The default values are specified in the list.

_NAME_: specifies the event variable name. _NAME_ is displayed with the case preserved. Since _NAME_ is a SAS variable name, the event can be referenced by using any case. The _NAME_ variable is required; there is no default.
_CLASS_: specifies the class of event: SIMPLE, COMBINATION, or PREDEFINED. The default for _CLASS_ is SIMPLE.
_KEYNAME_: contains either a date keyword (SIMPLE EVENT), a predefined event variable name (PREDEFINED EVENT), or an event name (COMBINATION EVENT). All _KEYNAME_ values are displayed in uppercase. However, if the _KEYNAME_ value refers to an event name, then the actual name can be of mixed case. The default for _KEYNAME_ is no keyname, designated by “.”.
_STARTDATE_: contains either the date timing value or the first date timing value to use in a do-list. The default for _STARTDATE_ is no date, designated by a missing value.
_ENDDATE_: contains the last date timing value to use in a do-list. The default for _ENDDATE_ is no date, designated by a missing value.
_DATEINTRVL_: contains the interval for the date do-list. The default for _DATEINTRVL_ is no interval, designated by “.”.
_STARTDT_: contains either the datetime timing value or the first datetime timing value to use in a do-list. The default for _STARTDT_ is no datetime, designated by a missing value.
_ENDDT_: contains the last datetime timing value to use in a do-list. The default for _ENDDT_ is no datetime, designated by a missing value.
_DTINTRVL_: contains the interval for the datetime do-list. The default for _DTINTRVL_ is no interval, designated by “.”.
_STARTOBS_: contains either the observation number timing value or the first observation number timing value to use in a do-list. The default for _STARTOBS_ is no observation number, designated by a missing value.
_ENDOBS_: contains the last observation number timing value to use in a do-list. The default for _ENDOBS_ is no observation number, designated by a missing value.
_OBSINTRVL_: contains the interval length of the observation number do-list. The default for _OBSINTRVL_ is no interval, designated by “.”.
_TYPE_: specifies the type of event. The valid values of _TYPE_ are POINT, LS, RAMP, TR, TEMPRAMP, TC, LIN, LINEAR, QUAD, CUBIC, INV, INVERSE, LOG, and LOGARITHMIC. The default for _TYPE_ is POINT.
_VALUE_: specifies the value for nonzero observation. The default for _VALUE_ is .
_PULSE_: specifies the interval that defines the units for the duration values. The default for _PULSE_ is no interval, designated by “.”.
_DUR_BEFORE_: specifies the number of durations before the timing value. The default for _DUR_BEFORE_ is 0.
_DUR_AFTER_: specifies the number of durations after the timing value. The default for _DUR_AFTER_ is 0.
_SLOPE_BEF_: determines whether the curve is GROWTH or DECAY before the timing value for _TYPE_=RAMP, _TYPE_=TEMPRAMP, and _TYPE_=TC. Valid values are GROWTH and DECAY. The default for _SLOPE_BEF_ is GROWTH.
_SLOPE_AFT_: determines whether the curve is GROWTH or DECAY after the timing value for _TYPE_=RAMP, _TYPE_=TEMPRAMP, and _TYPE_=TC. Valid values are GROWTH and DECAY. The default for _SLOPE_AFT_ is GROWTH unless _TYPE_=TC; then the default is DECAY.
_SHIFT_: specifies the number of _PULSE_= intervals to shift the timing value. The shift can be positive (forward in time) or negative (backward in time). If _PULSE_= is not specified, then the shift is in observations. The default for _SHIFT_ is .
_TCPARM_: specifies the parameter for EVENT of TYPE=TC. The default for _TCPARM_ is .
_RULE_: specifies the rule to use when combining events or when timing values of an event overlap. The valid values of _RULE_ are ADD, MAX, MIN, MINNZ, MINMAG, and MULT. The default for _RULE_ is ADD.
_PERIOD_: specifies the frequency interval at which the event should be repeated. If this value is missing, then the event is not periodic. The default for _PERIOD_ is no interval, designated by “.”.
_LABEL_: specifies the label or description for the event. If a label is not specified, then the default label value is displayed as “.”. For events that produce dummy variables, either the user-supplied label or the default label is used. For COMPLEX events, the _LABEL_ value is merely a description of the group of events.
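The following sketch shows how a small INEVENT= data set might be assembled in a DATA step and then referenced in PROC X12. The event name Shift1955, the date, and the data set names are hypothetical. Variables that are left out of the data set are assumed to take the defaults listed above.

```sas
/* Hypothetical monthly series derived from SASHELP.AIR */
data work.sales;
   set sashelp.air;
   sales = air;
run;

/* One SIMPLE event: a level shift that starts in July 1955 */
data work.events;
   length _NAME_ $32 _CLASS_ $12 _TYPE_ $12 _RULE_ $8;
   format _STARTDATE_ date9.;
   _NAME_      = 'Shift1955';    /* required; no default       */
   _CLASS_     = 'SIMPLE';
   _STARTDATE_ = '01JUL1955'd;   /* date timing value          */
   _TYPE_      = 'LS';           /* level shift                */
   _VALUE_     = 1;              /* value for nonzero obs      */
   _RULE_      = 'ADD';
   output;
run;

/* Reference the event by name in the EVENT statement */
proc x12 data=work.sales date=date inevent=work.events;
   var sales;
   event Shift1955;
   arima model=((0,1,1)(0,1,1));
   estimate;
run;
```

In practice an event data set of this form is often produced by another procedure rather than written by hand; the DATA step here is only meant to make the roles of the individual variables visible.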

OUTSTAT= Data Set

The OUTSTAT= data set can contain the following variables:

BY variables

sorts the statistics into BY groups. BY variables are included in this data set that match the BY variables used to process the series.

NAME

specifies the variable name of the time series to which the statistics apply.

STAT

describes the statistic that is stored in VALUE or CVALUE. STAT takes on the following values:

Period: the period of the series, 4 or 12.
Mode: the mode of the seasonal adjustment from the X11 statement. Possible values are ADD, MULT, LOGADD, and PSEUDOADD.
Start: the beginning of the model span expressed as monyyyy for monthly series or yyyyQq for quarterly series.
End: the end of the model span expressed as monyyyy for monthly series or yyyyQq for quarterly series.
NbFcst: the number of forecast observations.
SigmaLimLower: the lower sigma limit in standard deviation units.
SigmaLimUpper: the upper sigma limit in standard deviation units.
pLBQ_24: the Ljung-Box Q statistic of the residuals at lag 24, for monthly series. Note that lag 12 (pLBQ_12) and lag 16 (pLBQ_16) are included in the data set for quarterly series.
D8Fs: the stable seasonality F test value from Table D8.
D8Fm: the moving seasonality F test value from Table D8.
ISRatio: the final irregular-to-seasonal ratio from Table F 2.H.
SMA_ALL: the final seasonal moving average filter for all periods.
RSF: the residual seasonality F test value for Table D11 for the entire series.
RSF3: the residual seasonality F test value for Table D11 for the last three years.
RSFA: the residual seasonality F test value for Table D11.A for the entire series.
RSF3A: the residual seasonality F test value for Table D11.A for the last three years.
RSFR: the residual seasonality F test value for Table D11.R for the entire series.
RSF3R: the residual seasonality F test value for Table D11.R for the last three years.
TMA: the Henderson trend moving average filter selected.
ICRatio: the final irregular-to-trend cycle ratio from Table F 2.H.
E5sd: the standard deviation from Table E5.
E6sd: the standard deviation from Table E6.
E6Asd: the standard deviation from Table E6.A.
MCD: months of cyclical dominance.
Q: the overall level Q from Table F3.
Q2: Q overall level without M2 from Table F3.

FMT

indicates whether the format is numeric or character. FMT=“NUM” if the value is numeric and stored in the VALUE variable. FMT=“CHAR” if the value is a string and stored in the CVALUE variable.

VALUE

contains the numerical value of the statistic or missing if the statistic is of type character.

CVALUE

contains the character value of the text statistic or blank if the statistic is of type numeric.
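As a sketch of how the OUTSTAT= data set might be consumed, the following example writes the statistics for one series and then separates the numeric statistics from the character statistics by using FMT. The data set names and the series name SALES are assumptions, and the variable names are taken as listed above.

```sas
/* Hypothetical monthly series derived from SASHELP.AIR */
data work.sales;
   set sashelp.air;
   sales = air;
run;

/* Save summary statistics from the seasonal adjustment */
proc x12 data=work.sales date=date outstat=work.stats;
   var sales;
   transform function=log;
   arima model=((0,1,1)(0,1,1));
   estimate;
   x11;
run;

/* Numeric statistics are stored in VALUE */
data work.numstats;
   set work.stats;
   where FMT = 'NUM';
   keep NAME STAT VALUE;
run;

/* Character statistics (for example, Mode) are stored in CVALUE */
data work.charstats;
   set work.stats;
   where FMT = 'CHAR';
   keep NAME STAT CVALUE;
run;

proc print data=work.numstats noobs;
run;
```

Splitting on FMT avoids mixing missing VALUE entries with blank CVALUE entries when the statistics are reported or merged with other results.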