This section presents an introductory description of the important topics that are directly related to TIMSAC IML subroutines. The computational details, including algorithms, are described in the section Computational Details. A detailed explanation of each subroutine is not given; instead, basic ideas and common methodologies for all subroutines are described first and are followed by more technical details. Finally, missing values are discussed in the section Missing Values.
The AIC statistic is widely used to select the best model among alternative parametric models. The minimum AIC model selection
procedure can be interpreted as a maximization of the expected entropy (Akaike, 1981). The entropy of a true probability density function (PDF) with respect to the fitted PDF
is written as
![]() |
where is a Kullback-Leibler information measure, which is defined as
![]() |
where the random variable is assumed to be continuous. Therefore,
![]() |
where and E
denotes the expectation concerning the random variable
.
if and only if
(a.s.). The larger the quantity E
, the closer the function
is to the true PDF
. Given the data
that has the same distribution as the random variable
, let the likelihood function of the parameter vector
be
. Then the average of the log-likelihood function
is an estimate of the expected value of
. Akaike (1981) derived the alternative estimate of E
by using the Bayesian predictive likelihood. The AIC is the bias-corrected estimate of
, where
is the maximum likelihood estimate.
![]() |
Let be a
parameter vector that is contained in the parameter space
. Given the data
, the log-likelihood function is
![]() |
Suppose the probability density function has the true PDF
, where the true parameter vector
is contained in
. Let
be a maximum likelihood estimate. The maximum of the log-likelihood function is denoted as
. The expected log-likelihood function is defined by
![]() |
The Taylor series expansion of the expected log-likelihood function around the true parameter gives the following asymptotic relationship:
![]() |
where is the information matrix and
stands for asymptotic equality. Note that
since
is maximized at
. By substituting
, the expected log-likelihood function can be written as
![]() |
The maximum likelihood estimator is asymptotically normally distributed under the regularity conditions
![]() |
Therefore,
![]() |
The mean expected log-likelihood function, , becomes
![]() |
When the Taylor series expansion of the log-likelihood function around is used, the log-likelihood function
is written
![]() |
Since is the maximum log-likelihood function,
. Note that
if the maximum likelihood estimator
is a consistent estimator of
. Replacing
with the true parameter
and taking expectations with respect to the random variable
,
![]() |
Consider the following relationship:
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
From the previous derivation,
![]() |
Therefore,
![]() |
The natural estimator for E is
. Using this estimator, you can write the mean expected log-likelihood function as
![]() |
Consequently, the AIC is defined as an asymptotically unbiased estimator of
![]() |
In practice, the previous asymptotic result is expected to be valid in finite samples if the number of free parameters does
not exceed and the upper bound of the number of free parameters is
. It is worth noting that the amount of AIC is not meaningful in itself, since this value is not the Kullback-Leibler information
measure. The difference of AIC values can be used to select the model. The difference of the two AIC values is considered
insignificant if it is far less than 1. It is possible to find a better model when the minimum AIC model contains many free
parameters.
Consider the time series :
![]() |
where is an unknown smooth function and
is an
random variable with zero mean and positive variance
. Whittaker (1923) provides the solution, which balances a tradeoff between closeness to the data and the
th-order difference equation. For a fixed value of
and
, the solution
satisfies
![]() |
where denotes the
th-order difference operator. The value of
can be viewed as the smoothness tradeoff measure. Akaike (1980a) proposed the Bayesian posterior PDF to solve this problem.
![]() |
Therefore, the solution can be obtained when the function is maximized.
Assume that time series is decomposed as follows:
![]() |
where denotes the trend component and
is the seasonal component. The trend component follows the
th-order stochastically perturbed difference equation.
![]() |
For example, the polynomial trend component for is written as
![]() |
To accommodate regular seasonal effects, the stochastic seasonal relationship is used.
![]() |
where is the number of seasons within a period. In the context of Whittaker and Akaike, the smoothness priors problem can be solved
by the maximization of
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
The values of hyperparameters and
refer to a measure of uncertainty of prior information. For example, the large value of
implies a relatively smooth trend component. The ratio
can be considered as a signal-to-noise ratio.
Kitagawa and Gersch (1984) use the Kalman filter recursive computation for the likelihood of the tradeoff parameters. The hyperparameters are estimated by combining the grid search and optimization method. The state space model and Kalman filter recursive computation are discussed in the section State Space and Kalman Filter Method.
Seasonal phenomena are frequently observed in many economic and business time series. For example, consumption expenditure might have strong seasonal variations because of Christmas spending. The seasonal phenomena are repeatedly observed after a regular period of time. The number of seasons within a period is defined as the smallest time span for this repetitive observation. Monthly consumption expenditure shows a strong increase during the Christmas season, with 12 seasons per period.
There are three major approaches to seasonal time series: the regression model, the moving average model, and the seasonal ARIMA model.
Let the trend component be and the seasonal component be
. Then the additive time series can be written as the regression model
![]() |
In practice, the trend component can be written as the th-order polynomial, such as
![]() |
The seasonal component can be approximated by the seasonal dummies
![]() |
where is the number of seasons within a period. The least squares method is applied to estimate parameters
and
.
The seasonally adjusted series is obtained by subtracting the estimated seasonal component from the original series. Usually,
the error term is assumed to be white noise, while sometimes the autocorrelation of the regression residuals needs to be allowed. However,
the regression method is not robust to the regression function type, especially at the beginning and end of the series.
If you assume that the annual sum of a seasonal time series has small seasonal fluctuations, the nonseasonal component can be estimated by using the moving average method.
![]() |
where is the positive integer and
is the symmetric constant such that
and
.
When the data are not available, either an asymmetric moving average is used, or the forecast data are augmented to use the symmetric weight. The X-11 procedure is a complex modification of this moving-average method.
The regression and moving-average approaches assume that the seasonal component is deterministic and independent of other nonseasonal components. The time series approach is used to handle the stochastic trend and seasonal components.
The general ARIMA model can be written
![]() |
where is the backshift operator and
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
and if
; otherwise,
. The power of
,
, can be considered as a seasonal factor. Specifically, the Box-Jenkins multiplicative seasonal ARIMA
model is written as
![]() |
ARIMA modeling is appropriate for particular time series and requires burdensome computation.
The TSBAYSEA subroutine combines the simple characteristics of the regression approach and time series modeling. The TSBAYSEA and X-11 procedures use the model-based seasonal adjustment. The symmetric weights of the standard X-11 option can be approximated by using the integrated MA form
![]() |
With a fixed value , the TSBAYSEA subroutine is approximated as
![]() |
The subroutine is flexible enough to handle trading-day or leap-year effects, the shift of the base observation, and missing values. The TSBAYSEA-type modeling approach has some advantages: it clearly defines the statistical model of the time series; modification of the basic model can be an efficient method of choosing a particular procedure for the seasonal adjustment of a given time series; and the use of the concept of the likelihood provides a minimum AIC model selection approach.
The subroutines TSMLOCAR, TSMLOMAR, and TSTVCAR are used to analyze nonstationary time series models. The AIC statistic is extensively used to analyze the locally stationary model.
When the time series is nonstationary, the TSMLOCAR (univariate) and TSMLOMAR (multivariate) subroutines can be employed. The whole span of the series is divided into locally stationary blocks of data, and then the TSMLOCAR and TSMLOMAR subroutines estimate a stationary AR model by using the least squares method on this stationary block. The homogeneity of two different blocks of data is tested by using the AIC.
Given a set of data , the data can be divided into
blocks of sizes
, where
, and
and
are unknown. The locally stationary model is fitted to the data
![]() |
where
![]() |
where is a Gaussian white noise with
and
. Therefore, the log-likelihood function of the locally stationary series is
![]() |
Given ,
, the maximum of the log-likelihood function is attained at
![]() |
The concentrated log-likelihood function is given by
![]() |
Therefore, the maximum likelihood estimates, and
, are obtained by minimizing the following local SSE:
![]() |
The least squares estimation of the stationary model is explained in the section Least Squares and Householder Transformation.
The AIC for the locally stationary model over the pooled data is written as
![]() |
where intercept = 1 if the intercept term is estimated; otherwise, intercept = 0. The number of stationary blocks (
), the size of each block (
), and the order of the locally stationary model is determined by the AIC. Consider the autoregressive model fitted over the
block of data,
, and let this model
be an AR(
) process. When additional data,
, are available, a new model
, an AR(
) process, is fitted over this new data set, assuming that these data are independent of the previous data. Then AICs for
models
and
are defined as
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
The joint model AIC for and
is obtained by summation
![]() |
When the two data sets are pooled and estimated over the pooled data set, , the AIC of the pooled model is
![]() |
where is the pooled error variance and
is the order chosen to fit the pooled data set.
Decision
If , switch to the new model, since there is a change in the structure of the time series.
If , pool the two data sets, since two data sets are considered to be homogeneous.
If new observations are available, repeat the preceding steps to determine the homogeneity of the data. The basic idea of locally stationary AR modeling is that, if the structure of the time series is not changed, you should use the additional information to improve the model fitting, but you need to follow the new structure of the time series if there is any change.
Another approach to nonstationary time series, especially those that are nonstationary in the covariance, is time-varying AR coefficient modeling. When the time series is nonstationary in the covariance, the problem in modeling this series is related to an efficient parameterization. It is possible for a Bayesian approach to estimate the model with a large number of implicit parameters of the complex structure by using a relatively small number of hyperparameters.
The TSTVCAR subroutine uses smoothness priors by imposing stochastically perturbed difference equation constraints on each AR coefficient and frequency response function. The variance of each AR coefficient distribution constitutes a hyperparameter included in the state space model. The likelihood of these hyperparameters is computed by the Kalman filter recursive algorithm.
The time-varying AR coefficient model is written
![]() |
where time-varying coefficients are assumed to change gradually with time. The following simple stochastic difference equation constraint is imposed on each
coefficient:
![]() |
The frequency response function of the AR process is written
![]() |
The smoothness of this function can be measured by the th derivative smoothness constraint,
![]() |
Then the TSTVCAR call imposes zero and second derivative smoothness constraints. The time-varying AR coefficients are the solution of the following constrained least squares:
![]() |
where ,
, and
are hyperparameters of the prior distribution.
Using a state space representation, the model is
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
where
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
The computation of the likelihood function is straightforward. See the section State Space and Kalman Filter Method for the computation method.
The subroutines TSMULMAR, TSMLOMAR, and TSPRED analyze multivariate time series. The periodic AR model, TSPEARS, can also be estimated by using a vector AR procedure, since the periodic AR series can be represented as the covariance-stationary vector autoregressive model.
The stationary vector AR model is estimated and the order of the model (or each variable) is automatically determined by the minimum AIC procedure. The stationary vector AR model is written
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
Using the factorization method, the error covariance is decomposed as
![]() |
where is a unit lower triangular matrix and
is a diagonal matrix. Then the instantaneous response model is defined as
![]() |
where ,
for
, and
. Each equation of the instantaneous response model can be estimated independently, since its error covariance matrix has
a diagonal covariance matrix
. Maximum likelihood estimates are obtained through the least squares method when the disturbances are normally distributed
and the presample values are fixed.
The TSMULMAR subroutine estimates the instantaneous response model. The VAR coefficients are computed by using the relationship between the VAR and instantaneous models.
The general VARMA model can be transformed as an infinite-order MA process under certain conditions.
![]() |
In the context of the VAR() model, the coefficient
can be interpreted as the
-lagged response of a unit increase in the disturbances at time
.
![]() |
The lagged response on the one-unit increase in the orthogonalized disturbances is denoted
![]() |
where is the
th column of the unit triangular matrix
and
. When you estimate the VAR model by using the TSMULMAR call, it is easy to compute this impulse response function.
The MSE of the -step prediction is computed as
![]() |
Note that . Then the covariance matrix of
is decomposed
![]() |
where is the
th diagonal element of the matrix
and
is the number of variables. The MSE matrix can be written
![]() |
Therefore, the contribution of the th orthogonalized innovation to the MSE is
![]() |
The th forecast error variance decomposition is obtained from diagonal elements of the matrix
.
The nonstationary multivariate series can be analyzed by the TSMLOMAR subroutine. The estimation and model identification procedure is analogous to the univariate nonstationary procedure, which is explained in the section Nonstationary Time Series.
A time series is periodically correlated with period
if
and
. Let
be autoregressive of period
with AR orders
—that is,
![]() |
where is uncorrelated with mean zero and
,
,
, and
. Define the new variable such that
. The vector series,
, is autoregressive of order
, where
. The TSPEARS subroutine estimates the periodic autoregressive model by using minimum AIC vector AR modeling.
The TSPRED subroutine computes the one-step or multistep forecast of the multivariate ARMA model if the ARMA parameter estimates are provided. In addition, the subroutine TSPRED produces the (intermediate and permanent) impulse response function and performs forecast error variance decomposition for the vector AR model.
The autocovariance function of the random variable is defined as
![]() |
where . When the real valued process
is stationary and its autocovariance is absolutely summable, the population spectral density function is obtained by using
the Fourier transform of the autocovariance function
![]() |
where and
is the autocovariance function such that
.
Consider the autocovariance generating function
![]() |
where and
is a complex scalar. The spectral density function can be represented as
![]() |
The stationary ARMA() process is denoted
![]() |
where and
do not have common roots. Note that the autocovariance generating function of the linear process
is given by
![]() |
For the ARMA() process,
. Therefore, the spectral density function of the stationary ARMA(
) process becomes
![]() |
The spectral density function of a white noise is a constant.
![]() |
The spectral density function of the AR(1) process is given by
![]() |
The spectrum of the AR(1) process has its minimum at and its maximum at
if
, while the spectral density function attains its maximum at
and its minimum at
, if
. When the series is positively autocorrelated, its spectral density function is dominated by low frequencies. It is interesting
to observe that the spectrum approaches
as
. This relationship shows that the series is difference-stationary if its spectral density function has a remarkable peak
near 0.
The spectrum of AR(2) process equals
![]() |
Refer to Anderson (1971) for details of the characteristics of this spectral density function of the AR(2) process.
In practice, the population spectral density function cannot be computed. There are many ways of computing the sample spectral density function. The TSBAYSEA and TSMLOCAR subroutines compute the power spectrum by using AR coefficients and the white noise variance.
The power spectral density function of is derived by using the Fourier transformation of
.
![]() |
where and
denotes frequency. The autocovariance function can also be written as
![]() |
Consider the following stationary AR() process:
![]() |
where is a white noise with mean zero and constant variance
.
The autocovariance function of white noise equals
![]() |
where if
; otherwise,
. Therefore, the power spectral density of the white noise is
,
. Note that, with
,
![]() |
Using the following autocovariance function of ,
![]() |
the autocovariance function of the white noise is denoted as
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
On the other hand, another formula of the gives
![]() |
Therefore,
![]() |
Since , the rational spectrum of
is
![]() |
To compute the power spectrum, estimated values of white noise variance and AR coefficients
are used. The order of the AR process can be determined by using the minimum AIC procedure.
Consider the univariate AR() process
![]() |
Define the design matrix .
![]() |
Let . The least squares estimate,
, is the approximation to the maximum likelihood estimate of
if
is assumed to be Gaussian error disturbances. Combining
and
as
![]() |
the matrix can be decomposed as
![]() |
where is an orthogonal matrix and
is an upper triangular matrix,
, and
.
![]() |
The least squares estimate that uses Householder transformation is computed by solving the linear system
![]() |
The unbiased residual variance estimate is
![]() |
and
![]() |
In practice, least squares estimation does not require the orthogonal matrix . The TIMSAC subroutines compute the upper triangular matrix without computing the matrix
.
Consider the additive time series model
![]() |
Practically, it is not possible to estimate parameters , since the number of parameters exceeds the number of available observations. Let
denote the seasonal difference operator with
seasons and degree of
; that is,
. Suppose that
. Some constraints on the trend and seasonal components need to be imposed such that the sum of squares of
,
, and
is small. The constrained least squares estimates are obtained by minimizing
![]() |
Using matrix notation,
![]() |
where ,
, and
is the initial guess of
. The matrix
is a
control matrix in which structure varies according to the order of differencing in trend and season.
![]() |
where
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
The matrix
has the same structure as the matrix
, and
is the
identity matrix. The solution of the constrained least squares method is equivalent to that of maximizing the function
![]() |
Therefore, the PDF of the data is
![]() |
The prior PDF of the parameter vector is
![]() |
When the constant is known, the estimate
of
is the mean of the posterior distribution, where the posterior PDF of the parameter
is proportional to the function
. It is obvious that
is the minimizer of
, where
![]() |
![]() |
The value of is determined by the minimum ABIC procedure. The ABIC is defined as
![]() |
In this section, the mathematical formulas for state space modeling are introduced. The Kalman filter algorithms are derived from the state space model. As an example, the state space model of the TSDECOMP subroutine is formulated.
Define the following state space model:
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
where and
. If the observations,
, and the initial conditions,
and
, are available, the one-step predictor
of the state vector
and its mean square error (MSE) matrix
are written as
![]() |
![]() |
Using the current observation, the filtered value of and its variance
are updated.
![]() |
![]() |
where and
. The log-likelihood function is computed as
![]() |
where is the conditional variance of the one-step prediction error
.
Consider the additive time series decomposition
![]() |
where is a
regressor vector and
is a
time-varying coefficient vector. Each component has the following constraints:
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
where and
. The AR component
is assumed to be stationary. The trading-day component
represents the number of the
th day of the week in time
. If
, and
(monthly data),
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
The state vector is defined as
![]() |
The matrix is
![]() |
where
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
where
![]() |
![]() |
Finally, the matrix is time-varying,
![]() |
where
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
The TIMSAC subroutines skip any missing values at the beginning of the data set. When the univariate and multivariate AR models are estimated via least squares (TSMLOCAR, TSMLOMAR, TSUNIMAR, TSMULMAR, and TSPEARS), there are three options available; that is, MISSING=0, MISSING=1, or MISSING=2. When the MISSING=0 (default) option is specified, the first contiguous observations with no missing values are used. The MISSING=1 option specifies that only nonmissing observations should be used by ignoring the observations with missing values. If the MISSING=2 option is specified, the missing values are filled with the sample mean. The least squares estimator with the MISSING=2 option is biased in general.
The BAYSEA subroutine assumes the same prior distribution of the trend and seasonal components that correspond to the missing
observations. A modification is made to skip the components of the vector that correspond to the missing observations. The vector
is defined in the section Bayesian Constrained Least Squares. In addition, the TSBAYSEA subroutine considers outliers as missing values. The TSDECOMP and TSTVCAR subroutines skip the
Kalman filter updating equation when the current observation is missing.