Details :: SAS/IML(R) 12.1 User's Guide

Minimum AIC Procedure

The AIC statistic is widely used to select the best model among alternative parametric models. The minimum AIC model selection procedure can be interpreted as a maximization of the expected entropy (Akaike, 1981). The entropy of a true probability density function (PDF) $\varphi$ with respect to the fitted PDF is written as

$B(\varphi ,f) = -I(\varphi ,f)$

where $I(\varphi ,f)$ is a Kullback-Leibler information measure, which is defined as

$I(\varphi ,f) = \int \left[ \log \left[ \frac{\varphi (z)}{f(z)} \right] \right] \varphi (z) dz$

where the random variable is assumed to be continuous. Therefore,

$B(\varphi ,f) = \mr {E}_ Z \log f(Z) - \mr {E}_ Z \log \varphi (Z)$

where $B(\varphi ,f)\leq 0$ and E denotes the expectation concerning the random variable . $B(\varphi ,f)=0$ if and only if $\varphi =f$ (a.s.). The larger the quantity E $_ Z \log f(Z)$ , the closer the function is to the true PDF $\varphi$ . Given the data $\textbf{y} = (y_1,\ldots , y_ T)^{\prime }$ that has the same distribution as the random variable , let the likelihood function of the parameter vector $\theta$ be $\prod _{t=1}^ T f(y_ t|\theta )$ . Then the average of the log-likelihood function $\frac{1}{T}\sum _{t=1}^ T \log f(y_ t|\theta )$ is an estimate of the expected value of $\log f(Z)$ . Akaike (1981) derived the alternative estimate of E $_ Z \log f(Z)$ by using the Bayesian predictive likelihood. The AIC is the bias-corrected estimate of $-2T\mr {E}_ Z \log f(Z|\hat{\theta })$ , where $\hat{\theta }$ is the maximum likelihood estimate.

$\mr {AIC} = - 2(\mi {maximum log-likelihood}) + 2(\mi {number of free parameters})$

Let $\theta = (\theta _1,\ldots ,\theta _ K)^{\prime }$ be a $K \times 1$ parameter vector that is contained in the parameter space $\Theta _ K$ . Given the data $\textbf{y}$ , the log-likelihood function is

$\ell (\theta ) = \sum _{t=1}^ T \log f(y_ t|\theta )$

Suppose the probability density function $f(y|\theta )$ has the true PDF $\varphi (y) = f(y|\theta ^0)$ , where the true parameter vector $\theta ^0$ is contained in $\Theta _ K$ . Let $\hat{\theta }_ K$ be a maximum likelihood estimate. The maximum of the log-likelihood function is denoted as $\ell (\hat{\theta }_ K) = \max _{\theta \in \Theta _ K}\ell (\theta )$ . The expected log-likelihood function is defined by

$\ell ^*(\theta ) = T\mr {E}_ Z \log f(Z|\theta )$

The Taylor series expansion of the expected log-likelihood function around the true parameter $\theta ^0$ gives the following asymptotic relationship:

$\ell ^*(\theta ) \stackrel{A}{=} \ell ^*(\theta ^0) + T(\theta - \theta ^0)^{\prime }\mr {E}_ Z \frac{\partial \log f(Z|\theta ^0)}{\partial \theta } - \frac{T}{2}(\theta - \theta ^0)^{\prime } I(\theta ^0)(\theta - \theta ^0)$

where $I(\theta ^0)$ is the information matrix and $\stackrel{A}{=}$ stands for asymptotic equality. Note that $\frac{\partial \log f(z|\theta ^0)}{\partial \theta }=0$ since $\log f(z|\theta )$ is maximized at $\theta ^0$ . By substituting $\hat{\theta }_ K$ , the expected log-likelihood function can be written as

$\ell ^*(\hat{\theta }_ K) \stackrel{A}{=} \ell ^*(\theta ^0) - \frac{T}{2}(\hat{\theta }_ K - \theta ^0)^{\prime } I(\theta ^0)(\hat{\theta }_ K - \theta ^0)$

The maximum likelihood estimator is asymptotically normally distributed under the regularity conditions

$\sqrt {T}I(\theta ^0)^{1/2}(\hat{\theta }_ K - \theta ^0) \stackrel{d}{\rightarrow }N(0, I_ K)$

Therefore,

$T(\hat{\theta }_ K - \theta ^0)^{\prime }I(\theta ^0)(\hat{\theta }_ K - \theta ^0) \stackrel{a}{\sim } \chi _ K^2$

The mean expected log-likelihood function, $\ell ^*(K) = \mr {E}_ Y \ell ^*(\hat{\theta }_ K)$ , becomes

$\ell ^*(K) \stackrel{A}{=} \ell ^*(\theta ^0) - \frac{K}{2}$

When the Taylor series expansion of the log-likelihood function around $\hat{\theta }_ K$ is used, the log-likelihood function $\ell (\theta )$ is written

$\ell (\theta ) \stackrel{A}{=} \ell (\hat{\theta }_ K) + (\theta - \hat{\theta }_ K)^{\prime } \left. \frac{\partial \ell (\theta )}{\partial \theta } \right|_{\hat{\theta }_ K} + \frac{1}{2}(\theta - \hat{\theta }_ K)^{\prime } \left. \frac{\partial ^2 \ell (\theta )}{\partial \theta \partial \theta ^{\prime }} \right|_{\hat{\theta }_ K}(\theta - \hat{\theta }_ K)$

Since $\ell (\hat{\theta }_ K)$ is the maximum log-likelihood function, $\left. \frac{\partial \ell (\theta )}{\partial \theta } \right|_{\hat{\theta }_ K}=0$ . Note that $\mr {plim} \left[ -\frac{1}{T} \left. \frac{\partial ^2 \ell (\theta )}{\partial \theta \partial \theta ^{\prime }} \right|_{\hat{\theta }_ K} \right] = I(\theta ^0)$ if the maximum likelihood estimator $\hat{\theta }_ K$ is a consistent estimator of $\theta$ . Replacing $\theta$ with the true parameter $\theta ^0$ and taking expectations with respect to the random variable ,

$\mr {E}_ Y \ell (\theta ^0) \stackrel{A}{=} \mr {E}_ Y \ell (\hat{\theta }_ K) - \frac{K}{2}$

Consider the following relationship:

$\displaystyle \ell ^*(\theta ^0)$	$\displaystyle =$	$\displaystyle T\textrm{E}_ Z \log f(Z\|\theta ^0)$
$\displaystyle$	$\displaystyle =$	$\displaystyle \textrm{E}_ Y \sum _{t=1}^ T \log f(Y_ t\|\theta ^0)$
$\displaystyle$	$\displaystyle =$	$\displaystyle \textrm{E}_ Y \ell (\theta ^0)$

From the previous derivation,

$\ell ^*(K) \stackrel{A}{=} \ell ^*(\theta ^0) - \frac{K}{2}$

Therefore,

$\ell ^*(K) \stackrel{A}{=} \mr {E}_ Y \ell (\hat{\theta }_ K) - K$

The natural estimator for E $_ Y \ell (\hat{\theta }_ K)$ is $\ell (\hat{\theta }_ K)$ . Using this estimator, you can write the mean expected log-likelihood function as

$\ell ^*(K) \stackrel{A}{=} \ell (\hat{\theta }_ K) - K$

Consequently, the AIC is defined as an asymptotically unbiased estimator of $-2(\mi {mean expected log-likelihood})$

$\mr {AIC}(K) = -2\ell (\hat{\theta }_ K) + 2K$

In practice, the previous asymptotic result is expected to be valid in finite samples if the number of free parameters does not exceed $2\sqrt {T}$ and the upper bound of the number of free parameters is $\frac{T}{2}$ . It is worth noting that the amount of AIC is not meaningful in itself, since this value is not the Kullback-Leibler information measure. The difference of AIC values can be used to select the model. The difference of the two AIC values is considered insignificant if it is far less than 1. It is possible to find a better model when the minimum AIC model contains many free parameters.

Smoothness Priors Modeling

Consider the time series :

$y_ t = f(t) + \epsilon _ t$

where is an unknown smooth function and $\epsilon _ t$ is an random variable with zero mean and positive variance $\sigma ^2$ . Whittaker (1923) provides the solution, which balances a tradeoff between closeness to the data and the th-order difference equation. For a fixed value of $\lambda$ and , the solution $\hat{f}$ satisfies

$\min _ f \sum _{t=1}^ T \left\{ [y_ t - f(t)]^2 + \lambda ^2 [\nabla ^ k f(t)]^2 \right\}$

where $\nabla ^ k$ denotes the th-order difference operator. The value of $\lambda$ can be viewed as the smoothness tradeoff measure. Akaike (1980a) proposed the Bayesian posterior PDF to solve this problem.

$\ell (f) = \exp \left\{ -\frac{1}{2 \sigma ^2} \sum _{t=1}^ T [y_ t - f(t)]^2 \right\} \exp \left\{ -\frac{\lambda ^2}{2 \sigma ^2} \sum _{t=1}^ T [\nabla ^ k f(t)]^2 \right\}$

Therefore, the solution can be obtained when the function $\ell (f)$ is maximized.

Assume that time series is decomposed as follows:

$y_ t = T_ t + S_ t + \epsilon _ t$

where denotes the trend component and is the seasonal component. The trend component follows the th-order stochastically perturbed difference equation.

$\nabla ^ k T_ t = w_{1t}, w_{1t} \sim N(0,\tau _1^2)$

For example, the polynomial trend component for is written as

$T_ t = 2T_{t-1} - T_{t-2} + w_{1t}$

To accommodate regular seasonal effects, the stochastic seasonal relationship is used.

$\sum _{i=0}^{L-1} S_{t-i} = w_{2t} w_{2t} \sim N(0,\tau _2^2)$

where is the number of seasons within a period. In the context of Whittaker and Akaike, the smoothness priors problem can be solved by the maximization of

$\displaystyle \ell (f)$	$\displaystyle =$	$\displaystyle \exp \left[ -\frac{1}{2 \sigma ^2} \sum _{t=1}^ T (y_ t - T_ t - S_ t)^2 \right] \exp \left[ -\frac{\tau _1^2}{2 \sigma ^2} \sum _{t=1}^ T (\nabla ^ k T_ t)^2 \right]$
$\displaystyle$	$\displaystyle$	$\displaystyle \times \exp \left[ -\frac{\tau _2^2}{2 \sigma ^2} \sum _{t=1}^ T \left( \sum _{i=0}^{L-1} S_{t-i} \right)^2 \right]$

The values of hyperparameters $\tau _1^2$ and $\tau _2^2$ refer to a measure of uncertainty of prior information. For example, the large value of $\tau _1^2$ implies a relatively smooth trend component. The ratio $\frac{\tau _ i^2}{\sigma ^2}\; (i=1,2)$ can be considered as a signal-to-noise ratio.

Kitagawa and Gersch (1984) use the Kalman filter recursive computation for the likelihood of the tradeoff parameters. The hyperparameters are estimated by combining the grid search and optimization method. The state space model and Kalman filter recursive computation are discussed in the section State Space and Kalman Filter Method.

Bayesian Seasonal Adjustment

Seasonal phenomena are frequently observed in many economic and business time series. For example, consumption expenditure might have strong seasonal variations because of Christmas spending. The seasonal phenomena are repeatedly observed after a regular period of time. The number of seasons within a period is defined as the smallest time span for this repetitive observation. Monthly consumption expenditure shows a strong increase during the Christmas season, with 12 seasons per period.

There are three major approaches to seasonal time series: the regression model, the moving average model, and the seasonal ARIMA model.

Regression Model

Let the trend component be $T_ t = \sum _{i=1}^{m_\alpha } \alpha _ i U_{it}$ and the seasonal component be $S_ t = \sum _{j=1}^{m_\beta } \beta _ j V_{jt}$ . Then the additive time series can be written as the regression model

$y_ t = \sum _{i=1}^{m_{\alpha }} \alpha _ i U_{it} + \sum _{j=1}^{m_{\beta }} \beta _ j V_{jt} + \epsilon _ t$

In practice, the trend component can be written as the $m_{\alpha }$ th-order polynomial, such as

$T_ t = \sum _{i=0}^{m_{\alpha }} \alpha _ i t^ i$

The seasonal component can be approximated by the seasonal dummies $(D_{jt})$

$S_ t = \sum _{j=1}^{L-1} \beta _ j D_{jt}$

where is the number of seasons within a period. The least squares method is applied to estimate parameters $\alpha _ i$ and $\beta _ j$ .

The seasonally adjusted series is obtained by subtracting the estimated seasonal component from the original series. Usually, the error term $\epsilon _ t$ is assumed to be white noise, while sometimes the autocorrelation of the regression residuals needs to be allowed. However, the regression method is not robust to the regression function type, especially at the beginning and end of the series.

Moving Average Model

If you assume that the annual sum of a seasonal time series has small seasonal fluctuations, the nonseasonal component $N_ t = T_ t + \epsilon _ t$ can be estimated by using the moving average method.

$\hat{N}_ t = \sum _{i=-m}^ m \lambda _ i y_{t-i}$

where is the positive integer and $\lambda _ i$ is the symmetric constant such that $\lambda _ i = \lambda _{-i}$ and $\sum _{i=-m}^ m \lambda _ i = 1$ .

When the data are not available, either an asymmetric moving average is used, or the forecast data are augmented to use the symmetric weight. The X-11 procedure is a complex modification of this moving-average method.

Seasonal ARIMA Model

The regression and moving-average approaches assume that the seasonal component is deterministic and independent of other nonseasonal components. The time series approach is used to handle the stochastic trend and seasonal components.

The general ARIMA model can be written

$\prod _{j=1}^ m \phi _ j(B) \prod _{i=1}^ k (1-B^{s_ i})^{d_ i} \tilde{y}_ t = \theta _0 + \prod _{i=1}^ q \theta _ i(B)\epsilon _ t$

where is the backshift operator and

$\displaystyle \phi _ j(B)$	$\displaystyle =$	$\displaystyle 1 - \phi _1 B - \cdots - \phi _ j B^{p_ j}$
$\displaystyle \theta _ i(B)$	$\displaystyle =$	$\displaystyle 1 - \theta _1 B - \cdots - \theta _ i B^{q_ i}$

and $\tilde{y}_ t = y_ t - \mr {E}(Y_ t)$ if ; otherwise, $\tilde{y}_ t = y_ t$ . The power of , , can be considered as a seasonal factor. Specifically, the Box-Jenkins multiplicative seasonal ARIMA model is written as

$\phi _ p(B) \Phi _ P(B^ s)(1-B)^ d(1-B^ s)^ D \tilde{y}_ t = \theta _ q(B) \Theta _ Q(B^ s) \epsilon _ t$

ARIMA modeling is appropriate for particular time series and requires burdensome computation.

The TSBAYSEA subroutine combines the simple characteristics of the regression approach and time series modeling. The TSBAYSEA and X-11 procedures use the model-based seasonal adjustment. The symmetric weights of the standard X-11 option can be approximated by using the integrated MA form

$(1-B)(1-B^{12})y_ t = \theta (B)\epsilon _ t$

With a fixed value $\phi$ , the TSBAYSEA subroutine is approximated as

$(1-\phi B)(1-B)(1-B^{12})y_ t = \theta (B) \epsilon _ t$

The subroutine is flexible enough to handle trading-day or leap-year effects, the shift of the base observation, and missing values. The TSBAYSEA-type modeling approach has some advantages: it clearly defines the statistical model of the time series; modification of the basic model can be an efficient method of choosing a particular procedure for the seasonal adjustment of a given time series; and the use of the concept of the likelihood provides a minimum AIC model selection approach.

Nonstationary Time Series

The subroutines TSMLOCAR, TSMLOMAR, and TSTVCAR are used to analyze nonstationary time series models. The AIC statistic is extensively used to analyze the locally stationary model.

Locally Stationary AR Model

When the time series is nonstationary, the TSMLOCAR (univariate) and TSMLOMAR (multivariate) subroutines can be employed. The whole span of the series is divided into locally stationary blocks of data, and then the TSMLOCAR and TSMLOMAR subroutines estimate a stationary AR model by using the least squares method on this stationary block. The homogeneity of two different blocks of data is tested by using the AIC.

Given a set of data $\{ y_1,\ldots ,y_ T\}$ , the data can be divided into blocks of sizes $t_1,\ldots ,t_ k$ , where $t_1 + \cdots + t_ k = T$ , and and are unknown. The locally stationary model is fitted to the data

$y_ t = \alpha _0^ i + \sum _{j=1}^{p_ i} \alpha _ j^ i y_{t-j} + \epsilon _ t^ i$

where

$T_{i-1} = \sum _{j=1}^{i-1} t_ j < t \leq T_ i = \sum _{j=1}^ i t_ j \mbox{ for } i=1,\ldots ,k$

where $\epsilon _ t^ i$ is a Gaussian white noise with $\mr {E} \epsilon _ t^ i = 0$ and $\mr {E}(\epsilon _ t^ i)^2 = \sigma _ i^2$ . Therefore, the log-likelihood function of the locally stationary series is

$\ell = -\frac{1}{2}\sum _{i=1}^ k \left[ t_ i\log (2 \pi \sigma _ i^2) + \frac{1}{\sigma _ i^2} \sum _{t=T_{i-1}+1}^{T_ i} \left( y_ t - \alpha _0^ i - \sum _{j=1}^{p_ i} \alpha _ j^ i y_{t-j} \right)^2 \right]$

Given $\alpha _ j^ i$ , $j=0,\ldots ,p_ i$ , the maximum of the log-likelihood function is attained at

$\hat{\sigma }_ i^2 = \frac{1}{t_ i} \sum _{t=T_{i-1}+1}^{T_ i} \left( y_ t - \hat{\alpha }_0^ i - \sum _{j=1}^{p_ i} \hat{\alpha }_ j^ i y_{t-j} \right)^2$

The concentrated log-likelihood function is given by

$\ell ^* = - \frac{T}{2}[1 + \log (2 \pi )] - \frac{1}{2} \sum _{i=1}^ k t_ i \log (\hat{\sigma }_ i^2)$

Therefore, the maximum likelihood estimates, $\hat{\alpha }_ j^ i$ and $\hat{\sigma }_ i^2$ , are obtained by minimizing the following local SSE:

$\mr {SSE} = \sum _{t=T_{i-1}+1}^{T_ i} \left( y_ t - \hat{\alpha }_0^ i - \sum _{j=1}^{p_ i} \hat{\alpha }_ j^ i y_{t-j} \right)^2$

The least squares estimation of the stationary model is explained in the section Least Squares and Householder Transformation.

The AIC for the locally stationary model over the pooled data is written as

$\sum _{i=1}^ k t_ i \log (\hat{\sigma }_ i^2) + 2\sum _{i=1}^ k (p_ i + \mbox{\Emph{intercept}}+1)$

where intercept = 1 if the intercept term $(\alpha _0^ i)$ is estimated; otherwise, intercept = 0. The number of stationary blocks (), the size of each block (), and the order of the locally stationary model is determined by the AIC. Consider the autoregressive model fitted over the block of data, $\{ y_1,\ldots ,y_ T\}$ , and let this model be an AR() process. When additional data, $\{ y_{T+1},\ldots ,y_{T+T_1}\}$ , are available, a new model , an AR() process, is fitted over this new data set, assuming that these data are independent of the previous data. Then AICs for models and are defined as

$\displaystyle \textrm{AIC}_1$	$\displaystyle =$	$\displaystyle T \log (\sigma _1^2) + 2(p_1 + \mbox{\Emph{intercept}} + 1)$
$\displaystyle \textrm{AIC}_2$	$\displaystyle =$	$\displaystyle T_1 \log (\sigma _2^2) + 2(p_2 + \mbox{\Emph{intercept}} + 1)$

The joint model AIC for and is obtained by summation

$\mr {AIC}_ J = \mr {AIC}_1 + \mr {AIC}_2$

When the two data sets are pooled and estimated over the pooled data set, $\{ y_1,\ldots ,y_{T+T_1}\}$ , the AIC of the pooled model is

$\mr {AIC}_ A = (T + T_1) \log (\hat{\sigma }_ A^2) + 2(p_ A + \mbox{\Emph{intercept}} + 1)$

where $\sigma ^2_ A$ is the pooled error variance and is the order chosen to fit the pooled data set.

Decision

If $\mr {AIC}_ J < \mr {AIC}_ A$ , switch to the new model, since there is a change in the structure of the time series.
If $\mr {AIC}_ J\geq \mr {AIC}_ A$ , pool the two data sets, since two data sets are considered to be homogeneous.

If new observations are available, repeat the preceding steps to determine the homogeneity of the data. The basic idea of locally stationary AR modeling is that, if the structure of the time series is not changed, you should use the additional information to improve the model fitting, but you need to follow the new structure of the time series if there is any change.

Time-Varying AR Coefficient Model

Another approach to nonstationary time series, especially those that are nonstationary in the covariance, is time-varying AR coefficient modeling. When the time series is nonstationary in the covariance, the problem in modeling this series is related to an efficient parameterization. It is possible for a Bayesian approach to estimate the model with a large number of implicit parameters of the complex structure by using a relatively small number of hyperparameters.

The TSTVCAR subroutine uses smoothness priors by imposing stochastically perturbed difference equation constraints on each AR coefficient and frequency response function. The variance of each AR coefficient distribution constitutes a hyperparameter included in the state space model. The likelihood of these hyperparameters is computed by the Kalman filter recursive algorithm.

The time-varying AR coefficient model is written

$y_ t = \sum _{i=1}^ m \alpha _{it} y_{t-i} + \epsilon _ t$

where time-varying coefficients $\alpha _{it}$ are assumed to change gradually with time. The following simple stochastic difference equation constraint is imposed on each coefficient:

$\nabla ^ k \alpha _{it} = w_{it}, w_{it} \sim N(0,\tau ^2), i=1,\ldots ,m$

The frequency response function of the AR process is written

$A(f) = 1 - \sum _{j=1}^ m \alpha _{jt} \exp (-2 \pi jif)$

The smoothness of this function can be measured by the th derivative smoothness constraint,

$R_ k = \int _{-1/2}^{1/2} \left| \frac{d^ k A(f)}{df^ k} \right|^2 df = (2 \pi )^{2k} \sum _{j=1}^ m j^{2k} \alpha _{jt}^2$

Then the TSTVCAR call imposes zero and second derivative smoothness constraints. The time-varying AR coefficients are the solution of the following constrained least squares:

$\sum _{t=1}^ T \left( y_ t - \sum _{i=1}^ m \alpha _{it} y_{t-i} \right)^2 + \tau ^2 \sum _{t=1}^ T \sum _{i=1}^ m \left( \nabla ^ k \alpha _{it} \right)^2 + \lambda ^2 \sum _{t=1}^ T \sum _{i=1}^ m i^2 \alpha _{it}^2 + \nu ^2 \sum _{t=1}^ T \sum _{i=1}^ m \alpha _{it}^2$

where $\tau ^2$ , $\lambda ^2$ , and $\nu ^2$ are hyperparameters of the prior distribution.

Using a state space representation, the model is

$\displaystyle \textbf{x}_ t$	$\displaystyle =$	$\displaystyle \bF \textbf{x}_{t-1} + \bG \textbf{w}_ t$
$\displaystyle y_ t$	$\displaystyle =$	$\displaystyle \bH _ t\textbf{x}_ t + \epsilon _ t$

where

$\displaystyle \textbf{x}_ t$	$\displaystyle =$	$\displaystyle (\alpha _{1t},\ldots ,\alpha _{mt},\ldots , \alpha _{1,t-k+1},\ldots ,\alpha _{m,t-k+1})^{\prime }$
$\displaystyle \bH _ t$	$\displaystyle =$	$\displaystyle (y_{t-1},\ldots ,y_{t-m},\ldots ,0,\ldots ,0)$
$\displaystyle \textbf{w}_ t$	$\displaystyle =$	$\displaystyle (w_{1t},\ldots , w_{mt})^{\prime }$
$\displaystyle k$	$\displaystyle =$	$\displaystyle 1: \bF = \bI _ m \bG = \bI _ m$
$\displaystyle k$	$\displaystyle =$	$\displaystyle 2: \bF = \left[ \begin{array}{cc} 2 \bI _ m & -\bI _ m \\ \bI _ m & 0 \end{array} \right] \bG = \left[ \begin{array}{c} \bI _ m \\ 0 \end{array} \right]$
$\displaystyle k$	$\displaystyle =$	$\displaystyle 3: \bF = \left[ \begin{array}{ccc} 3 \bI _ m & -3\bI _ m & \bI _ m \\ \bI _ m & 0 & 0 \\ 0 & \bI _ m & 0 \end{array} \right] \bG = \left[ \begin{array}{c} \bI _ m \\ 0 \\ 0 \end{array} \right]$
$\displaystyle \left[ \begin{array}{c} \textbf{w}_ t \\ \epsilon _ t \end{array} \right]$	$\displaystyle \sim$	$\displaystyle N \left(\mb {0}, \left[ \begin{array}{cc} \tau ^2\bI & 0 \\ 0 & \sigma ^2 \end{array} \right] \right)$

The computation of the likelihood function is straightforward. See the section State Space and Kalman Filter Method for the computation method.

Multivariate Time Series Analysis

The subroutines TSMULMAR, TSMLOMAR, and TSPRED analyze multivariate time series. The periodic AR model, TSPEARS, can also be estimated by using a vector AR procedure, since the periodic AR series can be represented as the covariance-stationary vector autoregressive model.

The stationary vector AR model is estimated and the order of the model (or each variable) is automatically determined by the minimum AIC procedure. The stationary vector AR model is written

$\displaystyle \textbf{y}_ t$	$\displaystyle =$	$\displaystyle \bA _0 + \bA _1 \textbf{y}_{t-1} + \cdots + \bA _ p \textbf{y}_{t-p} + \epsilon _ t$
$\displaystyle \epsilon _ t$	$\displaystyle \sim$	$\displaystyle N(\mb {0}, \Sigma )$

Using the $\bL \bD \bL ^{\prime }$ factorization method, the error covariance is decomposed as

$\Sigma = \bL \bD \bL ^{\prime }$

where $\bL$ is a unit lower triangular matrix and $\bD$ is a diagonal matrix. Then the instantaneous response model is defined as

$\bC \textbf{y}_ t = \bA _0^* + \bA _1^*\textbf{y}_{t-1} + \cdots + \bA _ p^*\textbf{y}_{t-p} + \epsilon _ t^*$

where $\bC = \bL ^{-1}$ , $\bA _ i^* = \bL ^{-1}\bA _ i$ for $i=0,1,\ldots ,p$ , and $\epsilon _ t^* = \bL ^{-1}\epsilon _ t$ . Each equation of the instantaneous response model can be estimated independently, since its error covariance matrix has a diagonal covariance matrix $\bD$ . Maximum likelihood estimates are obtained through the least squares method when the disturbances are normally distributed and the presample values are fixed.

The TSMULMAR subroutine estimates the instantaneous response model. The VAR coefficients are computed by using the relationship between the VAR and instantaneous models.

The general VARMA model can be transformed as an infinite-order MA process under certain conditions.

$\textbf{y}_ t = \mu + \epsilon _ t + \sum _{m=1}^{\infty } \Psi _ m \epsilon _{t-m}$

In the context of the VAR() model, the coefficient $\Psi _ m$ can be interpreted as the -lagged response of a unit increase in the disturbances at time .

$\Psi _ m = \frac{\partial \textbf{y}_{t+m}}{\partial \epsilon ^{\prime }_ t}$

The lagged response on the one-unit increase in the orthogonalized disturbances $\epsilon _ t^*$ is denoted

$\frac{\partial \textbf{y}_{t+m}}{\partial \epsilon _{jt}^*} = \frac{\partial \mr {E}(\textbf{y}_{t+m}|y_{jt},y_{j-1,t},\ldots ,\bX _ t)}{\partial y_{jt}} = \Psi _ m \bL _ j$

where $\bL _ j$ is the th column of the unit triangular matrix $\bL$ and $\bX _ t=[\textbf{y}_{t-1},\ldots ,\textbf{y}_{t-p}]$ . When you estimate the VAR model by using the TSMULMAR call, it is easy to compute this impulse response function.

The MSE of the -step prediction is computed as

$\mr {E}(\textbf{y}_{t+m}-\textbf{y}_{t+m|t})(\textbf{y}_{t+m}-\textbf{y}_{t+m|t})^{\prime } = \Sigma + \Psi _1 \Sigma \Psi ^{\prime }_1 + \cdots + \Psi _{m-1} \Sigma \Psi ^{\prime }_{m-1}$

Note that $\epsilon _ t = \bL \epsilon _ t^*$ . Then the covariance matrix of $\epsilon _ t$ is decomposed

$\Sigma = \sum _{i=1}^ n \bL _ i \bL ^{\prime }_ i d_{ii}$

where $d_{ii}$ is the th diagonal element of the matrix $\bD$ and is the number of variables. The MSE matrix can be written

$\sum _{i=1}^ n d_{ii} \left[ \bL _ i \bL ^{\prime }_ i + \Psi _1 \bL _ i \bL ^{\prime }_ i \Psi ^{\prime }_1 + \cdots + \Psi _{m-1} \bL _ i \bL ^{\prime }_ i \Psi ^{\prime }_{m-1} \right]$

Therefore, the contribution of the th orthogonalized innovation to the MSE is

$\bV _ i = d_{ii} \left[ \bL _ i \bL ^{\prime }_ i + \Psi _1 \bL _ i \bL ^{\prime }_ i \Psi ^{\prime }_1 + \cdots + \Psi _{m-1} \bL _ i \bL ^{\prime }_ i \Psi ^{\prime }_{m-1} \right]$

The th forecast error variance decomposition is obtained from diagonal elements of the matrix $\bV _ i$ .

The nonstationary multivariate series can be analyzed by the TSMLOMAR subroutine. The estimation and model identification procedure is analogous to the univariate nonstationary procedure, which is explained in the section Nonstationary Time Series.

A time series is periodically correlated with period if $\mr {E}y_ t = \mr {E}y_{t+d}$ and $\mr {E}y_ s y_ t = \mr {E}y_{s+d}y_{t+d}$ . Let be autoregressive of period with AR orders $(p_1,\ldots ,p_ d)$ —that is,

$y_ t = \sum _{j=1}^{p_ t} \alpha _{jt}y_{t-j} + \epsilon _ t$

where $\epsilon _ t$ is uncorrelated with mean zero and $\mr {E}\epsilon _ t^2 = \sigma _ t^2$ , $p_ t = p_{t+d}$ , $\sigma _ t^2 = \sigma _{t+d}^2$ , and $\alpha _{jt} = \alpha _{j,t+d} (j=1,\ldots ,p_ t)$ . Define the new variable such that $x_{jt} = y_{j+d(t-1)}$ . The vector series, $\textbf{x}_ t = (x_{1t},\ldots ,x_{dt})^{\prime }$ , is autoregressive of order , where $p = \max _ j\mr {int}((p_ j - j)/d) + 1$ . The TSPEARS subroutine estimates the periodic autoregressive model by using minimum AIC vector AR modeling.

The TSPRED subroutine computes the one-step or multistep forecast of the multivariate ARMA model if the ARMA parameter estimates are provided. In addition, the subroutine TSPRED produces the (intermediate and permanent) impulse response function and performs forecast error variance decomposition for the vector AR model.

Spectral Analysis

The autocovariance function of the random variable is defined as

$C_{YY}(k) = \mr {E}(Y_{t+k} Y_ t)$

where $\mr {E}Y_ t = 0$ . When the real valued process is stationary and its autocovariance is absolutely summable, the population spectral density function is obtained by using the Fourier transform of the autocovariance function

$f(g) = \frac{1}{2 \pi } \sum _{k = -\infty }^{\infty } C_{YY}(k) \exp (-igk) -\pi \leq g \leq \pi$

where $i=\sqrt {-1}$ and $C_{YY}(k)$ is the autocovariance function such that $\sum _{k = -\infty }^{\infty } |C_{YY}(k)| < \infty$ .

Consider the autocovariance generating function

$\gamma (z) = \sum _{k = -\infty }^{\infty } C_{YY}(k) z^ k$

where $C_{YY}(k) = C_{YY}(-k)$ and is a complex scalar. The spectral density function can be represented as

$f(g) = \frac{1}{2 \pi } \gamma (\exp (-ig))$

The stationary ARMA() process is denoted

$\phi (B) y_ t = \theta (B) \epsilon _ t \epsilon _ t \sim (0,\sigma ^2)$

where $\phi (B)$ and $\theta (B)$ do not have common roots. Note that the autocovariance generating function of the linear process $y_ t = \psi (B)\epsilon _ t$ is given by

$\gamma (B) = \sigma ^2\psi (B)\psi (B^{-1})$

For the ARMA() process, $\psi (B) = \frac{\theta (B)}{\phi (B)}$ . Therefore, the spectral density function of the stationary ARMA() process becomes

$f(g) = \frac{\sigma ^2}{2\pi }\left|\frac{\theta (\exp (-ig)) \theta (\exp (ig))}{\phi (\exp (-ig))\phi (\exp (ig))}\right|^2$

The spectral density function of a white noise is a constant.

$f(g) = \frac{\sigma ^2}{2\pi }$

The spectral density function of the AR(1) process $(\phi (B) = 1-\phi _1 B)$ is given by

$f(g) = \frac{\sigma ^2}{2\pi (1-\phi _1 \cos (g) + \phi _1^2)}$

The spectrum of the AR(1) process has its minimum at and its maximum at $g=\pm \pi$ if $\phi _1 < 0$ , while the spectral density function attains its maximum at and its minimum at $g=\pm \pi$ , if $\phi _1>0$ . When the series is positively autocorrelated, its spectral density function is dominated by low frequencies. It is interesting to observe that the spectrum approaches $\frac{\sigma ^2}{4\pi }\frac{1}{1-\cos (g)}$ as $\phi _1\rightarrow 1$ . This relationship shows that the series is difference-stationary if its spectral density function has a remarkable peak near 0.

The spectrum of AR(2) process $(\phi (B)=1-\phi _1 B-\phi _2 B^2)$ equals

$f(g) = \frac{\sigma ^2}{2\pi }\frac{1}{\left\{ -4\phi _2\left[\cos (g)+\frac{\phi _1(1-\phi _2)}{4\phi _2}\right]^2 + \frac{(1+\phi _2)^2(4\phi _2 + \phi _1^2)}{4\phi _2}\right\} }$

Refer to Anderson (1971) for details of the characteristics of this spectral density function of the AR(2) process.

In practice, the population spectral density function cannot be computed. There are many ways of computing the sample spectral density function. The TSBAYSEA and TSMLOCAR subroutines compute the power spectrum by using AR coefficients and the white noise variance.

The power spectral density function of is derived by using the Fourier transformation of $C_{YY}(k)$ .

$f_{YY}(g) = \sum _{k=-\infty }^\infty \exp (-2\pi igk)C_{YY}(k), -\frac{1}{2}\leq g \leq \frac{1}{2}$

where $i=\sqrt {-1}$ and denotes frequency. The autocovariance function can also be written as

$C_{YY}(k) = \int _{-1/2}^{1/2} \exp (2\pi igk)f_{YY}(g)dg$

Consider the following stationary AR() process:

$y_ t - \sum _{i=1}^ p \phi _ i y_{t-i} = \epsilon _ t$

where $\epsilon _ t$ is a white noise with mean zero and constant variance $\sigma ^2$ .

The autocovariance function of white noise $\epsilon _ t$ equals

$C_{\epsilon \epsilon }(k) = \delta _{k0}\sigma ^2$

where $\delta _{k0}=1$ if ; otherwise, $\delta _{k0}=0$ . Therefore, the power spectral density of the white noise is $f_{\epsilon \epsilon }(g) = \sigma ^2$ , $-\frac{1}{2}\leq g \leq \frac{1}{2}$ . Note that, with $\phi _0 = -1$ ,

$C_{\epsilon \epsilon }(k) = \sum _{m=0}^ p \sum _{n=0}^ p \phi _ m\phi _ n C_{YY}(k-m+n)$

Using the following autocovariance function of ,

$C_{YY}(k) = \int _{-1/2}^{1/2} \exp (2\pi igk)f_{YY}(g)dg$

the autocovariance function of the white noise is denoted as

$\displaystyle C_{\epsilon \epsilon }(k)$	$\displaystyle =$	$\displaystyle \sum _{m=0}^ p \sum _{n=0}^ p \phi _ m\phi _ n \int _{-1/2}^{1/2} \exp (2\pi ig(k-m+n))f_{YY}(g)dg$
$\displaystyle$	$\displaystyle =$	$\displaystyle \int _{-1/2}^{1/2} \exp (2\pi igk) \, \left\|1-\sum _{m=1}^ p \phi _ m\exp (-2\pi igm)\right\|^2 f_{YY}(g)dg$

On the other hand, another formula of the $C_{\epsilon \epsilon }(k)$ gives

$C_{\epsilon \epsilon }(k) = \int _{-1/2}^{1/2} \exp (2\pi igk) f_{\epsilon \epsilon }(g)dg$

Therefore,

$f_{\epsilon \epsilon }(g) = \left|1-\sum _{m=1}^ p \phi _ m\exp (-2\pi igm) \right|^2 f_{YY}(g)$

Since $f_{\epsilon \epsilon }(g) = \sigma ^2$ , the rational spectrum of is

$f_{YY}(g) = \frac{\sigma ^2}{\left|1-\sum _{m=1}^ p \phi _ m\exp (-2\pi igm)\right|^2}$

To compute the power spectrum, estimated values of white noise variance $\hat{\sigma }^2$ and AR coefficients $\hat{\phi }_ m$ are used. The order of the AR process can be determined by using the minimum AIC procedure.

Computational Details

Least Squares and Householder Transformation

Consider the univariate AR() process

$y_ t = \alpha _0 + \sum _{i=1}^ p \alpha _ i y_{t-i} + \epsilon _ t$

Define the design matrix $\bX$ .

$\bX = \left[\begin{array}{cccc} 1 & y_ p & \cdots & y_1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & y_{T-1} & \cdots & y_{T-p} \end{array}\right]$

Let $\textbf{y} = (y_{p+1},\ldots ,y_ n)^{\prime }$ . The least squares estimate, $\hat{\textbf{a}}=(\bX ^{\prime }\bX )^{-1}\bX ^{\prime }\textbf{y}$ , is the approximation to the maximum likelihood estimate of $\textbf{a}=(\alpha _0,\alpha _1,\ldots ,\alpha _ p)$ if $\epsilon _ t$ is assumed to be Gaussian error disturbances. Combining $\bX$ and $\textbf{y}$ as

$\bZ = [\bX \, \vdots \, \textbf{y}]$

the $\bZ$ matrix can be decomposed as

$\bZ = \bQ \bU = \bQ \left[\begin{array}{cc} \bR & \textbf{w}_1 \\ \mb {0} & \textbf{w}_2 \end{array}\right]$

where $\bQ$ is an orthogonal matrix and $\bR$ is an upper triangular matrix, $\textbf{w}_1 = (w_1,\ldots ,w_{p+1})^{\prime }$ , and $\textbf{w}_2 = (w_{p+2},0,\ldots ,0)^{\prime }$ .

$\bQ ^{\prime }\textbf{y} = \left[\begin{array}{c} w_1 \\ w_2 \\ \vdots \\ w_{T-p} \end{array}\right]$

The least squares estimate that uses Householder transformation is computed by solving the linear system

$\bR \textbf{a} = \textbf{w}_1$

The unbiased residual variance estimate is

$\hat{\sigma }^2 = \frac{1}{T-p} \sum _{i=p+2}^{T-p} w_ i^2 = \frac{w_{p+2}^2}{T-p}$

and

$\mr {AIC}=(T-p)\log (\hat{\sigma }^2) + 2(p+1)$

In practice, least squares estimation does not require the orthogonal matrix $\bQ$ . The TIMSAC subroutines compute the upper triangular matrix without computing the matrix $\bQ$ .

Bayesian Constrained Least Squares

Consider the additive time series model

$y_ t = T_ t + S_ t + \epsilon _ t,\epsilon _ t\sim N(0,\sigma ^2)$

Practically, it is not possible to estimate parameters $\textbf{a} = (T_1,\ldots ,T_ T,S_1,\ldots ,S_ T)^{\prime }$ , since the number of parameters exceeds the number of available observations. Let $\nabla _ L^ m$ denote the seasonal difference operator with seasons and degree of ; that is, $\nabla _ L^ m = (1-B^ L)^ m$ . Suppose that . Some constraints on the trend and seasonal components need to be imposed such that the sum of squares of $\nabla ^ k T_ t$ , $\nabla _ L^ m S_ t$ , and $(\sum _{i=0}^{L-1} S_{t-i})$ is small. The constrained least squares estimates are obtained by minimizing

$\sum _{t=1}^ T \left\{ (y_ t-T_ t-S_ t)^2 + d^2\left[s^2(\nabla ^ k T_ t)^2 + (\nabla _ L^ m S_ t)^2 + z^2(S_ t+\cdots +S_{t-L+1})^2\right]\right\}$

Using matrix notation,

$(\textbf{y}-\bM \textbf{a})^{\prime }(\textbf{y}-\bM \textbf{a}) + (\textbf{a}-\textbf{a}_0)^{\prime }\bD ^{\prime }\bD (\textbf{a}-\textbf{a}_0)$

where $\bM = [\bI _ T\, \vdots \, \bI _ T]$ , $\textbf{y}=(y_1,\ldots ,y_ T)^{\prime }$ , and $\textbf{a}_0$ is the initial guess of $\textbf{a}$ . The matrix $\bD$ is a $3T\times 2T$ control matrix in which structure varies according to the order of differencing in trend and season.

$\bD = d \left[\begin{array}{cc} \bE _ m & \mb {0} \\ z\bF & \mb {0} \\ \mb {0} & s\bG _ k \end{array}\right]$

where

$\displaystyle \bE _ m$	$\displaystyle =$	$\displaystyle \bC _ m\otimes \bI _ L, m=1,2,3$
$\displaystyle \bF$	$\displaystyle =$	$\displaystyle \left[\begin{array}{cccc} 1 & 0 & \cdots & 0 \\ 1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 1 & \cdots & 1 & 1 \end{array}\right]_{T\times T}$
$\displaystyle \bG _1$	$\displaystyle =$	$\displaystyle \left[\begin{array}{rrrrr} 1 & 0 & 0 & \cdots & 0 \\ -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{array}\right]_{T\times T}$
$\displaystyle \bG _2$	$\displaystyle =$	$\displaystyle \left[\begin{array}{rrrrrr} 1 & 0 & 0 & 0 & \cdots & 0 \\ -2 & 1 & 0 & 0 & \cdots & 0 \\ 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 & -2 & 1 \end{array}\right]_{T\times T}$
$\displaystyle \bG _3$	$\displaystyle =$	$\displaystyle \left[\begin{array}{rrrrrrr} 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\ -3 & 1 & 0 & 0 & 0 & \cdots & 0 \\ 3 & -3 & 1 & 0 & 0 & \cdots & 0 \\ -1 & 3 & -3 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 3 & -3 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 3 & -3 & 1 \end{array}\right]_{T\times T}$

The $n\times n$ matrix $\bC _ m$ has the same structure as the matrix $\bG _ m$ , and $\bI _ L$ is the $L\times L$ identity matrix. The solution of the constrained least squares method is equivalent to that of maximizing the function

$L(\textbf{a}) = \exp \left\{ -\frac{1}{2\sigma ^2}(\textbf{y}-\bM \textbf{a})^{\prime }(\textbf{y}-\bM \textbf{a})\right\} \exp \left\{ -\frac{1}{2\sigma ^2}(\textbf{a}-\textbf{a}_0)^{\prime }\bD ^{\prime }\bD (\textbf{a}-\textbf{a}_0)\right\}$

Therefore, the PDF of the data $\textbf{y}$ is

$f(\textbf{y}|\sigma ^2,\textbf{a}) = \left(\frac{1}{2\pi }\right)^{T/2} \left(\frac{1}{\sigma }\right)^ T \exp \left\{ -\frac{1}{2\sigma ^2}(\textbf{y}-\bM \textbf{a})^{\prime }(\textbf{y}-\bM \textbf{a})\right\}$

The prior PDF of the parameter vector $\textbf{a}$ is

$\pi (\textbf{a}|\bD ,\sigma ^2,\textbf{a}_0) = \left(\frac{1}{2\pi }\right)^ T \left(\frac{1}{\sigma }\right)^{2T}|\bD ^{\prime }\bD | \exp \left\{ -\frac{1}{2\sigma ^2}(\textbf{a}-\textbf{a}_0)^{\prime }\bD ^{\prime }\bD (\textbf{a}-\textbf{a}_0)\right\}$

When the constant is known, the estimate $\hat{\textbf{a}}$ of $\textbf{a}$ is the mean of the posterior distribution, where the posterior PDF of the parameter $\textbf{a}$ is proportional to the function $L(\textbf{a})$ . It is obvious that $\hat{\textbf{a}}$ is the minimizer of $\| \textbf{g}(\textbf{a}|d)\| ^2 = (\tilde{\textbf{y}}-\tilde{\bD }\textbf{a})^{\prime }(\tilde{\textbf{y}}-\tilde{\bD }\textbf{a})$ , where

$\tilde{\textbf{y}} = \left[\begin{array}{c} \textbf{y} \\ \bD \textbf{a}_0 \end{array}\right]$

$\tilde{\bD } = \left[\begin{array}{c} \bM \\ \bD \end{array}\right]$

The value of is determined by the minimum ABIC procedure. The ABIC is defined as

$\mr {ABIC} = T\log \left[\frac{1}{T}\| \textbf{g}(\textbf{a}|d)\| ^2\right] + 2\{ \log [\det (\bD ^{\prime }\bD +\bM ^{\prime }\bM )] - \log [\det (\bD ^{\prime }\bD )]\}$

State Space and Kalman Filter Method

In this section, the mathematical formulas for state space modeling are introduced. The Kalman filter algorithms are derived from the state space model. As an example, the state space model of the TSDECOMP subroutine is formulated.

Define the following state space model:

$\displaystyle \textbf{x}_ t$	$\displaystyle =$	$\displaystyle \bF \textbf{x}_{t-1} + \bG \textbf{w}_ t$
$\displaystyle y_ t$	$\displaystyle =$	$\displaystyle \bH _ t\textbf{x}_ t + \epsilon _ t$

where $\epsilon _ t\sim N(0,\sigma ^2)$ and $\textbf{w}_ t\sim N(\mb {0},\bQ )$ . If the observations, $(y_1,\ldots ,y_ T)$ , and the initial conditions, $\textbf{x}_{0|0}$ and $\bP _{0|0}$ , are available, the one-step predictor $(\textbf{x}_{t|t-1})$ of the state vector $\textbf{x}_ t$ and its mean square error (MSE) matrix $\bP _{t|t-1}$ are written as

$\textbf{x}_{t|t-1} = \bF \textbf{x}_{t-1|t-1}$

$\bP _{t|t-1} = \bF \bP _{t-1|t-1}\bF ^{\prime } + \bG \bQ \bG ^{\prime }$

Using the current observation, the filtered value of $\textbf{x}_ t$ and its variance $\bP _{t|t}$ are updated.

$\textbf{x}_{t|t} = \textbf{x}_{t|t-1} + \bK _ t e_ t$

$\bP _{t|t} = (\bI - \bK _ t \bH _ t)\bP _{t|t-1}$

where $e_ t = y_ t - \bH _ t\textbf{x}_{t|t-1}$ and $\bK _ t = \bP _{t|t-1}\bH ^{\prime }_ t[\bH _ t \bP _{t|t-1}\bH ^{\prime }_ t + \sigma ^2\bI ]^{-1}$ . The log-likelihood function is computed as

$\ell = -\frac{1}{2}\sum _{t=1}^ T \log (2\pi v_{t|t-1}) - \sum _{t=1}^ T \frac{e_ t^2}{2v_{t|t-1}}$

where $v_{t|t-1}$ is the conditional variance of the one-step prediction error .

Consider the additive time series decomposition

$y_ t = T_ t + S_ t + T\! D_ t + u_ t + \textbf{x}^{\prime }_ t\beta _ t + \epsilon _ t$

where $\textbf{x}_ t$ is a $(K\times 1)$ regressor vector and $\beta _ t$ is a $(K\times 1)$ time-varying coefficient vector. Each component has the following constraints:

$\displaystyle \nabla ^ k T_ t$	$\displaystyle =$	$\displaystyle w_{1t},w_{1t}\sim N(0,\tau _1^2)$
$\displaystyle \nabla _ L^ m S_ t$	$\displaystyle =$	$\displaystyle w_{2t},w_{2t}\sim N(0,\tau _2^2)$
$\displaystyle u_ t$	$\displaystyle =$	$\displaystyle \sum _{i=1}^ p \alpha _ i u_{t-i} + w_{3t}, w_{3t}\sim N(0,\tau _3^2)$
$\displaystyle \beta _{jt}$	$\displaystyle =$	$\displaystyle \beta _{j,t-1} + w_{3+j,t}, w_{3+j,t}\sim N(0,\tau _{3+j}^2),j=1,\cdots ,K$
$\displaystyle \sum _{i=1}^7 \gamma _{it}T\! D_ t(i)$	$\displaystyle =$	$\displaystyle \sum _{i=1}^6 \gamma _{it}(T\! D_ t(i)-T\! D_ t(7))$
$\displaystyle \gamma _{it}$	$\displaystyle =$	$\displaystyle \gamma _{i,t-1}$

where $\nabla ^ k = (1-B)^ k$ and $\nabla _ L^ m = (1-B^ L)^ m$ . The AR component is assumed to be stationary. The trading-day component $T\! D_ t(i)$ represents the number of the th day of the week in time . If , and (monthly data),

$\displaystyle T_ t$	$\displaystyle =$	$\displaystyle 3T_{t-1} - 3T_{t-2} + T_{t-3} + w_{1t}$
$\displaystyle \sum _{i=0}^{11} S_{t-i}$	$\displaystyle =$	$\displaystyle w_{2t}$
$\displaystyle u_ t$	$\displaystyle =$	$\displaystyle \sum _{i=1}^3 \alpha _ i u_{t-i} + w_{3t}$

The state vector is defined as

$\textbf{x}_ t = (T_ t,T_{t-1},T_{t-2},S_ t,\ldots ,S_{t-11},u_ t,u_{t-1},u_{t-2}, \gamma _{1t},\ldots ,\gamma _{6t})^{\prime }$

The matrix $\bF$ is

$\bF = \left[\begin{array}{cccc} \bF _1 & 0 & 0 & 0 \\ 0 & \bF _2 & 0 & 0 \\ 0 & 0 & \bF _3 & 0 \\ 0 & 0 & 0 & \bF _4 \end{array}\right]$

where

$\bF _1 = \left[\begin{array}{rrr} 3 & -3 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{array}\right]$

$\bF _2 = \left[\begin{array}{rr} -\mb {1}^{\prime } & -1 \\ \bI _{10} & 0 \end{array}\right]$

$\bF _3 = \left[\begin{array}{ccc} \alpha _1 & \alpha _2 & \alpha _3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{array}\right]$

$\bF _4 = \bI _6$

$\mb {1}^{\prime } = (1,1,\ldots ,1)$

The matrix can be denoted as

$G = \left[\begin{array}{ccc} \textbf{g}_1 & 0 & 0 \\ 0 & \textbf{g}_2 & 0 \\ 0 & 0 & \textbf{g}_3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array}\right]$

where

$\textbf{g}_1 = \textbf{g}_3 = \left[\begin{array}{ccc} 1 & 0 & 0 \end{array}\right]^{\prime }$

$\textbf{g}_2 = \left[\begin{array}{cccccc} 1 & 0 & 0 & 0 & 0 & 0 \end{array}\right]^{\prime }$

Finally, the matrix $\bH _ t$ is time-varying,

$\bH _ t = \left[\begin{array}{cccccccccccccccccc} 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & \textbf{h}^{\prime }_ t \end{array}\right]$

where

$\displaystyle \textbf{h}_ t$	$\displaystyle =$	$\displaystyle \left[\begin{array}{cccccc} D_ t(1) & D_ t(2) & D_ t(3) & D_ t(4) & D_ t(5) & D_ t(6) \end{array}\right]^{\prime }$
$\displaystyle D_ t(i)$	$\displaystyle =$	$\displaystyle T\! D_ t(i)-T\! D_ t(7),i=1,\ldots ,6$

Missing Values

The TIMSAC subroutines skip any missing values at the beginning of the data set. When the univariate and multivariate AR models are estimated via least squares (TSMLOCAR, TSMLOMAR, TSUNIMAR, TSMULMAR, and TSPEARS), there are three options available; that is, MISSING=0, MISSING=1, or MISSING=2. When the MISSING=0 (default) option is specified, the first contiguous observations with no missing values are used. The MISSING=1 option specifies that only nonmissing observations should be used by ignoring the observations with missing values. If the MISSING=2 option is specified, the missing values are filled with the sample mean. The least squares estimator with the MISSING=2 option is biased in general.

The BAYSEA subroutine assumes the same prior distribution of the trend and seasonal components that correspond to the missing observations. A modification is made to skip the components of the vector $\textbf{g}(\textbf{a}|d)$ that correspond to the missing observations. The vector $\textbf{g}(\textbf{a}|d)$ is defined in the section Bayesian Constrained Least Squares. In addition, the TSBAYSEA subroutine considers outliers as missing values. The TSDECOMP and TSTVCAR subroutines skip the Kalman filter updating equation when the current observation is missing.

ISM TIMSAC Packages

A description of each TIMSAC package follows. Each description includes a list of the programs provided in the TIMSAC version.

TIMSAC-72

The TIMSAC-72 package analyzes and controls feedback systems (for example, a cement kiln process). Univariate- and multivariate-AR models are employed in this original TIMSAC package. The final prediction error (FPE) criterion is used for model selection.

AUSPEC estimates the power spectrum by the Blackman-Tukey procedure.
AUTCOR computes autocovariance and autocorrelation.
DECONV computes the impulse response function.
FFTCOR computes autocorrelation and crosscorrelation via the fast Fourier transform.
FPEAUT computes AR coefficients and FPE for the univariate AR model.
FPEC computes AR coefficients and FPE for the control system or multivariate AR model.
MULCOR computes multiple covariance and correlation.
MULNOS computes relative power contribution.
MULRSP estimates the rational spectrum for multivariate data.
MULSPE estimates the cross spectrum by Blackman-Tukey procedure.
OPTDES performs optimal controller design.
OPTSIM performs optimal controller simulation.
RASPEC estimates the rational spectrum for univariate data.
SGLFRE computes the frequency response function.
WNOISE performs white noise simulation.

TIMSAC-74

The TIMSAC-74 package estimates and forecasts univariate and multivariate ARMA models by fitting the canonical Markovian model. A locally stationary autoregressive model is also analyzed. Akaike’s information criterion (AIC) is used for model selection.

AUTARM performs automatic univariate ARMA model fitting.
BISPEC computes bispectrum.
CANARM performs univariate canonical correlation analysis.
CANOCA performs multivariate canonical correlation analysis.
COVGEN computes the covariance from gain function.
FRDPLY plots the frequency response function.
MARKOV performs automatic multivariate ARMA model fitting.
NONST estimates the locally stationary AR model.
PRDCTR performs ARMA model prediction.
PWDPLY plots the power spectrum.
SIMCON performs optimal controller design and simulation.
THIRMO computes the third-order moment.

TIMSAC-78

The TIMSAC-78 package uses the Householder transformation to estimate time series models. This package also contains Bayesian modeling and the exact maximum likelihood estimation of the ARMA model. Minimum AIC or Akaike Bayesian information criterion (ABIC) modeling is extensively used.

BLOCAR estimates the locally stationary univariate AR model by using the Bayesian method.
BLOMAR estimates the locally stationary multivariate AR model by using the Bayesian method.
BSUBST estimates the univariate subset regression model by using the Bayesian method.
EXSAR estimates the univariate AR model by using the exact maximum likelihood method.
MLOCAR estimates the locally stationary univariate AR model by using the minimum AIC method.
MLOMAR estimates the locally stationary multivariate AR model by using the minimum AIC method.
MULBAR estimates the multivariate AR model by using the Bayesian method.
MULMAR estimates the multivariate AR model by using the minimum AIC method.
NADCON performs noise adaptive control.
PERARS estimates the periodic AR model by using the minimum AIC method.
UNIBAR estimates the univariate AR model by using the Bayesian method.
UNIMAR estimates the univariate AR model by using the minimum AIC method.
XSARMA estimates the univariate ARMA model by using the exact maximum likelihood method.

In addition, the following test subroutines are available: TSSBST, TSWIND, TSROOT, TSTIMS, and TSCANC.

TIMSAC-84

The TIMSAC-84 package contains the Bayesian time series modeling procedure, the point process data analysis, and the seasonal adjustment procedure.

ADAR estimates the amplitude dependent AR model.
BAYSEA performs Bayesian seasonal adjustments.
BAYTAP performs Bayesian tidal analysis.
DECOMP performs time series decomposition analysis by using state space modeling.
EPTREN estimates intensity rates of either the exponential polynomial or exponential Fourier series of the nonstationary Poisson process model.
LINLIN estimates linear intensity models of the self-exciting point process with another process input and with cyclic and trend components.
LINSIM performs simulation of the point process estimated by the subroutine LINLIN.
LOCCAR estimates the locally constant AR model.
MULCON performs simulation, control, and prediction of the multivariate AR model.
NONSPA performs nonstationary spectrum analysis by using the minimum Bayesian AIC procedure.
PGRAPH performs graphical analysis for point process data.
PTSPEC computes periodograms of point process data with significant bands.
SIMBVH performs simulation of bivariate Hawkes’ mutually exciting point process.
SNDE estimates the stochastic nonlinear differential equation model.
TVCAR estimates the time-varying AR coefficient model by using state space modeling.

Refer to Kitagawa and Akaike (1981) and Ishiguro (1987) for more information about TIMSAC programs.