The ARIMA Procedure

Example 7.4 An Intervention Model for Ozone Data

This example fits an intervention model to ozone data as suggested by Box and Tiao (1975). Notice that the response variable, OZONE, and the intervention, X1, are seasonally differenced. The final model for the differenced data is a multiple regression model with a moving-average structure assumed for the residuals.
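Written out, the model described above can be sketched as follows. The notation is introduced here for illustration and is not taken from the example itself: B is the backshift operator, the omega terms are the regression coefficients of the three inputs, theta_1 and Theta_1 are the nonseasonal and seasonal moving-average parameters, and a_t is the white-noise error. The sign convention for the moving-average factors is assumed to follow the "1 - theta B" form that the procedure prints in its model summary.

```latex
(1 - B^{12})\,\mathrm{ozone}_t
  = \omega_1 (1 - B^{12})\,\mathrm{x1}_t
  + \omega_2\,\mathrm{summer}_t
  + \omega_3\,\mathrm{winter}_t
  + (1 - \theta_1 B)(1 - \Theta_1 B^{12})\,a_t
```

Only OZONE and X1 carry the seasonal difference; SUMMER and WINTER enter undifferenced, and no constant term is included.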

The model is fit by maximum likelihood. The seasonal moving-average parameter and its standard error are fairly sensitive to which method is chosen to fit the model (Ansley and Newbold, 1980; Davidson, 1981); thus, fitting the model by unconditional or conditional least squares produces somewhat different estimates of this parameter and its standard error.

Some missing values are appended to the end of the input data to generate additional values for the independent variables. Since the independent variables are not modeled, values for them must be available for any times at which predicted values are desired. In this case, predicted values are requested for 12 periods beyond the end of the data. Thus, values for X1, WINTER, and SUMMER must be given for 12 periods ahead.

The following statements read in the data and compute dummy variables for use as intervention inputs:

title1 'Intervention Data for Ozone Concentration';
title2 '(Box and Tiao, JASA 1975 P.70)';
data air;
   input ozone @@;
   label ozone  = 'Ozone Concentration'
         x1     = 'Intervention for post 1960 period'
         summer = 'Summer Months Intervention'
         winter = 'Winter Months Intervention';
   date = intnx( 'month', '31dec1954'd, _n_ );
   format date monyy.;
   month = month( date );
   year = year( date );
   x1 = year >= 1960;
   summer = ( 5 < month < 11 ) * ( year > 1965 );
   winter = ( year > 1965 ) - summer;
datalines;
2.7  2.0  3.6  5.0  6.5  6.1  5.9  5.0  6.4  7.4  8.2  3.9
4.1  4.5  5.5  3.8  4.8  5.6  6.3  5.9  8.7  5.3  5.7  5.7
3.0  3.4  4.9  4.5  4.0  5.7  6.3  7.1  8.0  5.2  5.0  4.7
3.7  3.1  2.5  4.0  4.1  4.6  4.4  4.2  5.1  4.6  4.4  4.0

   ... more lines ...   
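
Because the three inputs are deterministic functions of the calendar date, their values for the forecast horizon can be worked out in advance. The following Python sketch mirrors the indicator logic of the DATA step above; it is only an illustration, not part of the SAS program. The function name intervention_inputs is hypothetical, and the mapping of observation 1 to January 1955 is an assumption read off the INTNX call.

```python
def intervention_inputs(obs):
    """Return (year, month, x1, summer, winter) for a 1-based observation number."""
    # Assumption: observation 1 is January 1955, as implied by
    # intnx('month', '31dec1954'd, _n_) in the DATA step.
    year = 1955 + (obs - 1) // 12
    month = (obs - 1) % 12 + 1
    x1 = int(year >= 1960)                            # step: 1 from 1960 onward
    summer = int(5 < month < 11) * int(year > 1965)   # June-October after 1965
    winter = int(year > 1965) - summer                # remaining months after 1965
    return year, month, x1, summer, winter


# Inputs for the 12 forecast periods (observations 217 through 228).
for obs in range(217, 229):
    print(obs, intervention_inputs(obs))
```

Under these assumptions, X1 is 1 for every forecast period, SUMMER is 1 only for June through October, and WINTER is 1 for the remaining months.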

The following statements produce Output 7.4.1 through Output 7.4.3:

proc arima data=air;

   /* Identify and seasonally difference ozone series */
   identify var=ozone(12)
            crosscorr=( x1(12) summer winter ) noprint;

   /* Fit a multiple regression with a seasonal MA model */
   /*     by the maximum likelihood method               */
   estimate q=(1)(12) input=( x1 summer winter )
            noconstant method=ml;

   /* Forecast */
   forecast  lead=12 id=date interval=month;

run;
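
To see what the (12) differencing in the IDENTIFY statement does, the following Python sketch applies a lag-12 difference to the first 24 ozone values listed in the DATA step and to the X1 step variable. It is an illustration only. The helper name seasonal_difference is hypothetical, and the count of 216 observed months is an assumption inferred from the forecast observation numbers.

```python
def seasonal_difference(series, lag=12):
    """Return series[t] - series[t - lag] for every t that has a value lag steps back."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# First 24 ozone values (1955 and 1956), copied from the DATA step.
ozone = [2.7, 2.0, 3.6, 5.0, 6.5, 6.1, 5.9, 5.0, 6.4, 7.4, 8.2, 3.9,
         4.1, 4.5, 5.5, 3.8, 4.8, 5.6, 6.3, 5.9, 8.7, 5.3, 5.7, 5.7]
ozone_diff = seasonal_difference(ozone)

# x1 is a step that switches on in January 1960; index 0 is January 1955.
N_OBS = 216  # assumed number of observed months (1955 through 1972)
x1 = [int(1955 + i // 12 >= 1960) for i in range(N_OBS)]
x1_diff = seasonal_difference(x1)

print([round(d, 1) for d in ozone_diff])
print(len(x1_diff), sum(x1_diff))
```

Differencing removes the first 12 observations, which is consistent with the 204 residuals reported by the ESTIMATE statement. It also turns the X1 step into an indicator that is 1 only during the 12 months of 1960.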

The ESTIMATE statement results are shown in Output 7.4.1 and Output 7.4.2.

Output 7.4.1: Parameter Estimates

Intervention Data for Ozone Concentration
(Box and Tiao, JASA 1975 P.70)

The ARIMA Procedure

Maximum Likelihood Estimation
Parameter   Estimate   Standard Error   t Value   Approx Pr > |t|   Lag   Variable   Shift
MA1,1       -0.26684          0.06710     -3.98            <.0001     1   ozone          0
MA2,1        0.76665          0.05973     12.83            <.0001    12   ozone          0
NUM1        -1.33062          0.19236     -6.92            <.0001     0   x1             0
NUM2        -0.23936          0.05952     -4.02            <.0001     0   summer         0
NUM3        -0.08021          0.04978     -1.61            0.1071     0   winter         0

Variance Estimate 0.634506
Std Error Estimate 0.796559
AIC 501.7696
SBC 518.3602
Number of Residuals 204
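
The printed statistics can be cross-checked against one another. The Python sketch below assumes the usual definitions, which the example does not state: t value = estimate / standard error, standard error estimate = square root of the variance estimate, and SBC - AIC = k (ln n - 2) for k estimated parameters and n residuals. Because the inputs are the rounded values from Output 7.4.1, agreement is expected only to about the last printed digit.

```python
import math

# (estimate, standard error, reported t value), copied from Output 7.4.1
params = {
    "MA1,1": (-0.26684, 0.06710, -3.98),
    "MA2,1": (0.76665, 0.05973, 12.83),
    "NUM1": (-1.33062, 0.19236, -6.92),
    "NUM2": (-0.23936, 0.05952, -4.02),
    "NUM3": (-0.08021, 0.04978, -1.61),
}

# t value recomputed from the rounded estimate and standard error.
t_values = {name: est / se for name, (est, se, _) in params.items()}

variance = 0.634506
std_error_estimate = math.sqrt(variance)

aic, sbc = 501.7696, 518.3602
n_resid, n_params = 204, len(params)
# Assumed definitions: AIC = -2 ln L + 2k and SBC = -2 ln L + k ln n.
sbc_minus_aic = n_params * (math.log(n_resid) - 2)

for name, t in t_values.items():
    print(name, round(t, 2), params[name][2])
print(round(std_error_estimate, 6), round(sbc_minus_aic, 4), round(sbc - aic, 4))
```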



Output 7.4.2: Model Summary

Model for variable ozone
Period(s) of Differencing 12

Moving Average Factors
Factor 1: 1 + 0.26684 B**(1)
Factor 2: 1 - 0.76665 B**(12)

Input Number 1
Input Variable x1
Period(s) of Differencing 12
Overall Regression Factor -1.33062
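
Substituting the estimates from Output 7.4.1 into the regression-with-moving-average-errors form described at the start of this example gives the fitted model sketched below. Here B is the backshift operator and a_t is the white-noise error; this notation is introduced for illustration. The coefficients for SUMMER and WINTER are the NUM2 and NUM3 estimates. Note the sign convention: the procedure reports MA1,1 = -0.26684, which appears in the factor as 1 + 0.26684 B.

```latex
(1 - B^{12})\,\mathrm{ozone}_t
  = -1.33062\,(1 - B^{12})\,\mathrm{x1}_t
  - 0.23936\,\mathrm{summer}_t
  - 0.08021\,\mathrm{winter}_t
  + (1 + 0.26684\,B)(1 - 0.76665\,B^{12})\,a_t
```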



The FORECAST statement results are shown in Output 7.4.3.

Output 7.4.3: Forecasts

Forecasts for variable ozone
Obs   Forecast   Std Error   95% Confidence Limits
217     1.4205      0.7966   -0.1407        2.9817
218     1.8446      0.8244    0.2287        3.4604
219     2.4567      0.8244    0.8408        4.0725
220     2.8590      0.8244    1.2431        4.4748
221     3.1501      0.8244    1.5342        4.7659
222     2.7211      0.8244    1.1053        4.3370
223     3.3147      0.8244    1.6989        4.9306
224     3.4787      0.8244    1.8629        5.0946
225     2.9405      0.8244    1.3247        4.5564
226     2.3587      0.8244    0.7429        3.9746
227     1.8588      0.8244    0.2429        3.4746
228     1.2898      0.8244   -0.3260        2.9057
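
The pattern in the Std Error column (one value at lead 1, a constant larger value at leads 2 through 12) can be reproduced from the variance estimate and the nonseasonal moving-average parameter. The Python sketch below is a reading of the output, not the procedure's documented algorithm. It assumes that the inputs are treated as known, that for leads of 12 or fewer only the error weights at lags 0 and 1 contribute (the next nonzero weight is at lag 12), and that the limits are normal-theory limits. The function names are hypothetical.

```python
import math
from statistics import NormalDist

SIGMA2 = 0.634506   # Variance Estimate from Output 7.4.1
THETA1 = -0.26684   # MA1,1, the nonseasonal moving-average parameter


def forecast_std_error(lead):
    """Standard error of the forecast for leads 1 through 12 (sketch)."""
    if not 1 <= lead <= 12:
        raise ValueError("this sketch only covers leads 1 through 12")
    # Lead 1 involves only the lag-0 weight (1); leads 2-12 add the lag-1 weight.
    sum_sq_weights = 1.0 if lead == 1 else 1.0 + THETA1 ** 2
    return math.sqrt(SIGMA2 * sum_sq_weights)


def confidence_limits(forecast, std_error, level=0.95):
    """Normal-theory confidence limits around a point forecast."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return forecast - z * std_error, forecast + z * std_error


for lead, forecast in [(1, 1.4205), (2, 1.8446)]:
    se = forecast_std_error(lead)
    lower, upper = confidence_limits(forecast, se)
    print(lead, round(se, 4), round(lower, 4), round(upper, 4))
```

Under these assumptions the sketch should agree with the first two rows of Output 7.4.3 to within rounding of the printed inputs.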