SAS Macros and Functions


DFTEST Macro

The %DFTEST macro performs the Dickey-Fuller unit root test. You can use the %DFTEST macro to decide whether a time series is stationary and to determine the order of differencing required for the time series analysis of a nonstationary series.

Most time series analysis methods require that the series to be analyzed is stationary. However, many economic time series are nonstationary processes. The usual approach to this problem is to difference the series. A time series that can be made stationary by differencing is said to have a unit root. For more information, see the discussion of this issue in the section Getting Started: ARIMA Procedure of Chapter 7: The ARIMA Procedure.

The Dickey-Fuller test is a method for testing whether a time series has a unit root. The %DFTEST macro tests the hypothesis $H_{0}$: "The time series has a unit root" versus $H_{a}$: "The time series is stationary" based on tables provided in Dickey (1976) and in Dickey, Hasza, and Fuller (1984). The test can be applied for a simple unit root with lag 1, or for seasonal unit roots at lags 2, 4, or 12.

Note that the %DFTEST macro has been superseded by the PROC ARIMA stationarity tests. See Chapter 7: The ARIMA Procedure, for details.

Syntax

The %DFTEST macro has the following form:

  • %DFTEST ( SAS-data-set, variable < , options > );

The first argument, SAS-data-set, specifies the name of the SAS data set that contains the time series variable to be analyzed.

The second argument, variable, specifies the name of the time series variable to be analyzed.

The first two arguments are required. The following options can be used with the %DFTEST macro. Options must follow the required arguments and are separated by commas.

AR=n

specifies the order of the autoregressive model that is fit after any differencing specified by the DIF= and DLAG= options. The default is AR=3.

DIF=( differencing-list )

specifies the degrees of differencing to be applied to the series. The differencing list is a list of positive integers separated by commas and enclosed in parentheses. For example, DIF=(1,12) specifies that the series be differenced once at lag 1 and once at lag 12. For more details, see the section IDENTIFY Statement in Chapter 7: The ARIMA Procedure.

If the option DIF=($d_{1}$, $\cdots$, $d_{k}$) is specified, the series analyzed is ${(1-B^{d_{1}}){\cdots }(1-B^{d_{k}})Y_{t}}$, where ${Y_{t}}$ is the variable specified and $B$ is the backshift operator defined by ${B Y_{t} = Y_{t-1}}$.
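As an illustration of the operator notation, DIF=(1,12) corresponds to ${(1-B)(1-B^{12})Y_{t} = Y_{t} - Y_{t-1} - Y_{t-12} + Y_{t-13}}$. The following Python sketch applies the lag differences one after another and checks the result against that expansion. The function name `apply_differencing` and the sample data are hypothetical; the sketch only demonstrates the arithmetic and is not the macro's own code.

```python
def apply_differencing(series, lags):
    """Apply (1 - B^d) for each lag d in `lags`, in order.

    Each pass drops the first d values, for which Y_{t-d} is unavailable.
    """
    result = list(series)
    for d in lags:
        result = [result[i] - result[i - d] for i in range(d, len(result))]
    return result


# Illustrative data only: Y_t = t^2 for t = 0..19.
y = [t * t for t in range(20)]
diffed = apply_differencing(y, (1, 12))

# The first usable observation is t = 13, because 1 + 12 leading values are lost.
expanded = [y[t] - y[t - 1] - y[t - 12] + y[t - 13] for t in range(13, len(y))]
assert diffed == expanded
```

Note that each differencing lag removes that many leading observations, which is why the differencing orders enter the minimum-observation requirement.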

DLAG=1 | 2 | 4 | 12

specifies the lag to be tested for a unit root. The default is DLAG=1.

OUT=SAS-data-set

writes residuals to an output data set.

OUTSTAT=SAS-data-set

writes the test statistic, parameter estimates, and other statistics to an output data set.

TREND=0 | 1 | 2

specifies the degree of deterministic time trend included in the model. TREND=0 includes no deterministic term and assumes the series has a zero mean. TREND=1 includes an intercept term. TREND=2 specifies an intercept and a linear time trend term. The default is TREND=1. TREND=2 is not allowed with DLAG=2, 4, or 12.

Results

The computed p-value is returned in the macro variable &DFTEST. If the p-value is less than 0.01 or larger than 0.99, the macro variable &DFTEST is set to 0.01 or 0.99, respectively. (The same value is given in the macro variable &DFPVALUE returned by the %DFPVALUE macro, which is used by the %DFTEST macro to compute the p-value.)
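The reporting rule for &DFTEST can be pictured as a simple clamp. The Python sketch below is illustrative only: the function name `reported_pvalue` is hypothetical, and the macro itself obtains the p-value through %DFPVALUE rather than through code like this.

```python
def reported_pvalue(p):
    """Clamp a p-value to the [0.01, 0.99] range described for &DFTEST."""
    return min(max(p, 0.01), 0.99)
```

In practice this means that a returned value of 0.01 should be read as "0.01 or smaller", and a returned value of 0.99 as "0.99 or larger".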

Results can be stored in SAS data sets with the OUT= and OUTSTAT= options.

Minimum Observations

The minimum number of observations required by the %DFTEST macro depends on the value of the DLAG= option. Let s be the sum of the differencing orders specified by the DIF= option, let t be the value of the TREND= option, and let p be the value of the AR= option. The minimum number of observations required is as follows:

DLAG=   Minimum Observations
1       ${1+p+s+\max ( 9, p+t+2 ) }$
2       ${2+p+s+\max ( 6, p+t+2 ) }$
4       ${4+p+s+\max ( 4, p+t+2 ) }$
12      ${12+p+s+\max ( 12, p+t+2 ) }$

Observations are not used if they have missing values for the series or for any lag or difference used in the autoregressive model.
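The minimum-observation formulas can be checked with a small calculator. The following Python sketch is an illustration under stated assumptions: the function name `min_observations` is hypothetical, the argument defaults mirror the documented defaults (DLAG=1, AR=3, TREND=1), and s is assumed to be 0 when the DIF= option is omitted.

```python
# Constant inside max() for each supported DLAG= value, taken from the table.
_MAX_FLOOR = {1: 9, 2: 6, 4: 4, 12: 12}


def min_observations(dlag=1, p=3, s=0, t=1):
    """Return the minimum number of observations for the given options.

    dlag: DLAG= value; p: AR= value; s: sum of the DIF= orders
    (assumed 0 when DIF= is omitted); t: TREND= value.
    """
    if dlag not in _MAX_FLOOR:
        raise ValueError("DLAG= must be 1, 2, 4, or 12")
    if t == 2 and dlag != 1:
        raise ValueError("TREND=2 is not allowed with DLAG=2, 4, or 12")
    return dlag + p + s + max(_MAX_FLOOR[dlag], p + t + 2)
```

With all defaults the requirement is 1 + 3 + 0 + max(9, 6) = 13 observations. With DLAG=12 and DIF=(1,12), so that s = 13, it is 12 + 3 + 13 + max(12, 6) = 40. These counts refer to usable observations, so any observations dropped for missing values must be added on top.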