This section outlines the use of the TIMEDATA procedure and gives a cursory description of some of the analysis techniques that you can perform on time-stamped transactional data.
Given an input data set that contains numerous transaction variables recorded over time at no specific frequency, the TIMEDATA procedure can form time series as follows:
PROC TIMEDATA DATA=<input-data-set>
              OUT=<output-data-set>;
   BY <list-of-BY-variables>;
   ID <time-ID-variable> INTERVAL=<frequency>
                         ACCUMULATE=<statistic>;
   VAR <time-series-variables>;
   /* programming statements */
RUN;
The TIMEDATA procedure forms time series from the input time-stamped transactional data. It can provide results in output data sets or in other output formats by using the Output Delivery System (ODS).
Time-stamped transactional data are recorded at no fixed interval. Analysts often want to use time series analysis techniques that require fixed-time intervals. Therefore, the transactional data must be accumulated to form a fixed-interval time series, such as daily, weekly, or monthly.
Suppose that a bank wants to analyze the transactions that are associated with each of its customers over time. Further, suppose that the data set Work.Transactions contains four variables that are related to these transactions: Customer, Date, Withdrawals, and Deposits. The following examples illustrate possible ways to analyze these transactions by using the TIMEDATA procedure.
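To try the examples, you need a data set with this layout. The following DATA step is a minimal sketch that creates a small, hypothetical Work.Transactions data set; the customer codes, dates, and amounts are invented for illustration and are not taken from any real data. The data are then sorted by Customer and Date so that BY-group processing works as expected.

```sas
/* Hypothetical sample data: all values are illustrative only */
data work.transactions;
   input customer $ date :date9. withdrawals deposits;
   format date date9.;
   datalines;
A 01JAN2024 20 0
A 01JAN2024 0 100
A 03JAN2024 15 0
B 02JAN2024 0 250
B 02JAN2024 40 0
;
run;

/* BY-group processing expects the input sorted by the BY variable */
proc sort data=work.transactions;
   by customer date;
run;
```

Note that customer A has two transactions on the same day and none on 02JAN2024, which is the irregular spacing that accumulation is meant to handle.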
The following TIMEDATA procedure statements accumulate the time-stamped transactional data to form a daily time series based on the accumulated daily totals of each type of transaction (Withdrawals and Deposits):
proc timedata data=transactions
              out=timeseries
              outarray=arrays;
   by customer;
   id date interval=day accumulate=total;
   var withdrawals deposits;
   outarrays balance;
   balance[1] = deposits[1] - withdrawals[1];
   do t = 2 to _LENGTH_;
      balance[t] = balance[t-1] + (deposits[t] - withdrawals[t]);
   end;
run;
The OUT=TIMESERIES option specifies that the resulting time series data for each customer are to be stored in the data set Work.Timeseries. The OUTARRAY=ARRAYS option specifies that the resulting time series data, along with a newly created variable, Balance, are to be stored in the data set Work.Arrays. The INTERVAL=DAY option specifies that the transactions are to be accumulated on a daily basis. The ACCUMULATE=TOTAL option specifies that the sum of the transactions is to be calculated. After the transactional data are accumulated into a time series format, many of the procedures provided with SAS/ETS software can be used to analyze the resulting time series data.
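One practical point about the running balance: a day on which a customer has no transactions yields a missing accumulated value by default, and in SAS arithmetic a missing operand makes the result missing, so the balance would stay missing from that day onward. The following variant is a sketch of one way to handle this, assuming that a day without transactions should count as zero activity. It adds the SETMISSING=0 option to the ID statement; whether zero is the right interpretation depends on your data.

```sas
proc timedata data=transactions
              out=timeseries
              outarray=arrays;
   by customer;
   /* Assumption: days with no transactions are treated as 0, not missing */
   id date interval=day accumulate=total setmissing=0;
   var withdrawals deposits;
   outarrays balance;
   /* First period: balance is the net flow for that day */
   balance[1] = deposits[1] - withdrawals[1];
   /* Later periods: previous balance plus the day's net flow */
   do t = 2 to _LENGTH_;
      balance[t] = balance[t-1] + (deposits[t] - withdrawals[t]);
   end;
run;
```

For example, with deposits of 100, 0, and 0 and withdrawals of 20, 0, and 15 on three consecutive days, the balance array would be 80, 80, and 65.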
For example, the following statements use the ARIMA procedure to model and forecast each customer’s balance data by using an ARIMA(1,0,0)(0,1,0)s model, where the seasonal period is s=7 (the number of days in a week):
proc arima data=arrays;
   by customer;
   identify var=balance(7) noprint;
   estimate p=(1) outest=estimates noprint;
   forecast id=date interval=day out=forecasts;
quit;
The OUTEST=ESTIMATES data set contains the parameter estimates of the specified model. The OUT=FORECASTS data set contains forecasts based on that model. For more detail, see the documentation for the ARIMA procedure in SAS/ETS.
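Written out, the model requested by the IDENTIFY and ESTIMATE statements can be sketched as follows. This form assumes the ARIMA procedure's default of including a constant term (NOCONSTANT is not specified); the symbols $Y_t$ (the Balance series), $B$ (the backshift operator), $\mu$ (the constant), $\phi_1$ (the autoregressive parameter), and $a_t$ (the white-noise error) are notation introduced here for illustration.

```latex
(1 - B^{7})\, Y_t = \mu + \frac{a_t}{1 - \phi_1 B}
```

The factor $(1 - B^{7})$ corresponds to the seasonal differencing requested by `balance(7)`, and the denominator $(1 - \phi_1 B)$ corresponds to the single autoregressive term requested by `p=(1)`.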
By default, the TIMEDATA procedure produces no printed output.