Some data files group time series data by cross-section identifiers. For example, International Financial Statistics files, distributed by the IMF, group data by country (COUNTRY). Within each country, data are further grouped by Control Source Code (CSC), Partner Country Code (PARTNER), and Version Code (VERSION).
If a data file contains cross-section identifiers, the DATASOURCE procedure adds them to the output data set as BY variables. For example, the data set in Table 12.2 contains three cross sections:
Cross section one is identified by (COUNTRY='112' CSC='F' PARTNER=' ' VERSION='Z').
Cross section two is identified by (COUNTRY='146' CSC='F' PARTNER=' ' VERSION='Z').
Cross section three is identified by (COUNTRY='158' CSC='F' PARTNER=' ' VERSION='Z').
Table 12.2: The Form of a SAS Data Set Containing BY Variables
COUNTRY, CSC, PARTNER, and VERSION are the BY variables; DATE is the time ID variable; EFFEXR and EXRINDEX are the time series variables. PARTNER is blank in every observation.

| COUNTRY | CSC | PARTNER | VERSION | DATE | EFFEXR | EXRINDEX |
|---|---|---|---|---|---|---|
| 112 | F |  | Z | SEP1987 | 9326 | 12685 |
| 112 | F |  | Z | OCT1987 | 9393 | 12813 |
| 112 | F |  | Z | NOV1987 | 9626 | 13694 |
| 112 | F |  | Z | DEC1987 | 9675 | 14099 |
| 112 | F |  | Z | JAN1988 | 9581 | 13910 |
| 112 | F |  | Z | FEB1988 | 9493 | 13549 |
| 146 | F |  | Z | SEP1987 | 12046 | 16192 |
| 146 | F |  | Z | OCT1987 | 12067 | 16266 |
| 146 | F |  | Z | NOV1987 | 12558 | 17596 |
| 146 | F |  | Z | DEC1987 | 12759 | 18301 |
| 146 | F |  | Z | JAN1988 | 12642 | 18082 |
| 146 | F |  | Z | FEB1988 | 12409 | 17470 |
| 158 | F |  | Z | SEP1987 | 13841 | 16558 |
| 158 | F |  | Z | OCT1987 | 13754 | 16499 |
| 158 | F |  | Z | NOV1987 | 14222 | 17505 |
| 158 | F |  | Z | DEC1987 | 14768 | 18423 |
| 158 | F |  | Z | JAN1988 | 14933 | 18565 |
| 158 | F |  | Z | FEB1988 | 14915 | 18331 |
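To make the role of the BY variables concrete, the following is a minimal Python sketch (not SAS code, and not how the DATASOURCE procedure itself works) that groups a few rows copied from Table 12.2 by their BY-variable values. The names `rows`, `by_key`, and `cross_sections` are hypothetical and chosen only for illustration.

```python
from itertools import groupby

# A few observations copied from Table 12.2, in the column order
# (COUNTRY, CSC, PARTNER, VERSION, DATE, EFFEXR, EXRINDEX).
rows = [
    ("112", "F", " ", "Z", "SEP1987", 9326, 12685),
    ("112", "F", " ", "Z", "OCT1987", 9393, 12813),
    ("146", "F", " ", "Z", "SEP1987", 12046, 16192),
    ("146", "F", " ", "Z", "OCT1987", 12067, 16266),
    ("158", "F", " ", "Z", "SEP1987", 13841, 16558),
    ("158", "F", " ", "Z", "OCT1987", 13754, 16499),
]


def by_key(row):
    """Return the BY-variable values that identify a row's cross section."""
    return row[:4]


# groupby merges only adjacent rows with equal keys, so this relies on the
# rows already being ordered by the BY variables, as they are in the table.
cross_sections = {
    key: [row[4:] for row in group]
    for key, group in groupby(rows, key=by_key)
}

for key, series in cross_sections.items():
    print(key, len(series))
```

Each distinct combination of BY-variable values becomes one key, so the three countries in the sample yield three cross sections, each holding its own (DATE, EFFEXR, EXRINDEX) observations.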
Note that the data sets in Table 12.1 and Table 12.2 use two different ways of representing time series data for three different countries: the United Kingdom (COUNTRY='112'), Switzerland (COUNTRY='146'), and Japan (COUNTRY='158'). The first representation (Table 12.1) incorporates each country's name into the series names, while the second representation (Table 12.2) represents countries as different cross sections by using the BY variable named COUNTRY. See "Time Series and SAS Data Sets" in Chapter 3: Working with Time Series Data.
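The relationship between the two representations can be sketched in Python as a reshape from one row per country and date to one row per date. This is only an illustration under stated assumptions: the wide series names built here (for example `EFFEXR_112`) are hypothetical and use the country code as a suffix; they are not the series names that appear in Table 12.1.

```python
from collections import defaultdict

# (COUNTRY, DATE, EFFEXR, EXRINDEX) values copied from Table 12.2.
long_rows = [
    ("112", "SEP1987", 9326, 12685),
    ("112", "OCT1987", 9393, 12813),
    ("146", "SEP1987", 12046, 16192),
    ("146", "OCT1987", 12067, 16266),
    ("158", "SEP1987", 13841, 16558),
    ("158", "OCT1987", 13754, 16499),
]


def to_wide(rows):
    """Fold the country identifier into the series names, one record per date."""
    wide = defaultdict(dict)
    for country, date, effexr, exrindex in rows:
        # Hypothetical naming scheme: <series>_<country code>.
        wide[date][f"EFFEXR_{country}"] = effexr
        wide[date][f"EXRINDEX_{country}"] = exrindex
    return dict(wide)


wide = to_wide(long_rows)
print(wide["SEP1987"])
```

In the wide form, each date has a single record with one column per series and country. In the BY-variable form, the number of columns stays fixed and adding a country only adds rows.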