CSV — Accessing Raw Data

Preparing CSV Data for SAS IT Resource Management

CSV (Comma Separated Values) files are flat files that contain rows of data. The values (or columns) on each row are separated by some sort of delimiter (CSV has come to mean any delimited data, not just comma delimited data). The delimiter is either a single character or a string of characters.

Examples of Delimiters

Type of Delimiter	Delimited Data
Comma Delimited (,)	7.8048232979 ,68.79 ,7110.58
Bang Delimited (!)	7.8048232979!68.79!7110.58
String Delimited (abc)	7.8048232979abc68.79abc7110.58

Note: In each of the preceding table entries, there are three sets of values.

CSV files often have a header line that contains the names of the columns. The names must also be delimited like the data.

Examples of Delimiters in Header Rows

Type of Delimiter	Delimited Header Rows
Comma Delimited (,)	Column 1, Column 2, Column 3
Bang Delimited (!)	Column 1!Column 2!Column 3
String Delimited (abc)	Column 1abcColumn 2abcColumn 3

Note: The header row can be on any row although it is usually the first row. The data rows begin any time after the header row, but they usually begin on the second row. Values that contain, or could contain, the delimiter are enclosed in quotation marks (“ ”) to indicate that the delimiter should be ignored.

The CSV adapter can read most styles of CSV files. There are a set of parameters on the Staging Parameter tab of the CSV staging transformation that will enable you to describe the following data:

what the delimiters are
whether there can be embedded delimiters
what row the header is on
what row the data starts on

The adapter can also read a CSV file that does not have a header row. In that case, the columns are named: Column1-Columnn.

The CSV adapter can read multiple CSV files (all the files in a specified directory) at the same time as long as the files have the same structures. In other words, if they have a header row, then the header rows must be the same. If they do not have a header row, then the files must have the same number of columns. If these conditions are not satisfied, then the adapter will not be able to read the data. In addition, an error will be displayed when you try to create the new staged table. When reading multiple CSV files, the adapter creates a single staged table as output with all the data combined.

Because this data is being read by SAS IT Resource Management, the data is expected to include a date/time or time stamp. If there is a column in the CSV file that has a header of DATETIME, then the adapter uses that column as the standard IT Resource Management DATETIME column. (The case of DATETIME is not significant.) Otherwise, a DATE column and a TIME column are combined to create the DATETIME column. If none of these headers are found in the CSV file, then the DATETIME column is created from the first encountered columns that have a datetime, date, or time value. If no date or time columns are found in the data, then the DATETIME is set to missing.

Accessing CSV Files That Contain Data Encoded in UTF-8

If your CSV rawdata files contain data that is encoded in UTF-8, then you must set the encoding on the staging LIBNAME statement in order to access them. To do this, perform the following steps:

In the SAS IT Resource Management client, navigate to the staging library for your CSV staged tables and right-click that library.
Select Properties.
Select the Options tab.
Click Advanced Options.
Select the Options for any host tab. In the Encoding to use when reading a file field, enter utf8.
Click OK.