To implement duplicate-data
checking, perform the following steps:
- Right-click the staging transformation and select Properties.
- In the Properties dialog box, select the Staging Parameters tab.
- On the Duplicate Checking page, ensure that the Enable duplicate checking field is set to Yes. This setting enables you to specify the parameters that govern the duplicate-data checking process.
  Note: The SNMP adapter requires that duplicate checking be turned on. Neither method of gathering raw data for SNMP (HPNNM and RRDtool) can ensure that only the most recent raw data is saved. Therefore, invoking the duplicate-data checking code of SAS IT Resource Management is the only way to determine which data is new and which is duplicate.
  If you do not want to implement duplicate-data checking, set the Enable duplicate checking field to No. This setting makes the duplicate-data checking parameters unavailable.
- Specify the following parameters:
  - Duplicate checking option specifies how duplicate data is handled. Select one of the following options:
    - stops processing if duplicate data is encountered.
    - Discard: continues processing while rejecting duplicate data if it is encountered. This is the default value for this parameter.
      Note: For best results, the value for the Duplicate checking option parameter for the SNMP adapter should always be set to Discard.
    - continues processing and accepts duplicate data if it is encountered.
    Note: Duplicate-data checking macros are designed to prevent the same data from being processed into the IT data mart twice. However, you might sometimes need to backload data: that is, process data in a datetime range for which the permanent control data sets have already recorded machine or system data. (For example, you might need to process data into one or more tables that you did not use earlier, or into tables that you accidentally purged or deleted.) Be sure to restore the Duplicate checking option setting to its original value after you finish the backloading task.
  - identifies the SAS variable that is used to denote the origin of each incoming record.
    Note: This parameter is visible only for the CSV, RRDtool, and user-written adapters.
  - specifies the maximum time gap (or interval) that is allowed between the timestamps on any two consecutive records from the same system or machine. If the interval between the timestamp values exceeds the value of this parameter, then an observation with the new time range is created in the control data set. This is referred to as a gap in the data.
    The value for this parameter must be provided in the format hh:mm, where hh represents hours and mm represents minutes. For example, to specify an interval of 14 minutes, use INT=0:14. To specify an interval of 1 hour and 29 minutes, use INT=1:29.
  - specifies the number of weeks for which control data are kept. Because this value represents the number of Sundays between two dates, a value of 2 results in a maximum retention period of 20 days. This value must be an integer.
  - The REPORT parameter specifies whether to display the duplicate-data checking messages in the SAS log or to save them in an audit table. If set to Yes, all messages from duplicate-data checking are displayed in the SAS log. If set to No, the messages are saved in an audit data table that is stored in the staging library. The name of the audit table is sourceAUDIT (where source is the three-character data source code).
    Note: If you are monitoring very large numbers of resources, setting this option to No can be beneficial. Eliminating the report reduces CPU consumption, shortens elapsed time, and makes the SAS log more manageable.
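The three dispositions of the Duplicate checking option can be sketched in miniature. The following Python sketch is illustrative only: the mode labels STOP, DISCARD, and ACCEPT are placeholders (not the literal option values in the Properties dialog box), and the `seen` set is a simplified stand-in for the permanent control data sets that record which data has already been processed.

```python
from enum import Enum

class DupOption(Enum):
    STOP = "stop"        # halt processing when a duplicate is seen
    DISCARD = "discard"  # skip duplicates, keep going (the default)
    ACCEPT = "accept"    # keep duplicates, keep going

def stage_records(records, seen, option=DupOption.DISCARD):
    """Process (machine, timestamp, value) records, treating any
    (machine, timestamp) pair already in `seen` as duplicate data."""
    accepted = []
    for machine, ts, value in records:
        key = (machine, ts)
        if key in seen:
            if option is DupOption.STOP:
                raise RuntimeError(f"duplicate data encountered: {key}")
            if option is DupOption.DISCARD:
                continue  # reject the duplicate, keep processing
        seen.add(key)
        accepted.append((machine, ts, value))
    return accepted
```

With DISCARD, rerunning the same input against the same `seen` set adds no rows, which is why the setting must be relaxed temporarily when backloading and then restored.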
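The hh:mm interval parameter and the resulting time ranges can also be illustrated with a short sketch. This is a simplification under stated assumptions: the real control data sets track time ranges per machine or system, whereas here a plain list of timestamps from a single machine stands in for them.

```python
from datetime import datetime, timedelta

def parse_interval(hhmm: str) -> timedelta:
    # "0:14" -> 14 minutes; "1:29" -> 1 hour 29 minutes
    hh, mm = hhmm.split(":")
    return timedelta(hours=int(hh), minutes=int(mm))

def split_time_ranges(timestamps, interval):
    """Group one machine's timestamps into time ranges. A new range
    (a "gap in the data") starts whenever two consecutive timestamps
    are more than `interval` apart."""
    ranges = []
    for ts in sorted(timestamps):
        if ranges and ts - ranges[-1][1] <= interval:
            ranges[-1][1] = ts          # extend the current range
        else:
            ranges.append([ts, ts])     # gap: start a new range
    return [(start, end) for start, end in ranges]
```

For example, with INT=0:14, records at 8:00 and 8:10 fall in one range, while a following record at 8:40 opens a second range, because the 30-minute gap exceeds the 14-minute interval.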
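The "number of Sundays" arithmetic behind the retention value can be checked with a small sketch. The purge rule modeled here (data is dropped once more than the configured number of Sundays falls between its date and the current date) is an assumption inferred from the description above, used only to show how a value of 2 yields a 20-day maximum.

```python
from datetime import date, timedelta

def sundays_between(start: date, end: date) -> int:
    # Count Sundays strictly after `start`, up to and including `end`.
    return sum(
        1
        for i in range(1, (end - start).days + 1)
        if (start + timedelta(days=i)).weekday() == 6  # Sunday
    )

# With a retention value of 2, data recorded on Monday 2024-01-01
# survives while no more than 2 Sundays separate it from "today".
oldest = date(2024, 1, 1)                # a Monday
last_kept = oldest
while sundays_between(oldest, last_kept + timedelta(days=1)) <= 2:
    last_kept += timedelta(days=1)
retention_days = (last_kept - oldest).days + 1   # inclusive span
```

Here `last_kept` lands on Saturday 2024-01-20, giving the 20-day maximum: the Monday start contributes six days before the first Sunday, plus two further whole weeks.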
Note: Prior to SAS IT Resource Management 3.3, you were required to create catalog entries or files in the MXG source library of your operating system in order to handle duplicate-data checking. Although these members or files are no longer necessary, SAS IT Resource Management continues to honor them if they exist. However, it is preferable to manage duplicate-data checking by specifying the appropriate values on the staging transformation.