When a data set contains a time ID variable with corrupted, missing, or duplicate values, PROC TIMEID can help isolate and identify these problematic observations. For a data set with a small number of ID variable anomalies and a known time interval, a graphical depiction of the problem areas can be created using the following statements:
proc timeid data=<input-dataset> plot=values;
   id <time-ID-variable> interval=<frequency>;
run;
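As a concrete illustration of the template above, the following sketch builds a small monthly data set that deliberately contains one duplicate time ID value and one missing month, and then plots its time ID values. The data set name work.flawed, the variable names date and value, and the data values are hypothetical and chosen only for this example. ODS Graphics is enabled first because the PLOT= option produces graphical output.

```sas
/* Hypothetical example data: a monthly series in which       */
/* February 2020 appears twice and March 2020 is missing.     */
data work.flawed;
   format date date9.;
   input date :date9. value;
   datalines;
01JAN2020 10
01FEB2020 12
01FEB2020 13
01APR2020 15
01MAY2020 14
;
run;

/* Plots are produced only when ODS Graphics is enabled. */
ods graphics on;

/* Plot the time ID values against the known monthly interval. */
proc timeid data=work.flawed plot=values;
   id date interval=month;
run;
```

With data like these, the duplicated and skipped months are the observations that the plot is intended to make visible.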
For larger data sets whose quality is unknown, it can be useful to get a general overview of the relative number of observations with problematic time ID values. The following statements graphically summarize the prevalence of anomalous time ID values:
proc timeid data=<input-dataset> plot=(intervalcounts offsets spans);
   id <time-ID-variable> interval=<frequency>;
run;
When the time interval that separates observations is not fully known in advance, PROC TIMEID can infer it. To request this, omit the INTERVAL= option from the ID statement, as in the following statements:
proc timeid data=<input-dataset> outinterval=<output-dataset>;
   id <time-ID-variable>;
run;
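As a minimal sketch of interval inference, the following statements apply the template above to the sample data set sashelp.air, which records monthly airline passenger counts under the time ID variable DATE. The output data set name work.air_interval is hypothetical. The PROC PRINT step is included only to display whatever PROC TIMEID writes to the OUTINTERVAL= data set.

```sas
/* Infer the time interval of sashelp.air by omitting INTERVAL=. */
/* work.air_interval is a hypothetical output data set name.     */
proc timeid data=sashelp.air outinterval=work.air_interval;
   id date;
run;

/* Display the contents of the OUTINTERVAL= data set. */
proc print data=work.air_interval;
run;
```

Because the observations in sashelp.air are one month apart, this is a convenient case for checking that the inferred interval matches what is already known about the data.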