The VARIOGRAM Procedure

Example 102.2 An Anisotropic Case Study with Surface Trend in the Data

This example shows how to examine data for nonrandom surface trends and anisotropy. You use simulated data in which the variable is atmospheric ozone (O$_3$) concentration, measured in Dobson units (DU). The coordinates are offsets from a point in the southwest corner of the measurement area, with the east and north distances in kilometers (km). You work with the ozoneSet data set, which contains 300 measurements in a square area of 100 km $\times$ 100 km.

The following statements create the ozoneSet data set:

title 'Semivariogram Analysis in Anisotropic Case With Trend Removal';

data ozoneSet;
   input East North Ozone @@;
   datalines;
34.9 68.2 286  39.2 12.5 270  44.4 37.7 275  90.5 27.0 282 
91.1 40.8 285  98.6 61.6 294  61.8 26.7 281  64.0 11.5 274
22.4 26.5 274  89.3 18.3 279  32.3 28.3 274  31.1 53.1 279
43.0 17.5 272  79.3 42.3 283  99.9 57.9 291   1.8 24.1 273
81.7 73.5 294  22.9 32.0 273  64.9 67.5 292  76.5 56.3 285
78.7 11.7 276  61.8 99.3 307  49.1 86.6 299  40.0 35.8 273
69.3  3.8 278  23.4  9.3 270  66.3 94.3 304  71.3  6.5 275
 9.7 54.4 280  85.2 81.7 300  30.3 60.9 284  94.6 94.3 309
10.6 10.3 271  73.0 43.0 280   4.9 50.7 280  19.0 79.4 289
 2.4 73.1 287  77.7 25.2 278   8.4 27.1 276  93.5 19.7 279
 0.2 34.5 275  50.4 91.3 302  55.7 26.2 279  50.3  2.3 274
16.3 84.4 293  19.0  6.9 272  57.1 92.3 303  61.0  0.4 275
10.7 18.7 271  15.2 43.5 277  67.0 87.4 301  79.0 54.0 285
36.0 53.3 279  58.3 52.1 282  56.6 79.7 294  40.4 32.4 275
48.9 64.1 286  54.0 54.9 281  27.5 48.5 279  36.4 30.3 275
10.5 31.0 273  87.0 39.4 283  47.9 37.5 274  64.7 63.4 288
 0.5 90.8 294  22.8 22.4 275  31.1 78.8 291  93.6 49.8 290
 2.5 39.3 273  83.6 25.6 282  49.8 24.1 278  73.1 91.8 305
30.5 90.6 297  26.0 61.2 284  58.4 66.2 289  30.5  4.3 273
38.3 85.6 298  89.2 96.6 309  53.4  6.3 275  27.3 12.8 271
43.4 56.5 281  99.5 86.9 305  85.8 22.8 281  83.0 10.9 278
24.8 16.7 271  51.1 18.8 275  59.0 54.3 283  35.5 91.4 298
18.1 56.0 279  78.0 36.4 277  56.8  6.9 275  21.1 44.5 277
73.9 75.9 296  54.2  0.1 274  33.2 75.1 290  38.2  3.3 274
15.2 14.7 272  15.9 84.2 292  60.2 95.2 304   9.8 27.2 276
91.2 56.4 289  94.7 86.9 303  56.7 49.6 281  24.2  9.5 270
43.0 17.0 272  85.9 10.7 278  53.9 41.1 276  30.4 63.4 286
62.8 86.3 299  76.8 24.6 279  31.6 94.0 300  26.9 73.8 287
18.9 68.4 284  99.4 37.2 285  79.1  3.3 277  34.9 74.7 289
 6.4 33.8 277  48.4 82.2 294  86.0 58.0 289  92.0 60.4 293
50.2 91.6 300  12.2 38.3 275  72.7 48.9 283  82.7 34.1 279
77.0 51.0 286  86.6 15.8 278  42.0 42.7 277  99.3  8.2 278
17.4 70.6 286  11.2 92.4 295  60.2 28.8 280  92.0 73.3 297
25.3 30.6 273  36.6  8.9 274  34.2  4.4 273  26.6 54.7 278
 1.7 27.4 278  49.6  1.1 275  62.8 89.3 301  28.0 49.3 279
51.2 75.1 293  59.3 93.5 304  83.6 90.5 304  79.4 87.0 302
78.0 28.3 281  16.8 19.1 272   9.1 81.2 292  23.7 55.8 277
75.5 21.3 279  64.4 43.3 279  38.9 98.9 303  22.5 87.9 293
96.7 37.9 285  92.3 93.9 308  16.9 25.4 273  15.2 61.5 283
73.8 94.0 306  57.4 97.2 305  73.2  4.9 276  39.2 82.3 294
95.7 99.4 315  66.0 98.4 306  95.3 26.9 283  45.4 75.3 291
64.8 15.4 276  69.8 55.4 284  36.3 74.9 290   9.9 22.2 276
65.8 13.9 276  13.0 82.0 293  95.6 77.2 301  32.5 55.6 279
45.8 35.5 275  62.2  6.6 274  25.2 51.2 279  92.4  8.1 277
40.5 35.3 273   9.9  3.9 271  43.5 44.0 278  68.6 61.3 287
64.2 77.5 296  57.6 81.6 294  69.5 64.7 291  64.3 95.1 304
 2.8 62.4 283  33.2 83.3 294  10.7 71.0 285  24.3 88.2 294
94.5 32.2 283  21.0 67.6 286  20.1 71.6 286  85.2 71.3 296
94.8 30.7 283  53.4 92.0 301  81.0 50.0 287  54.6 29.9 277
71.1 90.1 303  15.2  2.9 271  83.6 17.8 278  76.0 21.8 279
55.6 37.4 275  86.7 83.7 303  43.6 83.6 295  44.2 31.7 274
90.0 83.3 300   6.2  0.5 270  42.2 87.7 298  31.7  4.3 273
91.4 41.2 285  78.0 50.6 286  27.1 56.1 278  72.6 63.9 291
29.3 49.9 281  49.0 36.9 275  13.9 53.5 280  93.1 83.2 300
73.0 61.6 289  63.1 27.5 280  38.3 72.5 287  72.7 34.2 277
 6.9 32.3 274  17.1 58.6 280  19.6 94.6 297   2.7 36.5 276
34.5  5.5 275  98.6 95.9 313   9.1 71.1 285  88.6 55.8 287
26.8 78.5 289  64.8 66.6 292  59.7 25.7 280  47.3 70.2 288
 6.1 94.4 296  50.5 82.7 296   9.1 41.6 276  86.0 71.0 296
75.2 69.8 293  73.3 84.8 300  42.5 15.9 274  56.1 76.1 292
87.9 41.2 285  65.1  9.8 274  79.0 41.2 282  44.6 65.1 287
54.7 68.3 289  57.0 26.8 279   8.7 12.3 270  33.7 61.9 286
25.0 55.8 278  69.3 94.9 306  49.2 64.6 287  78.2 93.7 307
47.9 26.6 277  96.9 51.4 292  39.6 73.4 287  37.9 66.1 285
94.5 71.4 296  51.6 18.3 276  37.6 73.2 287  68.5 10.7 274
46.7  9.6 273  87.4 38.9 282  45.6 43.9 277  70.7 76.9 296
82.8 53.6 287  82.5 55.4 286  37.8  5.1 275  89.8 96.1 309
63.9  4.9 276   2.0 11.7 270  31.3 59.2 282  93.9 65.3 296
47.9 93.0 301  29.9 36.0 274  14.6 28.3 274  17.5 70.1 286
 2.6 68.5 282  23.1 12.0 268  36.8 20.4 273  80.9  9.0 276
39.2  0.0 274  26.2 44.3 276  81.9 12.9 277   3.2 21.4 272
76.9 76.7 297  88.6  7.7 277   9.7  8.4 273  26.7 91.5 296
73.8  6.1 276  33.7 39.3 276  64.0 58.4 286   5.7 91.2 295
85.8 93.8 307  85.8 39.1 281  93.9 63.4 295  53.1 46.3 278   
51.9 42.9 277  16.8 75.7 288  29.2 66.9 285  37.4 72.5 287
;
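
Before you continue, you might want to confirm that the data were read as intended. The following optional step is a minimal sketch that is not part of the original example. It uses PROC MEANS to report the number of observations and the range of each variable, so that you can check for 300 observations and for coordinates that fall within the 100 km by 100 km area:

```sas
/* Optional sanity check (illustrative addition, not in the original example): */
/* report the observation count and the range of each variable.               */
proc means data=ozoneSet n min max mean;
   var East North Ozone;
run;
```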

The initial step is to explore the data set by inspecting the spatial distribution of the data. Run PROC VARIOGRAM and specify the NOVARIOGRAM option in the COMPUTE statement, as follows:

ods graphics on;
proc variogram data=ozoneSet; 
   compute novariogram nhc=35;
   coord xc=East yc=North;
   var Ozone;
run;

The result is a scatter plot of the observed data shown in Output 102.2.1. The scatter plot suggests an almost uniform spread of the measurements throughout the prediction area. No direct inference can be made about the existence of a surface trend in the data. However, the apparent stratification of ozone values in the northeast–southwest direction might indicate a nonrandom trend.

Output 102.2.1: Ozone Observation Data Scatter Plot

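
As an informal complement to the visual inspection, you could check how strongly the ozone values vary with each coordinate. The following step is an illustrative sketch and is not part of the original example. A large correlation with East or North would only hint at a surface trend, and it does not replace the trend analysis that follows:

```sas
/* Illustrative addition: linear association between Ozone and each      */
/* coordinate. This is a rough screening aid, not a formal test of trend. */
proc corr data=ozoneSet;
   var Ozone;
   with East North;
run;
```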


You need to define the size and number of the distance classes by specifying suitable values for the LAGDISTANCE= and MAXLAGS= options, respectively. Compared to the smaller sample of thickness data used in Getting Started: VARIOGRAM Procedure, the larger ozoneSet data set produces more densely populated distance classes for the same value of the NHCLASSES= option. After you experiment with a variety of values for the NHCLASSES= option, you can choose a relatively small value for LAGDISTANCE=. A small lag distance in turn allows a large value of MAXLAGS=, so that you obtain many sample semivariogram points within the correlation range of your data. Choosing these values requires some exploration, and you might need to return to this point from a later stage in your semivariogram analysis. For illustration purposes, you now specify NHCLASSES=35.
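
One way to carry out this kind of experimentation is to rerun the NOVARIOGRAM step for several candidate values of NHCLASSES= and compare the resulting tables and histograms. The following macro is a minimal sketch under stated assumptions: the macro name tryClasses and the candidate values 20, 35, and 50 are hypothetical choices made only for illustration, and ODS Graphics is assumed to be enabled as in the preceding step.

```sas
/* Hypothetical helper (not part of the original example): rerun the   */
/* pairwise distance exploration for each candidate NHCLASSES= value.  */
%macro tryClasses(values=20 35 50);
   %local i nhc;
   %do i = 1 %to %sysfunc(countw(&values, %str( )));
      %let nhc = %scan(&values, &i, %str( ));
      title2 "Pairwise distance classes for NHCLASSES=&nhc";
      proc variogram data=ozoneSet;
         compute novariogram nhc=&nhc;
         coord xc=East yc=North;
         var Ozone;
      run;
   %end;
   title2;   /* clear the secondary title */
%mend tryClasses;

%tryClasses(values=20 35 50)
```

Each run produces its own pairwise distance intervals table, so you can see how the class width and the pair counts change as the number of classes grows.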

Your choice of NHCLASSES=35 yields the pairwise distance intervals table in Output 102.2.2 and the corresponding histogram in Output 102.2.3.

Output 102.2.2: Pairwise Distance Intervals Table

Pairwise Distance Intervals

Lag Class  Lower Bound  Upper Bound  Number of Pairs  Percentage of Pairs
        0         0.00         2.01               52                0.12%
        1         2.01         6.03              420                0.94%
        2         6.03        10.06              815                1.82%
        3        10.06        14.08             1143                2.55%
        4        14.08        18.10             1518                3.38%
        5        18.10        22.12             1680                3.75%
        6        22.12        26.15             1931                4.31%
        7        26.15        30.17             2135                4.76%
        8        30.17        34.19             2285                5.09%
        9        34.19        38.21             2408                5.37%
       10        38.21        42.24             2551                5.69%
       11        42.24        46.26             2444                5.45%
       12        46.26        50.28             2535                5.65%
       13        50.28        54.30             2487                5.55%
       14        54.30        58.33             2460                5.48%
       15        58.33        62.35             2391                5.33%
       16        62.35        66.37             2302                5.13%
       17        66.37        70.39             2285                5.09%
       18        70.39        74.41             2079                4.64%
       19        74.41        78.44             1786                3.98%
       20        78.44        82.46             1640                3.66%
       21        82.46        86.48             1493                3.33%
       22        86.48        90.50             1243                2.77%
       23        90.50        94.53              925                2.06%
       24        94.53        98.55              710                1.58%
       25        98.55       102.57              421                0.94%
       26       102.57       106.59              274                0.61%
       27       106.59       110.62              200                0.45%
       28       110.62       114.64              120                0.27%
       29       114.64       118.66               55                0.12%
       30       118.66       122.68               35                0.08%
       31       122.68       126.71               14                0.03%
       32       126.71       130.73               11                0.02%
       33       130.73       134.75                2                0.00%
       34       134.75       138.77                0                0.00%
       35       138.77       142.80                0                0.00%
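
The entries in the pairwise distance intervals table can be cross-checked with a short calculation. The following worked arithmetic is an illustrative addition that uses only the 300 observations and the tabulated counts. The number of distinct pairs among $N = 300$ locations is

$$
N_{\text{pairs}} = \frac{N(N-1)}{2} = \frac{300 \times 299}{2} = 44{,}850,
$$

which equals the sum of the Number of Pairs column. The percentage for any class follows from its count. For lag class 10, for example,

$$
\frac{2551}{44{,}850} \approx 0.0569 = 5.69\%.
$$

The tabulated bounds are also consistent with classes of a common width $\delta \approx 4.02$ km, where class $k \geq 1$ spans roughly $(k - \tfrac{1}{2})\,\delta$ to $(k + \tfrac{1}{2})\,\delta$ and class 0 covers only the first half width, from 0 to about 2.01 km.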


Notice the overall high pair count in the majority of classes in Output 102.2.2. You can see that even for higher values of NHCLASSES= the classes are still sufficiently populated for your semivariogram analysis according to the rule of thumb stated in the section Choosing the Size of Classes. Based on the displayed information in Output 102.2.3, you specify LAGDISTANCE=4 km. You can further experiment with smaller lag sizes to obtain more points in your sample semivariogram.
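
If you prefer to inspect the class counts programmatically rather than by eye, you could save the pairwise distance summary to a data set and list the sparsely populated classes. The following is a minimal sketch under stated assumptions: the output data set name ozoneDist is hypothetical, the variable name COUNT in the OUTDISTANCE= data set and the threshold of 30 pairs are assumptions that you should verify (for example, with PROC CONTENTS and against the section Choosing the Size of Classes) before you rely on them.

```sas
/* Illustrative addition: save the pairwise distance class summary. */
proc variogram data=ozoneSet outdistance=ozoneDist;
   compute novariogram nhc=35;
   coord xc=East yc=North;
   var Ozone;
run;

/* Assumption: the pair count is stored in a variable named COUNT,   */
/* and 30 pairs is used here only as an adjustable example threshold. */
%let minPairs = 30;

proc print data=ozoneDist;
   where count < &minPairs;
run;
```

Classes listed by the second step are candidates to exclude from the analysis or to merge by choosing a coarser class size.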

You can focus on the MAXLAGS= specification at a later point. The important step now is to investigate the presence of trends in the measurements. The following section suggests how to remove surface trends from your data and then continues the semivariogram analysis with the detrended data.

Output 102.2.3: Distribution of Pairwise Distances for Ozone Observation Data
