The SPP Procedure

Getting Started: SPP Procedure

This example uses forestry data, which are shown in Figure 93.4, to demonstrate how you can use PROC SPP to fit a model for the first-order intensity of a spatial point pattern. The Sashelp.BEI data set contains the locations of 3,604 trees in a tropical rain forest. A study window of 1,000 $\times$ 500 meters is appropriate. The data set also contains covariates that are represented by the variables Gradient and Elevation, which are recorded at 20,301 locations on a regular grid across the study region. The variable Trees distinguishes the event observations from the covariate observations in the data set. These data are part of a much larger data set, which contains the positions of hundreds of thousands of trees that belong to thousands of species (Condit, 1998; Hubbell and Foster, 1983; Condit, Hubbell, and Foster, 1996).[39]

The Sashelp.BEI data set contains five variables:

  • X and Y: the x- and y-coordinates of the tree locations and of the locations where the elevation and gradient of the study area are measured

  • Trees: a 0/1 variable that indicates which observations correspond to tree locations: 1 indicates the presence of a tree, and 0 indicates absence

  • Elevation: the height of the study area above sea level at the location

  • Gradient: the slope of the study area at the location

The following statements produce a plot of the event observations (which is shown in Figure 93.4) and plots of the covariates (which are shown in Figure 93.5 and Figure 93.6).

ods graphics on;
proc spp data=sashelp.bei plots(equate)=(trends observations);
   process trees = (x, y /area=(0,0,1000,500) Event=Trees);
   trend grad = field(x,y, gradient);
   trend elev = field(x,y, elevation);
run;

In addition, the preceding statements produce three tables, which are shown in Figure 93.1, Figure 93.2, and Figure 93.3. The number of observations in the combined data set is shown in Figure 93.1; it includes both the number of event observations and the number of covariate observations.

Figure 93.1: Number of Events and Number of Covariate Observations

The SPP Procedure

Observations Read 23905
Observations Used 23905
Event Observations Read 3604
Event Observations Used 3604
Gradient Observations Read 20301
Gradient Observations Used 20301
Elevation Observations Read 20301
Elevation Observations Used 20301



Figure 93.2 provides some summary information about the point pattern, including the average intensity or the number of events per unit area.

Figure 93.2: Exploratory Information about the Point Pattern

Summary of Point Pattern
Data Type Point Pattern
Pattern Name trees
Region Type User Defined Window
Region X Range [0,1000] Units
Region Y Range [0,500] Units
Region X Size 1000 Units
Region Y Size 500 Units
Region Area 500000 Square Units
Observations in Window 3604
Average Intensity 0.007208
Grid Nodes in X 50
Grid Nodes in Y 50
Grid Nodes in Window 2500
Quadrat Dimension in X 10
Quadrat Dimension in Y 10
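
As a quick check on Figure 93.2, the average intensity is the number of events in the window divided by the window area. The symbols $n$ and $|W|$ are notation introduced here for illustration and do not appear in the procedure output:

```latex
\bar{\lambda} = \frac{n}{|W|} = \frac{3604}{1000 \times 500} = \frac{3604}{500000} = 0.007208
```

In other words, the pattern averages roughly 7.2 trees per 1,000 square units of the study window.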



Figure 93.3 provides the results of a default $10 \times 10$ quadrat-based Pearson chi-square test for CSR.

Figure 93.3: Pearson Chi-Square Test for CSR

Pearson Chi-Square Test for CSR

Expected Frequency   DF   Dispersion Index   Chi-Square   Pr > ChiSq
36.04                99   33.222             3288.95      <.0001
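
The values in Figure 93.3 can be cross-checked by hand. This sketch assumes that the expected frequency is the event count divided by the number of quadrats and that the dispersion index is the chi-square statistic divided by its degrees of freedom; the reported values are consistent with both assumptions. The symbols $n$, $m$, and $E$ are notation introduced here:

```latex
\begin{aligned}
E &= \frac{n}{m} = \frac{3604}{10 \times 10} = 36.04, \\
\mathrm{DF} &= m - 1 = 100 - 1 = 99, \\
\frac{\chi^2}{\mathrm{DF}} &= \frac{3288.95}{99} \approx 33.22 .
\end{aligned}
```

Under complete spatial randomness, quadrat counts would have a variance close to their mean, which corresponds to a dispersion index near 1. An index of about 33, together with the small p-value, suggests that the tree locations are far from CSR.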



Figure 93.4: Spatial Point Pattern of Tropical Rain Forest Trees


Figure 93.5: Spatial Covariate Gradient



Figure 93.6: Spatial Covariate Elevation



The variables Gradient and Elevation both represent continuous functions over the study area, because any arbitrary point that is chosen in the study area has a value for both of these variables. In practice, however, these variables are sampled only at selected points where measuring them is easy. In spatial analysis and geographic information systems (GIS), such variables are termed field variables and are associated with a spatial trend. You can include such variables in the SPP procedure by using the TREND statement.

The Sashelp.BEI data set combines the information for the point pattern and the spatial covariates. However, the SPP procedure requires you to identify which observations are events of the point pattern. You do this by using the EVENT= option in the PROCESS statement to specify that the variable Trees identifies the events.

It is natural to suppose that tree growth is affected by the gradient and elevation of the surrounding land. Hence, you can use the gradient and elevation in a parametric model of the intensity of tree growth in the study area. Such a model is an inhomogeneous Poisson process (Baddeley, 2010, p. 354), whose first-order intensity, $\lambda(s)$, is log-linear in the covariates. You can use the MODEL statement to compose models for a point pattern's intensity. On the left side of the MODEL statement, you specify the response pattern, which is a process that you define in a PROCESS statement before the MODEL statement. On the right side, you specify any covariates that are likely to influence the target point pattern.
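
Written out for the two covariates in this example, a log-linear first-order intensity has the following form. The coefficient symbols $\beta_0$, $\beta_1$, $\beta_2$ and the covariate functions $\mathrm{Elev}(s)$ and $\mathrm{Grad}(s)$ are notation assumed here for illustration:

```latex
\log \lambda(s) = \beta_0 + \beta_1\,\mathrm{Elev}(s) + \beta_2\,\mathrm{Grad}(s)
\quad\Longleftrightarrow\quad
\lambda(s) = \exp\{\beta_0 + \beta_1\,\mathrm{Elev}(s) + \beta_2\,\mathrm{Grad}(s)\}
```

Because the covariates enter through the exponent, the intensity stays positive at every location $s$, and each coefficient acts multiplicatively on the intensity.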

To obtain a plot of the model-based intensity estimate, you specify the PLOTS=INTENSITY option. To request residual diagnostics, you specify the PLOTS=RESIDUAL option. To specify the response grid on which the intensity estimates are computed, you use the GRID option in the MODEL statement. The following statements explore the influence of the covariates Elevation and Gradient on the intensity of tree presence:

proc spp data=sashelp.bei plots(equate)=(residual intensity);
   process trees = (x,y /area=(0,0,1000,500) event=Trees);
   trend elev = field(x,y,elevation);
   trend grad = field(x,y,gradient);
   model trees = elev grad / grid(64,64) residual(B=70) ;
run;

In addition to the tables shown in previous figures, these statements produce a table that contains the parameter estimates (Figure 93.7) and a fit summary table (Figure 93.8). The parameter estimates give the intercept and the coefficients of the model terms. The relative values of the parameter estimates indicate how much each term contributes to the model, although the covariates are measured on different scales, so you should read the estimates alongside their standard errors and t values. In this case, Gradient is much more important in modeling where trees grow than Elevation, although both are significant.

Figure 93.7: Parameter Estimates Table

The SPP Procedure

Parameter Estimates

Parameter   Estimate   Standard Error   DF   t Value   Approx Pr > |t|
Intercept   -8.5672    0.3415           2    -25.09    0.0016
Elevation   0.02146    0.002290         2    9.37      0.0112
Gradient    5.8616     0.2567           2    22.83     0.0019
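
Substituting the estimates from Figure 93.7 into a log-linear intensity gives the fitted model below. The notation $\hat{\lambda}(s)$, $\mathrm{Elevation}(s)$, and $\mathrm{Gradient}(s)$ is introduced here for illustration. The second line assumes that each t value is the estimate divided by its standard error, which agrees with the table:

```latex
\begin{aligned}
\hat{\lambda}(s) &= \exp\{-8.5672 + 0.02146\,\mathrm{Elevation}(s) + 5.8616\,\mathrm{Gradient}(s)\}, \\
t_{\mathrm{Elevation}} &= \frac{0.02146}{0.002290} \approx 9.37, \qquad
t_{\mathrm{Gradient}} = \frac{5.8616}{0.2567} \approx 22.83 .
\end{aligned}
```

For example, with Gradient held fixed, each one-unit increase in Elevation multiplies the fitted intensity by $\exp(0.02146) \approx 1.022$.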



The fit summary table in Figure 93.8 shows the model fit statistics. You can use these values to compare multiple fits from different models and to select an optimal model in your study.

Figure 93.8: Fit Summary Table

Fit Statistics
Criterion Value
-2 Log Likelihood 42290.0
AIC (smaller is better) 42296.0
BIC (smaller is better) 42316.8
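
The AIC value in Figure 93.8 can be reproduced from the log likelihood. This sketch assumes the usual definition of AIC and takes $p = 3$ as the number of estimated parameters (Intercept, Elevation, and Gradient); the symbols $L$ and $p$ are notation introduced here:

```latex
\mathrm{AIC} = -2 \log L + 2p = 42290.0 + 2 \times 3 = 42296.0
```

The BIC applies a larger penalty to the same three parameters: it exceeds the -2 log likelihood by $42316.8 - 42290.0 = 26.8$.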



The corresponding fitted intensity is shown in Figure 93.9.

Figure 93.9: Intensity Estimates of Tree Presence in Study Area


The resulting residual diagnostics are shown in Figure 93.10.

Figure 93.10: Residual Diagnostics for Fitted Log-Intensity Model



The residual diagnostics plot in Figure 93.10 provides an informal assessment of the fitted parametric model. In particular, the smoothed residual plot in the lower right corner reveals a trend in the residuals that is not accounted for by the model. In addition, the lurking variable plots with respect to the coordinate variables show deviations well beyond the $2\sigma$ limits, indicating that the model does not account for variation in intensity with respect to these variables.



[39] This data set is used with kind permission from Professor S. Hubbell, with acknowledgment of the support of the Center for Tropical Forest Science of the Smithsonian Tropical Research Institute and the primary granting agencies that have supported the BCI plot. The BCI forest dynamics research project was made possible by National Science Foundation grants to Stephen P. Hubbell: DEB-0640386, DEB-0425651, DEB-0346488, DEB-0129874, DEB-00753102, DEB-9909347, DEB-9615226, DEB-9615226, DEB-9405933, DEB-9221033, DEB-9100058, DEB-8906869, DEB-8605042, DEB-8206992, DEB-7922197, support from the Center for Tropical Forest Science, the Smithsonian Tropical Research Institute, the John D. and Catherine T. MacArthur Foundation, the Mellon Foundation, the Small World Institute Fund, and numerous private individuals, and through the hard work of over 100 people from 10 countries over the past two decades. The plot project is part of the Center for Tropical Forest Science, a global network of large-scale demographic tree plots.