The next sample design for the customer satisfaction survey uses stratification by State and control sorting by Type and Usage within State. After stratification and control sorting, customers are selected by systematic random sampling within strata. Selection by systematic sampling, together with control sorting before selection, spreads the sample uniformly over the range of type and usage values within each stratum (state). The following PROC SURVEYSELECT statements select a probability sample of customers from the Customers data set according to this design:
    title1 'Customer Satisfaction Survey';
    title2 'Stratified Sampling with Control Sorting';
    proc surveyselect data=Customers method=sys rate=.02
                      seed=1234 out=SampleControl;
       strata State;
       control Type Usage;
    run;
The STRATA statement names the stratification variable State. The CONTROL statement names the control variables Type and Usage. In the PROC SURVEYSELECT statement, the METHOD=SYS option requests systematic random sampling. The RATE= option specifies a sampling rate of 2% for each stratum. The SEED= option specifies the initial seed for random number generation.
Figure 99.7 displays the output from PROC SURVEYSELECT, which summarizes the sample selection. A sample of 270 customers is selected by using systematic random sampling within strata determined by State. The sampling frame Customers is sorted by control variables Type and Usage within strata. The type of sorting is serpentine, which is the default when SORT=NEST is not specified. See the section Sorting by CONTROL Variables for a description of serpentine sorting. The sorted data set replaces the input data set. (To leave the input data set unsorted and store the sorted input data in another data set, use the OUTSORT= option.) The output data set SampleControl contains the sample of customers.
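
As a minimal sketch of the OUTSORT= option mentioned above, the following statements repeat the same selection but write the sorted frame to a separate data set, so that Customers keeps its original order. The data set name CustomersSorted is a hypothetical name chosen for illustration, and the sketch assumes that Customers is already in ascending order of State, as it is in the preceding example.

```sas
/* Same design as before; OUTSORT= stores the control-sorted frame   */
/* in a separate data set instead of replacing the input data set.   */
/* CustomersSorted is a hypothetical name used only in this sketch.  */
proc surveyselect data=Customers method=sys rate=.02
                  seed=1234 out=SampleControl
                  outsort=CustomersSorted;
   strata State;
   control Type Usage;
run;
```

Adding SORT=NEST to the PROC SURVEYSELECT statement would request nested sorting of the control variables in place of the default serpentine sorting.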
Figure 99.7: Sample Selection Summary
Customer Satisfaction Survey
Stratified Sampling with Control Sorting

| Selection Method | Systematic Random Sampling |
|---|---|
| Strata Variable | State |
| Control Variables | Type, Usage |
| Control Sorting | Serpentine |

| Input Data Set | CUSTOMERS |
|---|---|
| Random Number Seed | 1234 |
| Stratum Sampling Rate | 0.02 |
| Number of Strata | 4 |
| Total Sample Size | 270 |
| Output Data Set | SAMPLECONTROL |
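
One way to check the summary against the selected sample is to tabulate the output data set. The following PROC FREQ step is a sketch, not part of the original example. It assumes that SampleControl carries the State and Type variables from the input data set, which is the default behavior when no variables are excluded from the OUT= data set.

```sas
/* Sketch: tabulate the selected sample.                             */
/* The State counts should add up to the Total Sample Size that is   */
/* reported in the summary; the State-by-Type table shows how the    */
/* sample is spread across customer types within each stratum.       */
proc freq data=SampleControl;
   tables State / nocum;
   tables State*Type / norow nocol nopercent;
run;
```

Because the frame was control sorted by Type within State before systematic selection, the Type distribution within each state in the sample should be roughly proportional to the distribution in the frame.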