STRATA variables < / options > ;
The STRATA statement names variables that partition the input data set into nonoverlapping subgroups (strata). The combinations of levels of STRATA variables define the strata. PROC SURVEYSELECT then selects independent samples from these strata, according to the selection method and design parameters that you specify in the PROC SURVEYSELECT statement. For information about the use of stratification in sample design, see Lohr (2010), Kalton (1983), Kish (1965, 1987), and Cochran (1977).
The STRATA variables are one or more variables in the DATA= input data set. These variables can be either character or numeric, but the procedure treats them as categorical variables. The formatted values of the STRATA variables determine the STRATA variable levels. Thus, you can use formats to group values into levels. See the FORMAT procedure in the Base SAS Procedures Guide and the FORMAT statement and SAS formats in SAS Formats and Informats: Reference.
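As an illustration of grouping values into levels with a format, the following sketch defines three age groups and stratifies on them. The data set `Customers`, the variable `Age`, the format name `agegrp`, and the output name `AgeSample` are all hypothetical, and the example assumes that every age group contains at least 10 units.

```sas
/* Hypothetical format that groups a numeric Age variable into three levels */
proc format;
   value agegrp
      low -< 30  = 'Under 30'
      30  -< 60  = '30 to 59'
      60  - high = '60 and over';
run;

/* Sort by the raw values so that each formatted level is contiguous */
proc sort data=Customers;
   by Age;
run;

/* The formatted values of Age, not the raw values, define the strata */
proc surveyselect data=Customers method=srs sampsize=10
                  seed=1234 out=AgeSample;
   strata Age;
   format Age agegrp.;
run;
```

Without the FORMAT statement, each distinct value of `Age` would form its own stratum.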
The STRATA variables function much like BY variables, and PROC SURVEYSELECT expects the input data set to be sorted in order of the STRATA variables.
If you specify a CONTROL statement, or if you specify METHOD=PPS, the input data set must be sorted in ascending order by the STRATA variables. This means you cannot use the STRATA option NOTSORTED or DESCENDING when you specify a CONTROL statement or METHOD=PPS.
If your input data set is not sorted by the STRATA variables in ascending order, use one of the following alternatives:
- Sort the data by using the SORT procedure with the STRATA variables in a BY statement.
- Specify the NOTSORTED or DESCENDING option in the STRATA statement (when you do not specify a CONTROL statement or METHOD=PPS). The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the STRATA variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
- Create an index on the STRATA variables by using the DATASETS procedure (in Base SAS software).
For more information about BY-group processing, see the discussion in SAS Language Reference: Concepts. For more information about the DATASETS procedure, see the discussion in the Base SAS Procedures Guide.
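The sorting and indexing alternatives might look like the following sketch. The data set `Customers`, the STRATA variables `State` and `Type`, and the index name `StateType` are hypothetical; you would use one approach or the other, not both.

```sas
/* Alternative: sort the input data by the STRATA variables */
proc sort data=Customers;
   by State Type;
run;

/* Alternative: create a composite index on the STRATA variables */
proc datasets library=work nolist;
   modify Customers;
   index create StateType=(State Type);
quit;

/* Either way, the same variables are then named in the STRATA statement */
proc surveyselect data=Customers method=srs sampsize=5
                  seed=1234 out=Sample;
   strata State Type;
run;
```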
The STRATA options request allocation of the total sample size among the strata. You can use the ALLOC= option to specify the allocation method. Available allocation methods include proportional allocation (ALLOC=PROP), optimal allocation (ALLOC=OPTIMAL), and Neyman allocation (ALLOC=NEYMAN). See the section Sample Size Allocation for details about these methods.
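For orientation, the textbook forms of proportional and Neyman allocation are sketched below. The symbols $N_h$ (number of sampling units in stratum $h$), $N$ (total number of units), and $S_h$ (stratum standard deviation) are notation assumed here for illustration; the procedure's exact computations, including rounding, are described in the section Sample Size Allocation.

$$
n_h^{\text{prop}} = n \, \frac{N_h}{N},
\qquad
n_h^{\text{Neyman}} = n \, \frac{N_h S_h}{\sum_{i} N_i S_i}
$$

For example, with $N = 1000$ units in three strata of sizes 500, 300, and 200, a total sample size of $n = 100$ under proportional allocation gives stratum sample sizes of 50, 30, and 20.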
Instead of requesting that PROC SURVEYSELECT compute the sample allocation, you can provide the allocation proportions by using the ALLOC=(values) option or the ALLOC=SAS-data-set option. Then PROC SURVEYSELECT allocates the total sample size among the strata according to the proportions that you provide. Allocation proportions are relative stratum sample sizes, $n_h / n$, where $n_h$ is the stratum $h$ sample size and $n$ is the total sample size.
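A minimal sketch of supplying the proportions directly follows. It assumes a hypothetical data set `Customers` that is sorted by a hypothetical variable `Region` with exactly three levels; the proportions are chosen to sum to 1 and are assumed to be listed in the order in which the strata appear in the input data set.

```sas
/* Allocate a total sample of 200 as 50%, 30%, and 20% across three strata */
proc surveyselect data=Customers method=srs sampsize=200
                  seed=1234 out=Sample;
   strata Region / alloc=(0.5 0.3 0.2);
run;
```

With these values, the stratum sample sizes would be 100, 60, and 40, provided that each stratum contains at least that many units.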
You can use the SAMPSIZE= option in the PROC SURVEYSELECT statement to specify the total sample size to be allocated among the strata. Alternatively, you can specify the desired margin of error in the MARGIN= option, and the procedure determines the stratum sample sizes that are required to achieve that margin. See the section Specifying the Margin of Error for details.
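For instance, the following sketch requests proportional allocation of a total sample size of 1,000. The data set `Customers` and the variables `State` and `Type` are hypothetical, and the data are assumed to be sorted by the STRATA variables.

```sas
/* SAMPSIZE= gives the total sample size; ALLOC=PROP divides it among strata */
proc surveyselect data=Customers method=srs sampsize=1000
                  seed=1234 out=Sample;
   strata State Type / alloc=prop;
run;
```

Without the ALLOC= option, SAMPSIZE=1000 would instead request 1,000 units from every stratum.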
When you request sample allocation, by default PROC SURVEYSELECT computes the allocation of the total sample size among the strata and then selects the sample. If you specify the NOSAMPLE option, the procedure computes the allocation but does not select the sample. In this case the OUT= output data set contains the stratum sample sizes that are computed according to the specified allocation method. See the section Allocation Output Data Set for details.
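To inspect the allocation before drawing a sample, a sketch along the following lines could be used. The data set `Customers`, the variable `State`, and the output name `Allocation` are hypothetical.

```sas
/* Compute the proportional allocation only; no sample is selected */
proc surveyselect data=Customers method=srs sampsize=1000
                  out=Allocation;
   strata State / alloc=prop nosample;
run;

/* The OUT= data set holds the computed stratum sample sizes */
proc print data=Allocation;
run;
```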
You can use the ALLOC= option with any selection method except METHOD=PPS_BREWER and METHOD=PPS_MURTHY, which select two units from each stratum.
Table 99.2 summarizes the options available in the STRATA statement. Descriptions of the options follow in alphabetical order.
Table 99.2: STRATA Statement Options for Sample Allocation

| Option | Description |
|---|---|
| ALLOC= | Specifies the allocation method |
| ALLOC= | Provides allocation proportions |
| ALLOCMIN= | Specifies the minimum sample size per stratum |
| ALPHA= | Specifies the confidence level |
| COST= | Provides stratum costs |
| MARGIN= | Specifies the margin of error |
| NOSAMPLE | Allocates but does not select the sample |
| STATS | Displays additional allocation statistics |
| VAR= | Provides stratum variances |
You can specify the following options in the STRATA statement after a slash (/):