The TABLES statement requests one-way to n-way frequency and crosstabulation tables and statistics for those tables.
If you omit the TABLES statement, PROC FREQ generates one-way frequency tables for all data set variables that are not listed in the other statements.
The following argument is required in the TABLES statement.
requests
specify the frequency and crosstabulation tables to produce. A request is composed of one variable name or several variable names separated by asterisks. To request a one-way frequency table, use a single variable. To request a two-way crosstabulation table, use an asterisk between two variables. To request a multiway table (an n-way table, where n>2), separate the desired variables with asterisks. The unique values of these variables form the rows, columns, and strata of the table. You can include up to 50 variables in a single multiway table request.
For two-way to multiway tables, the values of the last variable form the crosstabulation table columns, while the values of the next-to-last variable form the rows. Each level (or combination of levels) of the other variables forms one stratum. PROC FREQ produces a separate crosstabulation table for each stratum. For example, a specification of A*B*C*D in a TABLES statement produces k tables, where k is the number of different combinations of values for A and B. Each table lists the values for C down the side and the values for D across the top.
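As a rough illustration of how strata arise from a multiway request, the following Python sketch counts the distinct combinations of the stratification variables (every variable except the last two). It is not SAS code and does not reproduce PROC FREQ internals; the function name `count_strata` and the sample records are hypothetical, and the sketch assumes that only combinations observed in the data count as strata.

```python
def count_strata(rows, request):
    """Count strata for a multiway request such as 'A*B*C*D'.

    The last variable supplies the columns and the next-to-last the rows,
    so every earlier variable is treated as a stratification variable.
    """
    variables = request.split("*")
    strata_vars = variables[:-2]
    # Each distinct combination of stratification values is one stratum.
    return len({tuple(row[v] for v in strata_vars) for row in rows})


# Hypothetical records, for illustration only.
rows = [
    {"A": 1, "B": "x", "C": "lo", "D": "yes"},
    {"A": 1, "B": "x", "C": "hi", "D": "no"},
    {"A": 1, "B": "y", "C": "lo", "D": "no"},
    {"A": 2, "B": "x", "C": "hi", "D": "yes"},
]
k = count_strata(rows, "A*B*C*D")
print(k)  # three observed (A, B) combinations, so three C-by-D tables
```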
You can use multiple TABLES statements in the PROC FREQ step. PROC FREQ builds all the table requests in one pass of the data, so that there is essentially no loss of efficiency. You can also specify any number of table requests in a single TABLES statement. To specify multiple table requests quickly, use a grouping syntax by placing parentheses around several variables and joining other variables or variable combinations. For example, the statements shown in Table 40.8 illustrate grouping syntax.
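Table 40.8: Grouping Syntax
TABLES Request | Equivalent to |
---|---|
tables A*(B C); | tables A*B A*C; |
tables (A B)*(C D); | tables A*C B*C A*D B*D; |
tables (A B C)*D; | tables A*D B*D C*D; |
tables A -- C; | tables A B C; |
tables (A -- C)*D; | tables A*D B*D C*D; |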
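The grouping syntax can be read as a Cartesian product over the parenthesized groups. The Python sketch below expands one grouped request into simple requests under that reading. It is an illustration only: `expand_request` is a hypothetical helper, it does not handle variable ranges, and the order in which it lists the expanded requests is a presentation choice rather than a statement about PROC FREQ.

```python
import itertools
import re


def expand_request(request):
    """Expand one grouped request, e.g. '(A B)*(C D)', into simple requests."""
    # Each factor is either a parenthesized group of names or a single name.
    factors = []
    for group, name in re.findall(r"\(([^)]*)\)|([^\s*()]+)", request):
        factors.append(group.split() if group else [name])
    # Iterate over reversed factors so that the leftmost factor varies fastest.
    combos = itertools.product(*reversed(factors))
    return ["*".join(reversed(combo)) for combo in combos]


print(expand_request("A*(B C)"))      # ['A*B', 'A*C']
print(expand_request("(A B)*(C D)"))  # ['A*C', 'B*C', 'A*D', 'B*D']
```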
The TABLES statement variables are one or more variables from the DATA= input data set. These variables can be either character or numeric, but the procedure treats them as categorical variables. PROC FREQ uses the formatted values of the TABLES variable to determine the categorical variable levels. So if you assign a format to a variable with a FORMAT statement, PROC FREQ formats the values before dividing observations into the levels of a frequency or crosstabulation table. See the FORMAT procedure in the Base SAS Procedures Guide and the FORMAT statement and SAS formats in SAS Formats and Informats: Reference.
If you use PROC FORMAT to create a user-written format that combines missing and nonmissing values into one category, PROC FREQ treats the entire category of formatted values as missing. See the discussion in the section Grouping with Formats for more information.
By default, the frequency or crosstabulation table lists the values of both character and numeric variables in ascending order based on internal (unformatted) variable values. You can change the order of the values in the table by specifying the ORDER= option in the PROC FREQ statement. To list the values in ascending order by formatted value, use ORDER=FORMATTED.
If you request a one-way frequency table for a variable without specifying options, PROC FREQ produces frequencies, cumulative frequencies, percentages of the total frequency, and cumulative percentages for each value of the variable. If you request a two-way or an n-way crosstabulation table without specifying options, PROC FREQ produces crosstabulation tables that include cell frequencies, cell percentages of the total frequency, cell percentages of row frequencies, and cell percentages of column frequencies. The procedure excludes observations with missing values from the table but displays the total frequency of missing observations below each table.
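The default one-way statistics described above are simple running totals. The following Python sketch computes them for a small list, treating `None` as a missing value that is excluded from the table and counted separately. The function name and data are hypothetical, and the sketch orders levels by their raw values only.

```python
from collections import Counter


def one_way_table(values):
    """Return (rows, n_missing).

    Each row is (level, frequency, percent, cumulative frequency,
    cumulative percent); missing values (None) are excluded from the rows.
    """
    nonmissing = [v for v in values if v is not None]
    counts = Counter(nonmissing)
    total = len(nonmissing)
    rows, cum = [], 0
    for level in sorted(counts):
        cum += counts[level]
        rows.append((level, counts[level], 100.0 * counts[level] / total,
                     cum, 100.0 * cum / total))
    return rows, len(values) - total


rows, n_missing = one_way_table(["a", "b", "a", None, "c", "a"])
for row in rows:
    print(row)
print("Frequency Missing =", n_missing)
```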
Table 40.9 lists the options available in the TABLES statement. Descriptions of the options follow in alphabetical order.
Table 40.9: TABLES Statement Options
Option | Description |
---|---|
Control Statistical Analysis | |
AGREE | Requests tests and measures of classification agreement |
ALL | Requests tests and measures of association produced by the CHISQ, MEASURES, and CMH options |
ALPHA= | Sets confidence level for confidence limits |
BINOMIAL | Requests binomial proportions, confidence limits, and tests for one-way tables |
CHISQ | Requests chi-square tests and measures based on chi-square |
CL | Requests confidence limits for MEASURES statistics |
CMH | Requests all Cochran-Mantel-Haenszel statistics |
CMH1 | Requests CMH correlation statistic, adjusted odds ratios, and adjusted relative risks |
CMH2 | Requests CMH correlation and row mean scores (ANOVA) statistics, adjusted odds ratios, and adjusted relative risks |
FISHER | Requests Fisher’s exact test for tables larger than 2×2 |
GAILSIMON | Requests Gail-Simon test for qualitative interactions |
JT | Requests Jonckheere-Terpstra test |
MEASURES | Requests measures of association |
MISSING | Treats missing values as nonmissing |
OR | Requests the odds ratio for 2×2 tables |
PLCORR | Requests polychoric correlation |
RELRISK | Requests relative risks for 2×2 tables |
RISKDIFF | Requests risks and risk differences for 2×2 tables |
SCORES= | Specifies type of row and column scores |
TREND | Requests Cochran-Armitage test for trend |
Control Additional Table Information | |
CELLCHI2 | Displays cell contributions to the Pearson chi-square statistic |
CUMCOL | Displays cumulative column percentages |
DEVIATION | Displays deviations of cell frequencies from expected values |
EXPECTED | Displays expected cell frequencies |
MISSPRINT | Displays missing value frequencies |
PEARSONRES | Displays Pearson residuals in the CROSSLIST table |
PRINTKWTS | Displays kappa coefficient weights |
SCOROUT | Displays row and column scores |
SPARSE | Includes all possible combinations of variable levels in the LIST table and OUT= data set |
STDRES | Displays standardized residuals in the CROSSLIST table |
TOTPCT | Displays percentages of total frequency for n-way tables (n>2) |
Control Displayed Output | |
CONTENTS= | Specifies contents label for crosstabulation tables |
CROSSLIST | Displays crosstabulation tables in ODS column format |
FORMAT= | Formats frequencies in crosstabulation tables |
LIST | Displays two-way to n-way tables in list format |
MAXLEVELS= | Specifies maximum number of levels to display in one-way tables |
NOCOL | Suppresses display of column percentages |
NOCUM | Suppresses display of cumulative frequencies and percentages |
NOFREQ | Suppresses display of frequencies |
NOPERCENT | Suppresses display of percentages |
NOPRINT | Suppresses display of crosstabulation tables but displays statistics |
NOROW | Suppresses display of row percentages |
NOSPARSE | Suppresses zero frequency levels in the CROSSLIST table, LIST table, and OUT= data set |
NOWARN | Suppresses log warning message for the chi-square test |
Produce Statistical Graphics | |
PLOTS= | Requests plots from ODS Graphics |
Create an Output Data Set | |
OUT= | Names an output data set to contain frequency counts |
OUTCUM | Includes cumulative frequencies and percentages in the output data set for one-way tables |
OUTEXPECT | Includes expected frequencies in the output data set |
OUTPCT | Includes row, column, and two-way table percentages in the output data set |
You can specify the following options in a TABLES statement.
AGREE
requests tests and measures of classification agreement for square tables. This option provides the simple and weighted kappa coefficients along with their standard errors and confidence limits. For multiway tables, the AGREE option also produces the overall simple and weighted kappa coefficients (along with their standard errors and confidence limits) and tests for equal kappas among strata. For 2×2 tables, this option provides McNemar’s test; for square tables that have more than two response categories (levels), this option provides Bowker’s test of symmetry. For multiway tables that have two response categories, the AGREE option also produces Cochran’s Q test. For more information, see the section Tests and Measures of Agreement.
Measures of agreement can be computed only for square tables, where the number of rows equals the number of columns. If your table is not square because of observations that have zero weights, you can specify the ZEROS option in the WEIGHT statement to include these observations in the analysis. For more information, see the section Tables with Zero Rows and Columns.
You can set the level for agreement confidence limits by specifying the ALPHA= option in the TABLES statement. The default of ALPHA=0.05 produces 95% confidence limits.
You can specify the TEST statement to request asymptotic tests for the simple and weighted kappa coefficients. You can specify the EXACT statement to request McNemar’s exact test (for 2×2 tables) and exact tests for the simple and weighted kappa coefficients. For more information, see the section Exact Statistics.
The weighted kappa coefficient is computed by using agreement weights that reflect the relative agreement between pairs of variable levels. To specify the type of agreement weights and to display the agreement weights, you can specify the following options:
PRINTKWTS
displays the agreement weights that PROC FREQ uses to compute the weighted kappa coefficient. Agreement weights reflect the relative agreement between pairs of variable levels. By default, PROC FREQ uses the Cicchetti-Allison form of agreement weights. If you specify the WT=FC option, the procedure uses the Fleiss-Cohen form of agreement weights. For more information, see the section Weighted Kappa Coefficient.
WT=FC
requests Fleiss-Cohen agreement weights in the weighted kappa computation. By default, PROC FREQ uses Cicchetti-Allison agreement weights to compute the weighted kappa coefficient. Agreement weights reflect the relative agreement between pairs of variable levels. For more information, see the section Weighted Kappa Coefficient.
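To make the difference between the two weight forms concrete, the Python sketch below tabulates agreement weights for a set of level scores. It assumes the commonly cited definitions, a linear penalty for the Cicchetti-Allison form and a squared penalty for the Fleiss-Cohen form, each scaled by the range of the scores. These formulas are not stated in this section, so treat them as an assumption and check them against the section Weighted Kappa Coefficient. The function name is hypothetical.

```python
def agreement_weights(scores, kind="CA"):
    """Return a matrix of agreement weights for ordered level scores.

    kind="CA": 1 - |ci - cj| / range        (assumed Cicchetti-Allison form)
    kind="FC": 1 - (ci - cj)**2 / range**2  (assumed Fleiss-Cohen form)
    """
    span = scores[-1] - scores[0]
    weights = []
    for ci in scores:
        row = []
        for cj in scores:
            if kind == "CA":
                row.append(1 - abs(ci - cj) / span)
            elif kind == "FC":
                row.append(1 - (ci - cj) ** 2 / span ** 2)
            else:
                raise ValueError("kind must be 'CA' or 'FC'")
        weights.append(row)
    return weights


ca = agreement_weights([1, 2, 3], "CA")
fc = agreement_weights([1, 2, 3], "FC")
print(ca)  # adjacent levels receive weight 0.5
print(fc)  # adjacent levels receive weight 0.75
```

Under these assumed forms, the Fleiss-Cohen weights give more credit to near agreement than the Cicchetti-Allison weights do, while both give full weight on the diagonal and zero weight to the most distant pair of levels.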
ALL
requests all tests and measures that are produced by the CHISQ, MEASURES, and CMH options. You can control the number of CMH statistics to compute by specifying the CMH1 or CMH2 option.
ALPHA=α
specifies the level of confidence limits. The value of α must be between 0 and 1; a confidence level of α produces 100(1 – α)% confidence limits. By default, ALPHA=0.05, which produces 95% confidence limits.
This option applies to confidence limits that you request in the TABLES statement. The ALPHA= option in the EXACT statement applies to confidence limits for Monte Carlo estimates of exact p-values, which you request by specifying the MC option in the EXACT statement.
BINOMIAL
requests the binomial proportion for one-way tables. When you specify this option, by default PROC FREQ provides the asymptotic standard error, asymptotic Wald and exact (Clopper-Pearson) confidence limits, and the asymptotic equality test for the binomial proportion.
You can specify binomial-options in parentheses after the BINOMIAL option. The LEVEL= binomial-option identifies the variable level for which to compute the proportion. If you do not specify this option, PROC FREQ computes the proportion for the first level that appears in the one-way frequency table. The P= binomial-option specifies the null proportion for the binomial tests. If you do not specify this option, PROC FREQ uses 0.5 as the null proportion for the binomial tests.
You can also specify binomial-options to request additional tests and confidence limits for the binomial proportion. The EQUIV, NONINF, and SUP binomial-options request tests of equivalence, noninferiority, and superiority, respectively. The CL= binomial-option requests confidence limits for the binomial proportion.
You can specify the level for the binomial confidence limits in the ALPHA= option. By default, ALPHA=0.05, which produces 95% confidence limits. As part of the noninferiority, superiority, and equivalence analyses, PROC FREQ provides null-based equivalence limits that have a confidence coefficient of 100(1 – 2α)% (Schuirmann, 1999). In these analyses, the default of ALPHA=0.05 produces 90% equivalence limits. For more information, see the sections Noninferiority Test and Equivalence Test.
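The two defaults quoted above (95% confidence limits and 90% equivalence limits at ALPHA=0.05) follow from two different coefficients: 100(1 – α)% for ordinary confidence limits and 100(1 – 2α)% for the limits reported with the noninferiority, superiority, and equivalence analyses. A minimal Python check of that arithmetic, with hypothetical function names:

```python
def confidence_percent(alpha):
    """Confidence coefficient (percent) of ordinary confidence limits."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    return 100 * (1 - alpha)


def equivalence_percent(alpha):
    """Confidence coefficient (percent) of the equivalence-type limits."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    return 100 * (1 - 2 * alpha)


print(round(confidence_percent(0.05), 6))   # 95.0
print(round(equivalence_percent(0.05), 6))  # 90.0
```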
To request exact tests for the binomial proportion, you can specify the BINOMIAL option in the EXACT statement. PROC FREQ computes exact p-values for all binomial tests that you request, which can include noninferiority, superiority, and equivalence tests, in addition to the equality test that the BINOMIAL option produces by default.
For more information, see the section Binomial Proportion.
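As a worked illustration of the default asymptotic quantities, the Python sketch below computes a sample proportion, its asymptotic standard error, Wald confidence limits, and a z statistic against the null proportion. It uses textbook formulas and assumes that the equality test standardizes by the null-proportion variance; that choice is an assumption of this sketch, not something this section states. Exact (Clopper-Pearson) limits are not computed. All names and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def binomial_summary(count, n, p0=0.5, alpha=0.05):
    """Textbook Wald limits and z test for a binomial proportion."""
    p_hat = count / n
    ase = sqrt(p_hat * (1 - p_hat) / n)             # asymptotic standard error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided critical value
    wald = (p_hat - z_crit * ase, p_hat + z_crit * ase)
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)      # null-variance z (assumed)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"proportion": p_hat, "ase": ase, "wald": wald,
            "z": z, "p_two_sided": p_two_sided}


summary = binomial_summary(count=60, n=100)
print(summary)
```

With 60 events in 100 trials, the Wald limits come out near (0.504, 0.696) and the z statistic near 2.0 under these assumptions.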
Table 40.10 summarizes the binomial-options.
Table 40.10: BINOMIAL Options
Option | Description |
---|---|
CORRECT | Requests continuity correction |
LEVEL= | Specifies the variable level |
OUTLEVEL | Includes the level in the output data sets |
P= | Specifies the null proportion |
Request Confidence Limits | |
CL=AGRESTICOULL | Requests Agresti-Coull confidence limits |
CL=BLAKER | Requests Blaker confidence limits |
CL=EXACT | Requests exact (Clopper-Pearson) confidence limits |
CL=JEFFREYS | Requests Jeffreys confidence limits |
CL=LIKELIHOODRATIO | Requests likelihood ratio confidence limits |
CL=LOGIT | Requests logit confidence limits |
CL=MIDP | Requests exact mid-p confidence limits |
CL=WALD | Requests Wald confidence limits |
CL=WILSON | Requests Wilson (score) confidence limits |
Request Tests | |
EQUIV | Requests an equivalence test |
MARGIN= | Specifies the test margin |
NONINF | Requests a noninferiority test |
SUP | Requests a superiority test |
VAR= | Specifies the test variance |
You can specify the following binomial-options:
CL=type | (types)
requests confidence limits for the binomial proportion. You can specify one or more types of confidence limits. When you specify only one type, you can omit the parentheses around the request. PROC FREQ displays the confidence limits in the "Binomial Confidence Limits" table.
The ALPHA= option determines the level of the confidence limits that the CL= binomial-option provides. By default, ALPHA=0.05, which produces 95% confidence limits for the binomial proportion.
You can specify the CL= binomial-option with or without requests for binomial tests. The confidence limits that CL= produces do not depend on the tests that you request and do not use the value of the test margin (which you can specify in the MARGIN= binomial-option).
If you do not specify the CL= binomial-option, the BINOMIAL option displays Wald and exact (Clopper-Pearson) confidence limits in the "Binomial Proportion" table.
You can specify the following types:
AGRESTICOULL
requests Agresti-Coull confidence limits for the binomial proportion. For more information, see the section Agresti-Coull Confidence Limits.
BLAKER
requests Blaker confidence limits for the binomial proportion. For more information, see the section Blaker Confidence Limits.
EXACT
requests exact (Clopper-Pearson) confidence limits for the binomial proportion. For more information, see the section Exact (Clopper-Pearson) Confidence Limits.
If you do not specify the CL= binomial-option, PROC FREQ displays Wald and exact (Clopper-Pearson) confidence limits in the "Binomial Proportion" table. To request exact tests for the binomial proportion, you can specify the BINOMIAL option in the EXACT statement.
JEFFREYS
requests Jeffreys confidence limits for the binomial proportion. For more information, see the section Jeffreys Confidence Limits.
LIKELIHOODRATIO
requests likelihood ratio confidence limits for the binomial proportion. For more information, see the section Likelihood Ratio Confidence Limits.
LOGIT
requests logit confidence limits for the binomial proportion. For more information, see the section Logit Confidence Limits.
MIDP
requests exact mid-p confidence limits for the binomial proportion. For more information, see the section Mid-p Confidence Limits.
WALD
requests Wald confidence limits for the binomial proportion. For more information, see the section Wald Confidence Limits.
If you specify CL=WALD(CORRECT), the Wald confidence limits include a continuity correction. If you specify the CORRECT binomial-option, both the Wald confidence limits and the Wald tests include continuity corrections.
If you do not specify the CL= binomial-option, PROC FREQ displays Wald and exact (Clopper-Pearson) confidence limits in the "Binomial Proportion" table.
WILSON
requests Wilson confidence limits for the binomial proportion. These are also known as score confidence limits. For more information, see the section Wilson (Score) Confidence Limits.
If you specify CL=WILSON(CORRECT) or the CORRECT binomial-option, the Wilson confidence limits include a continuity correction.
CORRECT
includes a continuity correction in the Wald confidence limits, Wald tests, and Wilson confidence limits.
You can request continuity corrections individually for Wald or Wilson confidence limits by specifying the CL=WALD(CORRECT) or CL=WILSON(CORRECT) binomial-option, respectively.
EQUIV
requests a test of equivalence for the binomial proportion. For more information, see the section Equivalence Test. You can specify the equivalence test margins, the null proportion, and the variance type in the MARGIN=, P=, and VAR= binomial-options, respectively. To request an exact equivalence test, you can specify the BINOMIAL option in the EXACT statement.
LEVEL=level-number | 'level-value'
specifies the variable level for the binomial proportion. You can specify the level-number, which is the order in which the level appears in the one-way frequency table. Or you can specify the level-value, which is the formatted value of the variable level. The level-number must be a positive integer. You must enclose the level-value in single quotes.
By default, PROC FREQ computes the binomial proportion for the first variable level that appears in the one-way frequency table.
MARGIN=value | (lower, upper)
specifies the margin for the noninferiority, superiority, and equivalence tests, which you can request by specifying the NONINF, SUP, and EQUIV binomial-options, respectively. By default, MARGIN=0.2.
For noninferiority and superiority tests, specify a single value in the MARGIN= option. The MARGIN= value must be a positive number. You can specify value as a number between 0 and 1. Or you can specify value in percentage form as a number between 1 and 100, and PROC FREQ converts that number to a proportion. PROC FREQ treats the value 1 as 1%.
For noninferiority and superiority tests, the test limits must be between 0 and 1. The limits are determined by the null proportion value (which you can specify in the P= binomial-option) and by the margin value. The noninferiority limit is the null proportion minus the margin. By default, the null proportion is 0.5 and the margin is 0.2, which produces a noninferiority limit of 0.3. The superiority limit is the null proportion plus the margin, which is 0.7 by default.
For an equivalence test, you can specify a single MARGIN= value, or you can specify both lower and upper values. If you specify a single MARGIN= value, it must be a positive number, as described previously. If you specify a single MARGIN= value for an equivalence test, PROC FREQ uses –value as the lower margin and value as the upper margin for the test. If you specify both lower and upper values for an equivalence test, you can specify them in proportion form as numbers between –1 and 1. Or you can specify them in percentage form as numbers between –100 and 100, and PROC FREQ converts the numbers to proportions. The value of lower must be less than the value of upper.
The equivalence limits must be between 0 and 1. The equivalence limits are determined by the null proportion value (which you can specify in the P= binomial-option) and by the margin values. The lower equivalence limit is the null proportion plus the lower margin. By default, the null proportion is 0.5 and the lower margin is –0.2, which produces a lower equivalence limit of 0.3. The upper equivalence limit is the null proportion plus the upper margin, which is 0.7 by default.
For more information, see the sections Noninferiority Test and Equivalence Test.
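The limit arithmetic described for the MARGIN= binomial-option can be sketched in a few lines of Python. The sketch reads a magnitude of 1 or more as a percentage (so that 1 means 1%), derives the noninferiority and superiority limits from the null proportion and a single margin, and derives equivalence limits from either a single margin or a (lower, upper) pair. The function names are hypothetical, and the strict bounds check is one reading of "between 0 and 1".

```python
def as_proportion(value):
    """Read magnitudes of 1 or more as percentages (1 is treated as 1%)."""
    return value / 100 if abs(value) >= 1 else value


def noninferiority_limit(p0=0.5, margin=0.2):
    """Null proportion minus the margin."""
    return as_proportion(p0) - as_proportion(margin)


def superiority_limit(p0=0.5, margin=0.2):
    """Null proportion plus the margin."""
    return as_proportion(p0) + as_proportion(margin)


def equivalence_limits(p0=0.5, margin=0.2):
    """Null proportion plus the lower and upper margins."""
    if isinstance(margin, (int, float)):
        lower, upper = -margin, margin   # single value: symmetric margins
    else:
        lower, upper = margin
    if not lower < upper:
        raise ValueError("lower margin must be less than upper margin")
    base = as_proportion(p0)
    limits = (base + as_proportion(lower), base + as_proportion(upper))
    if not all(0 < x < 1 for x in limits):  # strict bounds are an assumption
        raise ValueError("equivalence limits must be between 0 and 1")
    return limits


print(noninferiority_limit())            # about 0.3 with the defaults
print(superiority_limit())               # about 0.7 with the defaults
print(equivalence_limits(50, (-10, 20)))  # percentage form, about (0.4, 0.7)
```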
NONINF
requests a test of noninferiority for the binomial proportion. For more information, see the section Noninferiority Test. You can specify the noninferiority test margin, the null proportion, and the variance type in the MARGIN=, P=, and VAR= binomial-options, respectively. To request an exact noninferiority test, you can specify the BINOMIAL option in the EXACT statement.
OUTLEVEL
includes the variables LevelNumber and LevelValue in all ODS output data sets that PROC FREQ produces when you specify the BINOMIAL option in the TABLES statement. The OUTLEVEL option also includes the variables LevelNumber and LevelValue in the statistics output data set that PROC FREQ produces when you specify the BINOMIAL option in the OUTPUT statement.
The LevelNumber and LevelValue variables identify the analysis variable level for which PROC FREQ computes the binomial proportion. The value of LevelNumber is the order of the level in the one-way frequency table. The value of LevelValue is the formatted value of the level. You can specify the OUTLEVEL binomial-option with or without the LEVEL= binomial-option.
P=value
specifies the null hypothesis proportion for the binomial tests. The null proportion value must be a positive number. You can specify value as a number between 0 and 1. Or you can specify value in percentage form (as a number between 1 and 100), and PROC FREQ converts that number to a proportion. PROC FREQ treats the value 1 as 1%. By default, P=0.5.
SUP
requests a test of superiority for the binomial proportion. For more information, see the section Superiority Test. You can specify the superiority test margin, the null proportion, and the variance type in the MARGIN=, P=, and VAR= binomial-options, respectively. To request an exact superiority test, you can specify the BINOMIAL option in the EXACT statement.
VAR=SAMPLE | NULL
specifies the type of variance to use in the Wald tests of noninferiority, superiority, and equivalence. If you specify VAR=SAMPLE, PROC FREQ computes the variance estimate by using the sample proportion. If you specify VAR=NULL, PROC FREQ computes a test-based variance by using the null hypothesis proportion (which you can specify in the P= binomial-option). For more information, see the sections Noninferiority Test and Equivalence Test. The default is VAR=SAMPLE.
CELLCHI2
displays each table cell’s contribution to the Pearson chi-square statistic in the crosstabulation table. The cell chi-square is computed as $(\text{frequency} - \text{expected})^2 / \text{expected}$, where frequency is the table cell frequency (count) and expected is the expected cell frequency, which is computed under the null hypothesis that the row and column variables are independent. For more information, see the section Pearson Chi-Square Test for Two-Way Tables. This option has no effect for one-way tables or for tables that are displayed in list format (which you can request by specifying the LIST option).
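A cell's contribution to the Pearson chi-square is the squared difference between its frequency and its expected frequency, divided by the expected frequency, where the expected frequency under independence is the row total times the column total divided by the overall total. The Python sketch below works this out for a small two-way table. The function name and counts are hypothetical.

```python
def cell_chi_squares(table):
    """Return each cell's contribution to the Pearson chi-square statistic."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(table):
        out = []
        for j, freq in enumerate(row):
            # Expected frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            out.append((freq - expected) ** 2 / expected)
        contributions.append(out)
    return contributions


cells = cell_chi_squares([[10, 20], [30, 40]])
pearson = sum(sum(row) for row in cells)
print(cells)
print(pearson)  # the cell contributions sum to the Pearson chi-square
```

For this table the expected frequencies are 12, 18, 28, and 42, so each cell contributes 4 divided by its expected frequency.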
CHISQ
requests chi-square tests of homogeneity or independence and measures of association that are based on the chi-square statistic. For two-way tables, the chi-square tests include the Pearson chi-square, likelihood ratio chi-square, and Mantel-Haenszel chi-square tests. The chi-square measures include the phi coefficient, contingency coefficient, and Cramér’s V. For 2×2 tables, the CHISQ option also provides Fisher’s exact test and the continuity-adjusted chi-square test. See the section Chi-Square Tests and Statistics for details.
For one-way tables, the CHISQ option provides the Pearson chi-square goodness-of-fit test. You can also request the likelihood ratio goodness-of-fit test for one-way tables by specifying the LRCHI chisq-option in parentheses after the CHISQ option. By default, the one-way chi-square tests are based on the null hypothesis of equal proportions. Alternatively, you can provide null hypothesis proportions or frequencies by specifying the TESTP= or TESTF= chisq-option, respectively. See the section Chi-Square Test for One-Way Tables for more information.
To request Fisher’s exact test for tables larger than 2×2, specify the FISHER option in the EXACT statement. Exact p-values are also available for the Pearson, likelihood ratio, and Mantel-Haenszel chi-square tests. See the description of the EXACT statement for more information.
You can specify the following chisq-options:
DF=df
specifies the degrees of freedom for the chi-square tests. The value of df must not be zero. If the value of df is positive, PROC FREQ uses df as the degrees of freedom for the chi-square tests. If the value of df is negative, PROC FREQ uses df to adjust the default degrees of freedom for the chi-square tests.
By default for one-way tables, the value of df is (n – 1), where n is the number of variable levels in the table. By default for two-way tables, the value of df is (r – 1) (c – 1), where r is the number of rows in the table and c is the number of columns. See the sections Chi-Square Test for One-Way Tables and Chi-Square Tests and Statistics for more information.
If you specify a negative value of df, PROC FREQ adjusts the default degrees of freedom by adding the (negative) value of df to the default value to produce the adjusted degrees of freedom. The adjusted degrees of freedom must be positive.
The DF= chisq-option specifies or adjusts the degrees of freedom for the following chi-square tests: the Pearson and likelihood ratio goodness-of-fit tests for one-way tables; and the Pearson, likelihood ratio, and Mantel-Haenszel chi-square tests for two-way tables.
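The DF= rules above amount to a small decision procedure: a positive value replaces the default degrees of freedom, a negative value is added to the default, and zero or a nonpositive result is rejected. A minimal Python sketch, with hypothetical function names:

```python
def one_way_default_df(n_levels):
    """Default degrees of freedom for a one-way table: n - 1."""
    return n_levels - 1


def two_way_default_df(n_rows, n_cols):
    """Default degrees of freedom for a two-way table: (r - 1)(c - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def resolve_df(default, df=None):
    """Apply a DF= style value to the default degrees of freedom."""
    if df is None:
        return default
    if df == 0:
        raise ValueError("df must not be zero")
    if df > 0:
        return df               # positive: use df directly
    adjusted = default + df     # negative: adjust the default downward
    if adjusted <= 0:
        raise ValueError("adjusted degrees of freedom must be positive")
    return adjusted


default = two_way_default_df(3, 4)
print(default)                  # 6
print(resolve_df(default, -2))  # 4
print(resolve_df(default, 3))   # 3
```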
LRCHI
requests the likelihood ratio goodness-of-fit test for one-way tables. See the section Likelihood Ratio Chi-Square Test for One-Way Tables for more information.
By default, this test is based on the null hypothesis of equal proportions. You can provide null hypothesis proportions or frequencies by specifying the TESTP= or TESTF= chisq-option, respectively. You can request an exact likelihood ratio goodness-of-fit test by specifying the LRCHI option in the EXACT statement.
TESTF=(values) | SAS-data-set
specifies null hypothesis frequencies for the one-way chi-square goodness-of-fit tests. See the section Chi-Square Test for One-Way Tables for details. You can list the null frequencies as values in parentheses after TESTF=. Or you can provide the null frequencies in a secondary input data set by specifying TESTF=SAS-data-set. The TESTF=SAS-data-set cannot be the same data set that you specify in the DATA= option. You can specify only one TESTF= or TESTP= data set in a single invocation of the procedure.
If you list the null frequencies as values, you can separate the values with blanks or commas. The values must be positive numbers. The number of values must equal the number of variable levels in the one-way table. The sum of the values must equal the total frequency for the one-way table. Order the values to match the order in which the corresponding variable levels appear in the one-way frequency table.
If you provide the null frequencies in a secondary input data set (TESTF=SAS-data-set), the variable that contains the null frequencies should be named _TESTF_, TestFrequency, or Frequency. The null frequencies must be positive numbers. The number of frequencies must equal the number of levels in the one-way frequency table, and the sum of the frequencies must equal the total frequency for the one-way table. Order the null frequencies in the data set to match the order in which the corresponding variable levels appear in the one-way frequency table.
TESTP=(values) | SAS-data-set
specifies null hypothesis proportions for the one-way chi-square goodness-of-fit tests. See the section Chi-Square Test for One-Way Tables for details. You can list the null proportions as values in parentheses after TESTP=. Or you can provide the null proportions in a secondary input data set by specifying TESTP=SAS-data-set. The TESTP=SAS-data-set cannot be the same data set that you specify in the DATA= option. You can specify only one TESTF= or TESTP= data set in a single invocation of the procedure.
If you list the null proportions as values, you can separate the values with blanks or commas. The values must be positive numbers. The number of values must equal the number of variable levels in the one-way table. Order the values to match the order in which the corresponding variable levels appear in the one-way frequency table. You can specify values in probability form as numbers between 0 and 1, where the proportions sum to 1. Or you can specify values in percentage form as numbers between 0 and 100, where the percentages sum to 100.
If you provide the null proportions in a secondary input data set (TESTP=SAS-data-set), the variable that contains the null proportions should be named _TESTP_, TestPercent, or Percent. The null proportions must be positive numbers. The number of proportions must equal the number of levels in the one-way frequency table. You can provide the proportions in probability form as numbers between 0 and 1, where the proportions sum to 1. Or you can provide the proportions in percentage form as numbers between 0 and 100, where the percentages sum to 100. Order the null proportions in the data set to match the order in which the corresponding variable levels appear in the one-way frequency table.
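The requirements on TESTF= and TESTP= values (positive, one per level, and summing to the total frequency, to 1, or to 100) can be expressed as a small validation step followed by the Pearson goodness-of-fit sum. The Python sketch below is an illustration under those stated rules, with hypothetical function names and counts and with tolerances chosen arbitrarily for the sum checks. It is not a description of how PROC FREQ validates its input.

```python
import math


def null_expected(observed, testp=None, testf=None):
    """Expected counts under equal proportions, TESTP-style, or TESTF-style values."""
    n, k = sum(observed), len(observed)
    values = testf if testf is not None else testp
    if values is None:
        return [n / k] * k          # default null: equal proportions
    values = list(values)
    if len(values) != k:
        raise ValueError("need one value per level")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    total = sum(values)
    if testf is not None:
        if not math.isclose(total, n, abs_tol=1e-8):
            raise ValueError("frequencies must sum to the total frequency")
        return values
    if math.isclose(total, 1.0, abs_tol=1e-8):
        scale = 1.0                 # probability form
    elif math.isclose(total, 100.0, abs_tol=1e-6):
        scale = 100.0               # percentage form
    else:
        raise ValueError("proportions must sum to 1 or to 100")
    return [n * v / scale for v in values]


def pearson_gof(observed, expected):
    """Pearson goodness-of-fit statistic for a one-way table."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [20, 30, 50]
chi_prob = pearson_gof(observed, null_expected(observed, testp=[0.25, 0.25, 0.5]))
chi_pct = pearson_gof(observed, null_expected(observed, testp=[25, 25, 50]))
chi_freq = pearson_gof(observed, null_expected(observed, testf=[25, 25, 50]))
print(chi_prob, chi_pct, chi_freq)  # the three forms give the same statistic
```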
WARN=type | (types)
controls the warning message for the validity of the asymptotic Pearson chi-square test. By default, PROC FREQ displays a warning message when more than 20% of the table cells have expected frequencies that are less than 5. If you specify the NOPRINT option in the PROC FREQ statement, the procedure displays the warning in the log; otherwise, the procedure displays the warning as a footnote in the chi-square table. You can use the WARN= option to suppress the warning and to include a warning indicator in the output data set.
You can specify one or more of the following types in the WARN= option. If you specify more than one type value, enclose the values in parentheses after WARN=. For example, warn = (output noprint).
Value of WARN= | Description |
---|---|
OUTPUT | Adds a warning indicator variable to the output data set |
NOLOG | Suppresses the chi-square warning message in the log |
NOPRINT | Suppresses the chi-square warning message in the display |
NONE | Suppresses the chi-square warning message entirely |
If you specify the WARN=OUTPUT option, the ODS output data set ChiSq contains a variable named Warning that equals 1 for the Pearson chi-square observation when more than 20% of the table cells have expected frequencies that are less than 5 and equals 0 otherwise. If you specify WARN=OUTPUT and also specify the CHISQ option in the OUTPUT statement, the statistics output data set contains a variable named WARN_PCHI that indicates the warning.
The WARN=NOLOG option has the same effect as the NOWARN option in the TABLES statement.
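The warning criterion described for WARN= (more than 20% of cells with expected frequencies below 5) is easy to check by hand. The Python sketch below applies it to a two-way table of counts, returning 1 or 0 in the spirit of the Warning indicator. The function name and the tables are hypothetical, and expected frequencies are computed under independence of rows and columns.

```python
def chi_square_warning(table, min_expected=5, max_fraction=0.20):
    """Return 1 if more than max_fraction of cells have expected < min_expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    small = sum(1 for e in expected if e < min_expected)
    return 1 if small / len(expected) > max_fraction else 0


print(chi_square_warning([[10, 20], [30, 40]]))  # 0: every expected count is at least 12
print(chi_square_warning([[1, 2], [3, 40]]))     # 1: three of four expected counts are below 5
```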
CL
requests confidence limits for the measures of association, which you can request by specifying the MEASURES option. For more information, see the sections Measures of Association and Confidence Limits. You can set the level of the confidence limits by using the ALPHA= option. The default of ALPHA=0.05 produces 95% confidence limits.
If you omit the MEASURES option, the CL option invokes MEASURES. The CL option is equivalent to the MEASURES(CL) option.
CMH
requests Cochran-Mantel-Haenszel statistics, which test for association between the row and column variables after adjusting for the remaining variables in a multiway table. The Cochran-Mantel-Haenszel statistics include the nonzero correlation statistic, the row mean scores (ANOVA) statistic, and the general association statistic. In addition, for 2×2 tables, the CMH option provides the adjusted Mantel-Haenszel and logit estimates of the odds ratio and relative risks, together with their confidence limits. For stratified 2×2 tables, the CMH option provides the Breslow-Day test for homogeneity of odds ratios. (To request Tarone’s adjustment for the Breslow-Day test, specify the BDT cmh-option.) See the section Cochran-Mantel-Haenszel Statistics for details.
You can use the CMH1 or CMH2 option to control the number of CMH statistics that PROC FREQ computes.
For stratified 2×2 tables, you can request Zelen’s exact test for equal odds ratios by specifying the EQOR option in the EXACT statement. See the section Zelen’s Exact Test for Equal Odds Ratios for details. You can request exact confidence limits for the common odds ratio by specifying the COMOR option in the EXACT statement. This option also provides a common odds ratio test. See the section Exact Confidence Limits for the Common Odds Ratio for details.
You can specify the following cmh-options in parentheses after the CMH option. These cmh-options, which apply to stratified 2×2 tables, are also available with the CMH1 or CMH2 option.
requests Tarone’s adjustment in the Breslow-Day test for homogeneity of odds ratios. See the section Breslow-Day Test for Homogeneity of the Odds Ratios for details.
requests the Gail-Simon test for qualitative interaction, which applies to stratified 2 × 2 tables. See the section Gail-Simon Test for Qualitative Interactions for details.
The COLUMN= option specifies the column of the risk differences to use in computing the Gail-Simon test. By default, PROC FREQ uses column 1 risk differences. If you specify COLUMN=2, PROC FREQ uses column 2 risk differences.
The GAILSIMON cmh-option has the same effect as the GAILSIMON option in the TABLES statement.
requests the Mantel-Fleiss criterion for the Mantel-Haenszel statistic for stratified 2 × 2 tables. See the section Mantel-Fleiss Criterion for details.
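The cmh-options described above might be combined as in the following sketch. The Sashelp.Heart data set and the variable roles (BP_Status as the stratification variable, Sex by Status as the 2 × 2 table within each stratum) are assumptions chosen for illustration.

```sas
/* Sketch only: data set and variable roles are illustrative assumptions */
proc freq data=sashelp.heart;
   /* BP_Status forms the strata; Sex*Status is the 2 x 2 table in each stratum. */
   /* BDT adds Tarone's adjustment; MANTELFLEISS adds the Mantel-Fleiss criterion. */
   tables bp_status*sex*status / cmh(bdt mantelfleiss);
run;
```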
requests the Cochran-Mantel-Haenszel correlation statistic. This option does not provide the CMH row mean scores (ANOVA) statistic or the general association statistic, which are provided by the CMH option. For tables larger than 2 × 2, the CMH1 option requires less memory than the CMH option, which can require an enormous amount of memory for large tables.
For 2 × 2 tables, the CMH1 option also provides the adjusted Mantel-Haenszel and logit estimates of the odds ratio and relative risks, together with their confidence limits. For stratified 2 × 2 tables, the CMH1 option provides the Breslow-Day test for homogeneity of odds ratios.
The cmh-options for CMH1 are the same as the cmh-options that are available with the CMH option. See the description of the CMH option for details.
requests the Cochran-Mantel-Haenszel correlation statistic and the row mean scores (ANOVA) statistic. This option does not provide the CMH general association statistic, which is provided by the CMH option. For tables larger than 2 × 2, the CMH2 option requires less memory than the CMH option, which can require an enormous amount of memory for large tables.
For 2 × 2 tables, the CMH2 option also provides the adjusted Mantel-Haenszel and logit estimates of the odds ratio and relative risks, together with their confidence limits. For stratified 2 × 2 tables, the CMH2 option provides the Breslow-Day test for homogeneity of odds ratios.
The cmh-options for CMH2 are the same as the cmh-options that are available with the CMH option. See the description of the CMH option for details.
specifies the label to use for crosstabulation tables in the contents file, the Results window, and the trace record. For information about output presentation, see the SAS Output Delivery System: User's Guide.
If you omit the CONTENTS= option, the contents label for crosstabulation tables is "Cross-Tabular Freq Table" by default.
Note that contents labels for all crosstabulation tables that are produced by a single TABLES statement use the same text. To specify different contents labels for different crosstabulation tables, request the tables in separate TABLES statements and use the CONTENTS= option in each TABLES statement.
To remove the crosstabulation table entry from the contents file, you can specify a null label with CONTENTS=''.
The CONTENTS= option affects only contents labels for crosstabulation tables. It does not affect contents labels for other PROC FREQ tables.
To specify the contents label for any PROC FREQ table, you can use PROC TEMPLATE to create a customized table template. The CONTENTS_LABEL attribute in the DEFINE TABLE statement of PROC TEMPLATE specifies the contents label for the table. See the chapter "The TEMPLATE Procedure" in the SAS Output Delivery System: User's Guide for more information.
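A minimal sketch of the CONTENTS= option follows. Because all crosstabulation tables from one TABLES statement share a label, the sketch uses two TABLES statements. The Sashelp.Heart data set and the label text are illustrative assumptions.

```sas
/* Sketch only: data set, variables, and label text are illustrative assumptions */
proc freq data=sashelp.heart;
   /* One TABLES statement per table, so that each table gets its own label */
   tables sex*status       / contents='Sex by Status';
   tables bp_status*status / contents='Blood Pressure by Status';
run;
```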
displays crosstabulation tables by using an ODS column format instead of the default crosstabulation cell format. In the CROSSLIST table display, the rows correspond to the crosstabulation table cells, and the columns correspond to descriptive statistics such as frequencies and percentages. The CROSSLIST table displays the same information as the default crosstabulation table (but it uses an ODS column format). For more information about the contents of the CROSSLIST table, see the section Two-Way and Multiway Tables.
You can control the contents of a CROSSLIST table by specifying the same options available for the default crosstabulation table. These include the NOFREQ, NOPERCENT, NOROW, and NOCOL options. You can request additional information in a CROSSLIST table by specifying the CELLCHI2, DEVIATION, EXPECTED, MISSPRINT, and TOTPCT options. You can also display standardized residuals or Pearson residuals in a CROSSLIST table by specifying the CROSSLIST(STDRES) or CROSSLIST(PEARSONRES) option, respectively; these options are not available for the default crosstabulation table. The FORMAT= and CUMCOL options have no effect on CROSSLIST tables. You cannot specify both the LIST option and the CROSSLIST option in the same TABLES statement.
For CROSSLIST tables, you can use the NOSPARSE option to suppress display of variable levels that have zero frequencies. By default, PROC FREQ displays all levels of the column variable within each level of the row variable, including any levels that have zero frequencies. By default for multiway tables that are displayed as CROSSLIST tables, the procedure displays all levels of the row variable for each stratum of the table, including any row levels with zero frequencies in the stratum.
You can specify the following options:
displays the standardized residuals of the table cells in the CROSSLIST table. The standardized residual is the ratio of (frequency – expected) to its standard error, where frequency is the table cell frequency (count) and expected is the expected table cell frequency, which is computed under the null hypothesis that the row and column variables are independent. For more information, see the section Standardized Residuals. You can display the expected values and deviations by specifying the EXPECTED and DEVIATION options, respectively.
displays the Pearson residuals of the table cells in the CROSSLIST table. The Pearson residual is the square root of the table cell’s contribution to the Pearson chi-square statistic. The Pearson residual is computed as $(\text{frequency} - \text{expected}) / \sqrt{\text{expected}}$, where frequency is the table cell frequency (count) and expected is the expected table cell frequency, which is computed under the null hypothesis that the row and column variables are independent. For more information, see the section Pearson Chi-Square Test for Two-Way Tables. You can display the expected values, deviations, and cell chi-squares by specifying the EXPECTED, DEVIATION, and CELLCHI2 options, respectively.
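The following sketch shows one way to request a CROSSLIST table that includes standardized residuals together with expected values and deviations. The Sashelp.Heart data set and the variables are assumptions for illustration.

```sas
/* Sketch only: data set and variables are illustrative assumptions */
proc freq data=sashelp.heart;
   /* One row per table cell, with expected counts, deviations, and */
   /* standardized residuals shown as columns                        */
   tables bp_status*weight_status / crosslist(stdres) expected deviation;
run;
```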
displays the cumulative column percentages in the cells of the crosstabulation table. The CUMCOL option does not apply to crosstabulation tables produced with the LIST or CROSSLIST option.
displays the deviations of the frequencies from the expected frequencies (frequency – expected) in the crosstabulation table. The expected frequencies are computed under the null hypothesis that the row and column variables are independent. For more information, see the section Pearson Chi-Square Test for Two-Way Tables. You can display the expected values by specifying the EXPECTED option. This option has no effect for one-way tables or for tables that are displayed in list format (which you can request by specifying the LIST option).
displays the expected cell frequencies in the crosstabulation table. The expected frequencies are computed under the null hypothesis that the row and column variables are independent. For more information, see the section Pearson Chi-Square Test for Two-Way Tables. This option has no effect for one-way tables or for tables that are displayed in list format (which you can request by specifying the LIST option).
requests Fisher’s exact test for tables that are larger than 2 × 2. (For 2 × 2 tables, the CHISQ option provides Fisher’s exact test.) This test is also known as the Freeman-Halton test. See the sections Fisher’s Exact Test and Exact Statistics for more information.
If you omit the CHISQ option in the TABLES statement, the FISHER option invokes CHISQ. You can also request Fisher’s exact test by specifying the FISHER option in the EXACT statement.
Note: PROC FREQ computes exact tests by using fast and efficient algorithms that are superior to direct enumeration. Exact tests are appropriate when a data set is small, sparse, skewed, or heavily tied. For some large problems, computation of exact tests might require a substantial amount of time and memory. Consider using asymptotic tests for such problems. Alternatively, when asymptotic methods might not be sufficient for such large problems, consider using Monte Carlo estimation of exact p-values. You can request Monte Carlo estimation by specifying the MC computation-option in the EXACT statement. See the section Computational Resources for more information.
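As a hedged illustration, the following sketch requests Fisher’s exact test for a 2 × 3 table and then repeats the request with Monte Carlo estimation of the exact p-value. The data set Work.Small, its variable names, and its cell counts are hypothetical.

```sas
/* Hypothetical cell counts for a 2 x 3 table (illustration only) */
data work.small;
   input treatment $ response $ count;
   datalines;
A low  3
A mid  5
A high 2
B low  6
B mid  1
B high 4
;

/* Fisher's exact test for a table larger than 2 x 2 */
proc freq data=work.small;
   weight count;
   tables treatment*response / fisher;
run;

/* Monte Carlo estimate of the exact p-value; SEED= makes the run repeatable */
proc freq data=work.small;
   weight count;
   tables treatment*response;
   exact fisher / mc seed=12345;
run;
```

For a table this small the direct computation is fast; the Monte Carlo form is shown only to illustrate the syntax.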
specifies a format for the following crosstabulation table cell values: frequency, expected frequency, and deviation. PROC FREQ also uses the specified format to display the row and column total frequencies and the overall total frequency in crosstabulation tables.
You can specify any standard SAS numeric format or a numeric format defined with the FORMAT procedure. The format length must not exceed 24. If you omit the FORMAT= option, by default PROC FREQ uses the BEST6. format to display frequencies less than 1E6, and the BEST7. format otherwise.
The FORMAT= option applies only to crosstabulation tables displayed in the default format. It does not apply to crosstabulation tables produced with the LIST or CROSSLIST option.
To change display formats in any FREQ table, you can use PROC TEMPLATE. See the chapter "The TEMPLATE Procedure" in the SAS Output Delivery System: User's Guide for more information.
requests the Gail-Simon test for qualitative interaction, which applies to stratified 2 × 2 tables. See the section Gail-Simon Test for Qualitative Interactions for details.
The COLUMN= option specifies the column of the risk differences to use in computing the Gail-Simon test. By default, PROC FREQ uses column 1 risk differences. If you specify COLUMN=2, PROC FREQ uses column 2 risk differences.
requests the Jonckheere-Terpstra test. See the section Jonckheere-Terpstra Test for details. To request exact p-values for the Jonckheere-Terpstra test, specify the JT option in the EXACT statement. See the section Exact Statistics for more information.
displays two-way and multiway tables by using a list format instead of the default crosstabulation cell format. This option displays an entire multiway table in one table, instead of displaying a separate two-way table for each stratum. For more information, see the section Two-Way and Multiway Tables.
The LIST option is not available when you request tests and statistics; you must use the standard crosstabulation table display or the CROSSLIST display when you request tests and statistics.
specifies the maximum number of variable levels to display in one-way frequency tables. The value of n must be a positive integer. PROC FREQ displays the first n variable levels, matching the order in which the levels appear in the one-way frequency table. (The ORDER= option controls the order of the variable levels. By default, ORDER=INTERNAL, which orders the variable levels by unformatted value.)
The MAXLEVELS= option also applies to one-way frequency plots, which you can request by specifying the PLOTS=FREQPLOT option when ODS Graphics is enabled.
If you specify the MISSPRINT option to display missing levels in the frequency table, the MAXLEVELS= option displays the first n nonmissing levels.
The MAXLEVELS= option does not apply to the OUT= output data set, which includes all variable levels. The MAXLEVELS= option does not affect the computation of percentages, statistics, or tests for the one-way table; these values are based on the complete table.
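A minimal sketch of the MAXLEVELS= option follows. It assumes the Sashelp.Heart data set and uses ORDER=FREQ in the PROC FREQ statement so that the displayed levels are the most frequent ones.

```sas
/* Sketch only: data set and variable are illustrative assumptions */
proc freq data=sashelp.heart order=freq;
   /* Display only the first three levels; percentages still use the full table */
   tables smoking_status / maxlevels=3;
run;
```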
requests measures of association and their asymptotic standard errors. This option provides the following measures: gamma, Kendall’s tau-b, Stuart’s tau-c, Somers’ D(C|R), Somers’ D(R|C), Pearson and Spearman correlation coefficients, lambda (symmetric and asymmetric), and uncertainty coefficients (symmetric and asymmetric). If you specify the CL option in parentheses after the MEASURES option, PROC FREQ provides confidence limits for the measures of association. For more information, see the section Measures of Association.
For 2 × 2 tables, the MEASURES option also provides the odds ratio, column 1 relative risk, column 2 relative risk, and their asymptotic Wald confidence limits. You can request the odds ratio and relative risks separately (without the other measures of association) by specifying the RELRISK option. You can request confidence limits for the odds ratio by specifying the OR(CL=) option.
You can use the TEST statement to request asymptotic tests for the following measures of association: gamma, Kendall’s tau-b, Stuart’s tau-c, Somers’ D(C|R), Somers’ D(R|C), and Pearson and Spearman correlation coefficients. You can use the EXACT statement to request exact confidence limits for the odds ratio, exact unconditional confidence limits for the relative risks, and exact tests for the following measures of association: Kendall’s tau-b, Stuart’s tau-c, Somers’ D(C|R) and D(R|C), and Pearson and Spearman correlation coefficients. For more information, see the descriptions of the TEST and EXACT statements and the section Exact Statistics.
treats missing values as a valid nonmissing level for all TABLES variables. The MISSING option displays the missing levels in frequency and crosstabulation tables and includes them in all calculations of percentages, tests, and measures.
By default, if you do not specify the MISSING or MISSPRINT option, an observation is excluded from a table if it has a missing value for any of the variables in the TABLES request. When PROC FREQ excludes observations with missing values, it displays the total frequency of missing observations below the table. See the section Missing Values for more information.
displays missing value frequencies in frequency and crosstabulation tables but does not include the missing value frequencies in any computations of percentages, tests, or measures.
By default, if you do not specify the MISSING or MISSPRINT option, an observation is excluded from a table if it has a missing value for any of the variables in the TABLES request. When PROC FREQ excludes observations with missing values, it displays the total frequency of missing observations below the table. See the section Missing Values for more information.
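The difference between the MISSPRINT and MISSING options can be seen by running both on the same variable, as in this sketch. It assumes the Sashelp.Heart data set and assumes that Smoking_Status contains some missing values.

```sas
/* Sketch only: assumes Smoking_Status in Sashelp.Heart has missing values */
proc freq data=sashelp.heart;
   /* Missing level is displayed but excluded from the percentages */
   tables smoking_status / missprint;
   /* Missing level is treated as a valid level and included in the percentages */
   tables smoking_status / missing;
run;
```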
suppresses the display of column percentages in crosstabulation table cells.
suppresses the display of cumulative frequencies and percentages in one-way frequency tables. The NOCUM option also suppresses the display of cumulative frequencies and percentages in crosstabulation tables in list format, which you request with the LIST option.
suppresses the display of cell frequencies in crosstabulation tables. The NOFREQ option also suppresses row total frequencies. This option has no effect for one-way tables or for crosstabulation tables in list format, which you request with the LIST option.
suppresses the display of overall percentages in crosstabulation tables. These percentages include the cell percentages of the total (two-way) table frequency, as well as the row and column percentages of the total table frequency. To suppress the display of cell percentages of row or column totals, use the NOROW or NOCOL option, respectively.
For one-way frequency tables and crosstabulation tables in list format, the NOPERCENT option suppresses the display of percentages and cumulative percentages.
suppresses the display of frequency and crosstabulation tables but displays all requested tests and statistics. To suppress the display of all output, including tests and statistics, use the NOPRINT option in the PROC FREQ statement.
suppresses the display of row percentages in crosstabulation table cells.
suppresses the display of cells with a zero frequency count in LIST output and omits them from the OUT= data set. The NOSPARSE option applies when you specify the ZEROS option in the WEIGHT statement to include observations with zero weights. By default, the ZEROS option invokes the SPARSE option, which displays table cells with a zero frequency count in the LIST output and includes them in the OUT= data set. See the description of the ZEROS option for more information.
The NOSPARSE option also suppresses the display of variable levels with zero frequency in CROSSLIST tables. By default for CROSSLIST tables, PROC FREQ displays all levels of the column variable within each level of the row variable, including any column variable levels with zero frequency for that row. For multiway tables displayed with the CROSSLIST option, the procedure displays all levels of the row variable for each stratum of the table by default, including any row variable levels with zero frequency for the stratum.
suppresses the log warning message for the validity of the asymptotic Pearson chi-square test. By default, PROC FREQ provides a validity warning for the asymptotic Pearson chi-square test when more than 20% of the table cells have expected frequencies that are less than 5. This warning message appears in the log if you specify the NOPRINT option in the PROC FREQ statement.
The NOWARN option is equivalent to the CHISQ(WARN=NOLOG) option. You can also use the CHISQ(WARN=) option to suppress the warning message in the display and to request a warning variable in the chi-square ODS output data set or in the OUTPUT data set.
requests the odds ratio and confidence limits for 2 × 2 tables. You can specify one or more types of confidence limits, which include exact, score, and Wald confidence limits. When you specify only one confidence limit type, you can omit the parentheses around the request.
PROC FREQ displays the confidence limits in the "Odds Ratio Confidence Limits" table. Specifying the OR option without the CL= option is equivalent to specifying the RELRISK option, which produces the "Odds Ratio and Relative Risks" table. For more information, see the description of the RELRISK option. When you specify the OR(CL=) option, PROC FREQ does not produce the "Odds Ratio and Relative Risks" table unless you also specify the RELRISK or MEASURES option.
The ALPHA= option determines the confidence level; by default, ALPHA=0.05, which produces 95% confidence limits for the odds ratio.
You can specify the following types:
displays exact confidence limits for the odds ratio in the "Confidence Limits for the Odds Ratio" table. You must also request computation of the exact confidence limits by specifying the OR option in the EXACT statement. For more information, see the section Exact Confidence Limits for the Odds Ratio.
requests score confidence limits for the odds ratio. For more information, see the section Score Confidence Limits for the Odds Ratio. If you specify CORRECT=NO, PROC FREQ provides the uncorrected form of the confidence limits.
requests asymptotic Wald confidence limits for the odds ratio. For more information, see the section Odds Ratio and Relative Risks for 2 x 2 Tables.
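The following sketch requests all three types of confidence limits for the odds ratio of a 2 × 2 table. The Sashelp.Heart data set and the variables are assumptions for illustration. The EXACT statement is included because exact limits are displayed only when their computation is also requested.

```sas
/* Sketch only: data set and variables are illustrative assumptions */
proc freq data=sashelp.heart;
   tables sex*status / or(cl=(exact score wald));
   exact or;   /* requests computation of the exact confidence limits */
run;
```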
names an output data set that contains frequency or crosstabulation table counts and percentages. If more than one table request appears in the TABLES statement, the contents of the OUT= data set correspond to the last table request in the TABLES statement.
The OUT= data set variable COUNT contains the frequencies and the variable PERCENT contains the percentages. See the section Output Data Sets for details. You can specify the following options to include additional information in the OUT= data set: OUTCUM, OUTEXPECT, and OUTPCT.
includes cumulative frequencies and cumulative percentages in the OUT= data set for one-way tables. The variable CUM_FREQ contains the cumulative frequencies, and the variable CUM_PCT contains the cumulative percentages. See the section Output Data Sets for details. The OUTCUM option has no effect for two-way or multiway tables.
includes expected cell frequencies in the OUT= data set for crosstabulation tables. The variable EXPECTED contains the expected cell frequencies. See the section Output Data Sets for details. The OUTEXPECT option has no effect for one-way tables.
includes the following additional variables in the OUT= data set for crosstabulation tables:
PCT_COL: percentage of column frequency
PCT_ROW: percentage of row frequency
PCT_TABL: percentage of stratum (two-way table) frequency, for n-way tables where n > 2
See the section Output Data Sets for details. The OUTPCT option has no effect for one-way tables.
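As a sketch of how the OUT= options fit together, the following step writes cell counts, expected frequencies, and row and column percentages to a data set and then prints it. The Sashelp.Heart data set and the output data set name Work.CellStats are assumptions for illustration.

```sas
/* Sketch only: input data set and output name are illustrative assumptions */
proc freq data=sashelp.heart;
   /* NOPRINT here suppresses the displayed table; the OUT= data set is still created */
   tables bp_status*status / out=work.cellstats outexpect outpct noprint;
run;

/* Contains COUNT, PERCENT, EXPECTED, PCT_ROW, and PCT_COL for each cell */
proc print data=work.cellstats;
run;
```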
requests the polychoric correlation coefficient and its asymptotic standard error. For 2 × 2 tables, this statistic is more commonly known as the tetrachoric correlation coefficient, and it is labeled as such in the displayed output. For more information, see the section Polychoric Correlation.
If you also specify the CL or MEASURES(CL) option, PROC FREQ provides confidence limits for the polychoric correlation. If you specify the PLCORR option in the TEST statement, the procedure provides Wald and likelihood ratio tests for the polychoric correlation. The PLCORR option invokes the MEASURES option.
You can specify the following options:
specifies the convergence criterion for computing the polychoric correlation. The convergence criterion value must be a positive number. By default, CONVERGE=0.0001. Iterative computation of the polychoric correlation stops when the convergence measure falls below the CONVERGE= value or when the number of iterations exceeds the MAXITER= number, whichever happens first. For parameter values that are less than 0.01, PROC FREQ evaluates convergence by using the absolute difference instead of the relative difference. For more information, see the section Polychoric Correlation.
specifies the maximum number of iterations for computing the polychoric correlation. The value of number must be a positive integer. By default, MAXITER=20. Iterative computation of the polychoric correlation stops when the number of iterations exceeds the maximum number or when the convergence measure falls below the CONVERGE= value, whichever happens first. For more information, see the section Polychoric Correlation.
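A minimal sketch of the PLCORR option with both of its suboptions follows. The data set Work.Ratings, its variables, and its counts are hypothetical; the two variables are ordinal ratings coded 1 (low) to 3 (high).

```sas
/* Hypothetical ordinal ratings with cell counts (illustration only) */
data work.ratings;
   input rater1 rater2 count;
   datalines;
1 1 20
1 2  8
1 3  2
2 1  7
2 2 25
2 3  9
3 1  1
3 2 10
3 3 18
;

proc freq data=work.ratings;
   weight count;
   /* Tighter convergence criterion and a higher iteration limit than the defaults; */
   /* CL adds confidence limits for the polychoric correlation                       */
   tables rater1*rater2 / plcorr(converge=0.00001 maxiter=50) cl;
   /* Wald and likelihood ratio tests for the polychoric correlation */
   test plcorr;
run;
```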
controls the plots that are produced through ODS Graphics. Plot-requests identify the plots, and plot-options control the appearance and content of the plots. You can specify plot-options in parentheses after a plot-request. A global-plot-option applies to all plots for which it is available unless it is altered by a specific plot-option. You can specify global-plot-options in parentheses after the PLOTS option.
When you specify only one plot-request, you can omit the parentheses around the request. For example:
    plots=all
    plots=freqplot
    plots=(freqplot oddsratioplot)
    plots(only)=(cumfreqplot deviationplot)
ODS Graphics must be enabled before plots can be requested. For example:
    ods graphics on;
    proc freq;
       tables treatment*response / chisq plots=freqplot;
       weight wt;
    run;
    ods graphics off;
For more information about enabling and disabling ODS Graphics, see the section Enabling and Disabling ODS Graphics in Chapter 21: Statistical Graphics Using ODS.
If ODS Graphics is enabled but you do not specify the PLOTS= option, PROC FREQ produces all plots that are associated with the analyses that you request, with the exception of the frequency, cumulative frequency, and mosaic plots. To produce a frequency plot or cumulative frequency plot when ODS Graphics is enabled, you must specify the FREQPLOT or CUMFREQPLOT plot-request, respectively, in the PLOTS= option, or you must specify the PLOTS=ALL option. To produce a mosaic plot when ODS Graphics is enabled, you must specify the MOSAICPLOT plot-request in the PLOTS= option, or you must specify the PLOTS=ALL option.
PROC FREQ produces the remaining plots (listed in Table 40.11) by default when you request the corresponding TABLES statement options. You can suppress default plots and request specific plots by using the PLOTS(ONLY)= option; PLOTS(ONLY)=(plot-requests) produces only the plots that are specified as plot-requests. You can suppress all plots by specifying the PLOTS=NONE option. The PLOTS option has no effect when you specify the NOPRINT option in the PROC FREQ statement.
Plot Requests
Table 40.11 lists the available plot-requests together with their required TABLES statement options. Descriptions of the plot-requests follow the table in alphabetical order.
Table 40.11: Plot Requests
Plot Request | Description | Required TABLES Statement Option
---|---|---
AGREEPLOT | Agreement plot | AGREE (r × r table)
ALL | All plots | None
CUMFREQPLOT | Cumulative frequency plot | One-way table request
DEVIATIONPLOT | Deviation plot | CHISQ (one-way table)
FREQPLOT | Frequency plot | Any table request
KAPPAPLOT | Kappa plot | AGREE (h × r × r table)
MOSAICPLOT | Mosaic plot | Two-way or multiway table request
NONE | No plots | None
ODDSRATIOPLOT | Odds ratio plot | MEASURES, OR(CL=), or RELRISK (h × 2 × 2 table)
RELRISKPLOT | Relative risk plot | MEASURES or RELRISK (h × 2 × 2 table)
RISKDIFFPLOT | Risk difference plot | RISKDIFF (h × 2 × 2 table)
WTKAPPAPLOT | Weighted kappa plot | AGREE (h × r × r table, r > 2)
You can specify the following plot-requests:
requests an agreement plot (Bangdiwala and Bryan, 1987). An agreement plot displays the strength of agreement in a two-way table, where the row and column variables represent two independent ratings of n subjects. For information about agreement plots, see Bangdiwala (1988), Bangdiwala et al. (2008), and Friendly (2000, Section 3.7.2).
To produce an agreement plot, you must also specify the AGREE option in the TABLES statement. Agreement statistics and plots are available for two-way square tables, where the number of rows equals the number of columns.
Table 40.12 lists the plot-options that are available for agreement plots. For descriptions of the plot-options, see the subsection "Plot Options".
Table 40.12: Plot Options for AGREEPLOT
Plot Option | Description | Values
---|---|---
LEGEND= | Legend | NO or YES*
PARTIAL= | Partial agreement | NO or YES*
SHOWSCALE= | Frequency scale | NO or YES*
STATS | Statistics | None
* Default
If you specify the STATS plot-option, the agreement plot displays the values of the kappa coefficient, the weighted kappa coefficient, the $B_n$ measure (Bangdiwala and Bryan, 1987), and the sample size. PROC FREQ stores these statistics in an ODS table named BnMeasure, which is not displayed. For more information, see the section ODS Table Names.
requests all plots that are associated with the specified analyses. Table 40.11 lists the available plot-requests and the corresponding analysis options. If you specify the PLOTS=ALL option, PROC FREQ produces the frequency, cumulative frequency, and mosaic plots that are associated with the tables that you request. (These plots are not produced by default when ODS Graphics is enabled.)
requests a plot of cumulative frequencies. Cumulative frequency plots are available for one-way frequency tables.
To produce a cumulative frequency plot, you must specify the CUMFREQPLOT plot-request in the PLOTS= option, or you must specify the PLOTS=ALL option. PROC FREQ does not produce cumulative frequency plots by default when ODS Graphics is enabled.
Table 40.13 lists the plot-options that are available for cumulative frequency plots. For descriptions of the plot-options, see the subsection "Plot Options".
requests a plot of relative deviations from expected frequencies. Deviation plots are available for chi-square analysis of one-way frequency tables. To produce a deviation plot, you must also specify the CHISQ option in the TABLES statement for a one-way frequency table.
Table 40.14 lists the plot-options that are available for deviation plots. For descriptions of the plot-options, see the subsection "Plot Options".
requests a frequency plot. Frequency plots are available for frequency and crosstabulation tables. For multiway crosstabulation tables, PROC FREQ provides a two-way frequency plot for each stratum (two-way table).
To produce a frequency plot, you must specify the FREQPLOT plot-request in the PLOTS= option, or you must specify the PLOTS=ALL option. PROC FREQ does not produce frequency plots by default when ODS Graphics is enabled.
Table 40.15 lists the plot-options that are available for frequency plots. For descriptions of the plot-options, see the subsection "Plot Options".
Table 40.15: Plot Options for FREQPLOT
Plot Option | Description | Values
---|---|---
GROUPBY=† | Primary group | COLUMN* or ROW
NPANELPOS=† | Sections per panel | Number (4*)
ORIENT= | Orientation | HORIZONTAL or VERTICAL*
SCALE= | Scale | FREQ*, GROUPPERCENT, LOG, PERCENT, or SQRT
TWOWAY=† | Two-way layout | CLUSTER, GROUPHORIZONTAL, GROUPVERTICAL*, or STACKED
TYPE= | Type | BARCHART* or DOTPLOT
* Default
† For two-way tables
You can specify the following plot-options for all frequency plots: ORIENT=, SCALE=, and TYPE=. You can specify the following plot-options for frequency plots of two-way (and multiway) tables: GROUPBY=, NPANELPOS=, and TWOWAY=. The NPANELPOS= plot-option is not available with the TWOWAY=CLUSTER or TWOWAY=STACKED layout, which is always displayed in a single panel.
By default, PROC FREQ displays frequency plots as bar charts. To display frequency plots as dot plots, specify TYPE=DOTPLOT. To plot percentages instead of frequencies, specify SCALE=PERCENT. For two-way tables, there are four frequency plot layouts available, which you can request by specifying the TWOWAY= plot-option. For more information, see the subsection "Plot Options".
By default, graph cells in a two-way layout are first grouped by column variable levels; row variable levels are then displayed within the column variable levels. To group first by row variable levels, specify GROUPBY=ROW.
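The frequency plot-options described above might be combined as in the following sketch, which draws a stacked bar chart of percentages for one table and a dot plot grouped by row levels for another. The Sashelp.Heart data set and the variables are assumptions for illustration.

```sas
/* Sketch only: data set and variables are illustrative assumptions */
ods graphics on;
proc freq data=sashelp.heart;
   /* Stacked bar chart (single panel) that plots percentages */
   tables bp_status*status / plots=freqplot(twoway=stacked scale=percent);
   /* Dot plot in which graph cells are grouped first by row variable levels */
   tables weight_status*status / plots=freqplot(type=dotplot groupby=row);
run;
ods graphics off;
```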
requests a plot of kappa statistics along with confidence limits. Kappa plots are available for multiway square tables and display the kappa statistic (with confidence limits) for each two-way table (stratum). Kappa plots also display the overall kappa statistic unless you specify the COMMON=NO plot-option. To produce a kappa plot, you must specify the AGREE option in the TABLES statement to compute kappa statistics.
Table 40.16 lists the plot-options that are available for kappa plots. For descriptions of the plot-options, see the subsection "Plot Options".
Table 40.16: Plot Options for KAPPAPLOT and WTKAPPAPLOT
Plot Option | Description | Values
---|---|---
CLDISPLAY= | Error bar type | BAR, LINE, LINEARROW, SERIF*, or SERIFARROW
COMMON= | Overall kappa | NO or YES*
NPANELPOS= | Statistics per graphic | Number (all*)
ORDER= | Order of two-way levels | ASCENDING or DESCENDING
RANGE= | Range to display | Values or CLIP
STATS | Statistic values | None
* Default
requests a mosaic plot. Mosaic plots are available for two-way and multiway crosstabulation tables; for multiway tables, PROC FREQ provides a mosaic plot for each two-way table (stratum).
To produce a mosaic plot, you must specify the MOSAICPLOT plot-request in the PLOTS= option, or you must specify the PLOTS=ALL option. PROC FREQ does not produce mosaic plots by default when ODS Graphics is enabled.
Mosaic plots display tiles that correspond to the crosstabulation table cells. The areas of the tiles are proportional to the frequencies of the table cells. The column variable is displayed on the X axis, and the tile widths are proportional to the relative frequencies of the column variable levels. The row variable is displayed on the Y axis, and the tile heights are proportional to the relative frequencies of the row levels within column levels. For more information, see Friendly (2000).
By default, the colors of the tiles correspond to the row variable levels. If you specify the COLORSTAT= plot-option, the tiles are colored according to the values of the Pearson or standardized residuals.
You can specify the following plot-options:
COLORSTAT= colors the mosaic plot tiles according to the values of residuals. If you specify COLORSTAT=PEARSONRES, the tiles are colored according to the Pearson residuals of the corresponding table cells. For more information, see the section Pearson Chi-Square Test for Two-Way Tables. If you specify COLORSTAT=STDRES, the tiles are colored according to the standardized residuals of the corresponding table cells. For more information, see the section Standardized Residuals. You can display the Pearson or standardized residuals in the CROSSLIST table by specifying the CROSSLIST(PEARSONRES) or CROSSLIST(STDRES) option, respectively.
SQUARE produces a square mosaic plot, where the height of the Y axis equals the width of the X axis. In a square mosaic plot, the scale of the relative frequencies is the same on both axes. By default, PROC FREQ produces a rectangular mosaic plot.
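The mosaic plot options described above can be combined in a single plot-request. The following is a minimal sketch, not a definitive recipe; it assumes the SASHELP.CARS sample data set (with its Origin and Type variables) is available in your installation and stands in for your own data.

```sas
/* Assumption: SASHELP.CARS is used only for illustration. */
ods graphics on;

proc freq data=sashelp.cars;
   /* Origin is the row variable (Y axis); Type is the column variable (X axis). */
   /* CROSSLIST(STDRES) prints the standardized residuals that color the tiles.  */
   tables Origin*Type / crosslist(stdres)
          plots=mosaicplot(colorstat=stdres square);
run;
```

Replacing COLORSTAT=STDRES with COLORSTAT=PEARSONRES (and CROSSLIST(STDRES) with CROSSLIST(PEARSONRES)) would color the tiles by Pearson residuals instead.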
ODDSRATIOPLOT requests a plot of odds ratios along with confidence limits. Odds ratio plots are available for multiway 2 x 2 tables and display the odds ratio (with confidence limits) for each 2 x 2 table (stratum). To produce an odds ratio plot, you must also specify the MEASURES, OR(CL=), or RELRISK option in the TABLES statement to compute the odds ratios.
Table 40.17 lists the plot-options that are available for odds ratio plots. For descriptions of the plot-options, see the subsection "Plot Options".
Table 40.17: Plot Options for ODDSRATIOPLOT, RELRISKPLOT, and RISKDIFFPLOT
Plot Option | Description | Values |
---|---|---|
CL= | Confidence limit type | Type |
CLDISPLAY= | Error bar type | BAR, LINE, LINEARROW, SERIF*, or SERIFARROW |
COMMON= | Common value | NO or YES* |
COLUMN= | Risk column | 1 or 2 |
LOGBASE= | Axis scale | 2, E, or 10 |
NPANELPOS= | Statistics per graphic | Number (all) |
ORDER= | Order of two-way levels | ASCENDING or DESCENDING |
RANGE= | Range to display | Values or CLIP |
STATS | Statistic values | None |
* Default
COLUMN= is available for RELRISKPLOT and RISKDIFFPLOT.
LOGBASE= is available for ODDSRATIOPLOT and RELRISKPLOT.
You can specify one of the following confidence limit types for the odds ratio plot: exact (CL=EXACT), score (CL=SCORE), or Wald (CL=WALD). By default, the odds ratio plot displays Wald confidence limits. For more information, see the descriptions of the CL= plot-option and the OR(CL=) option.
To display exact or score confidence limits in the odds ratio plot, you must also request their computation. You can request exact confidence limits for the odds ratio by specifying the OR option in the EXACT statement. You can request score confidence limits for the odds ratio by specifying the OR(CL=SCORE) option in the TABLES statement.
When CL=WALD or CL=EXACT, the odds ratio plot displays the common odds ratio by default when it is available. To compute the common odds ratio along with Wald confidence limits, specify the CMH option in the TABLES statement. To compute the common odds ratio along with exact confidence limits, specify the COMOR option in the EXACT statement. To suppress display of the common odds ratio, specify COMMON=NO. When CL=SCORE, the odds ratio plot does not display the common odds ratio.
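As an illustration of how the plot-request pairs with the option that computes the limits, the following sketch requests an odds ratio plot twice: once with score limits and once with Wald limits plus the common odds ratio. The data set TRIAL, its variables, and its counts are hypothetical and exist only to make the example self-contained.

```sas
/* Hypothetical two-center trial: one 2 x 2 table (Treatment by Response) per Center. */
data trial;
   input Center $ Treatment $ Response $ Count;
   datalines;
A Active  Yes 29
A Active  No  11
A Placebo Yes 18
A Placebo No  22
B Active  Yes 24
B Active  No  16
B Placebo Yes 15
B Placebo No  25
;

ods graphics on;

proc freq data=trial order=data;
   weight Count;
   /* OR(CL=SCORE) computes the score limits that CL=SCORE asks the plot to display. */
   tables Center*Treatment*Response / or(cl=score)
          plots(only)=oddsratioplot(cl=score stats);
   /* RELRISK computes Wald limits; CMH makes the common odds ratio available. */
   tables Center*Treatment*Response / relrisk cmh
          plots(only)=oddsratioplot(cl=wald stats);
run;
```

In the second request, adding COMMON=NO to the plot-options would suppress the common odds ratio.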
RELRISKPLOT requests a plot of relative risks along with confidence limits. Relative risk plots are available for multiway 2 x 2 tables and display the relative risk (with confidence limits) for each 2 x 2 table (stratum). To produce a relative risk plot, you must also specify the MEASURES or RELRISK option in the TABLES statement to compute relative risks.
Table 40.17 lists the plot-options that are available for relative risk plots. For descriptions of the plot-options, see the subsection "Plot Options".
You can specify one of the following confidence limit types for the relative risk plot: exact (CL=EXACT), score (CL=SCORE), or Wald (CL=WALD). By default, the relative risk plot displays Wald confidence limits. For more information, see the descriptions of the CL= plot-option and the RELRISK(CL=) option.
To display exact or score confidence limits in the relative risk plot, you must also request their computation. To request exact confidence limits for the relative risk, specify the RELRISK option in the EXACT statement. To request score confidence limits for the relative risk, specify the RELRISK(CL=SCORE) option in the TABLES statement. The risk column that you specify for the confidence limits must match the risk column that you specify for the plot.
The relative risk plot displays the common relative risk by default when you specify CL=WALD and the CMH option in the TABLES statement. To suppress display of the common relative risk, specify COMMON=NO. When you specify CL=EXACT or CL=SCORE, the relative risk plot does not display the common relative risk.
RISKDIFFPLOT requests a plot of risk (proportion) differences along with confidence limits. Risk difference plots are available for multiway 2 x 2 tables and display the risk difference (with confidence limits) for each 2 x 2 table (stratum). To produce a risk difference plot, you must also specify the RISKDIFF option in the TABLES statement to compute risk differences.
Table 40.17 lists the plot-options that are available for risk difference plots. For descriptions of the plot-options, see the subsection "Plot Options".
You can specify the CL= plot-option to display one of the following confidence limit types in the risk difference plot: Agresti-Caffo, exact, Hauck-Anderson, Miettinen-Nurminen (score), Newcombe, and Wald. By default, the plot displays Wald confidence limits for the risk difference. For more information, see the descriptions of the CL= plot-option and the RISKDIFF(CL=) option.
To display exact confidence limits in the risk difference plot, you must also request their computation by specifying the RISKDIFF option in the EXACT statement. The risk column that you specify for the confidence limits must match the risk column that you specify for the plot.
By default, the risk difference plot displays the common risk difference when you specify the RISKDIFF(COMMON) option and one of the following confidence limit types in the CL= plot-option: Miettinen-Nurminen (score) (CL=MN), Newcombe (CL=NEWCOMBE), or Wald (CL=WALD). To suppress display of the common risk difference, specify COMMON=NO.
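The following sketch shows one way to request a risk difference plot that also displays the common risk difference. It assumes Newcombe limits are wanted; the TRIAL data set and its counts are hypothetical.

```sas
/* Hypothetical two-center trial: one 2 x 2 table per Center. */
data trial;
   input Center $ Treatment $ Response $ Count;
   datalines;
A Active  Yes 29
A Active  No  11
A Placebo Yes 18
A Placebo No  22
B Active  Yes 24
B Active  No  16
B Placebo Yes 15
B Placebo No  25
;

ods graphics on;

proc freq data=trial order=data;
   weight Count;
   /* COMMON requests the common risk difference; CL=NEWCOMBE requests  */
   /* the limits that the plot's CL=NEWCOMBE plot-option then displays.  */
   tables Center*Treatment*Response / riskdiff(common cl=newcombe)
          plots(only)=riskdiffplot(cl=newcombe stats);
run;
```

Because no COLUMN= value is given in either place, both the confidence limits and the plot use column 1, so the two risk columns match.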
WTKAPPAPLOT requests a plot of weighted kappa coefficients along with confidence limits. Weighted kappa plots are available for multiway square tables and display the weighted kappa coefficient (with confidence limits) for each two-way table (stratum). Weighted kappa plots also display the overall weighted kappa coefficient unless you specify the COMMON=NO plot-option.
To produce a weighted kappa plot, you must specify the AGREE option in the TABLES statement to compute weighted kappa coefficients, and the table dimension must be greater than 2.
Table 40.16 lists the plot-options that are available for weighted kappa plots. For descriptions of the plot-options, see the subsection "Plot Options".
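A weighted kappa plot needs a square table with more than two levels in each stratum. The sketch below uses a hypothetical RATINGS data set (two clinics, two raters, a three-level rating scale); all names and counts are invented for illustration.

```sas
/* Hypothetical agreement data: a 3 x 3 table of ratings within each Clinic. */
data ratings;
   input Clinic $ Rater1 Rater2 Count;
   datalines;
North 1 1 20
North 1 2  5
North 1 3  1
North 2 1  4
North 2 2 15
North 2 3  6
North 3 1  2
North 3 2  3
North 3 3 18
South 1 1 14
South 1 2  7
South 1 3  2
South 2 1  6
South 2 2 12
South 2 3  5
South 3 1  1
South 3 2  4
South 3 3 16
;

ods graphics on;

proc freq data=ratings;
   weight Count;
   /* AGREE computes the weighted kappa coefficients that the plot displays. */
   tables Clinic*Rater1*Rater2 / agree
          plots(only)=wtkappaplot(stats cldisplay=line);
run;
```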
Global Plot Options
A global-plot-option applies to all plots for which the option is available unless it is altered by an individual plot-option. You can specify global-plot-options in parentheses after the PLOTS option. For example:
plots(order=ascending stats)=(riskdiffplot oddsratioplot)
plots(only)=freqplot
The following plot-options are available as global-plot-options: CLDISPLAY=, COLUMN=, COMMON=, EXACT, LOGBASE=, NPANELPOS=, ORDER=, ORIENT=, RANGE=, SCALE=, STATS, and TYPE=. For descriptions of these plot-options, see the subsection "Plot Options".
In addition to these plot-options, you can specify the following global-plot-option:
ONLY suppresses the default plots and requests only the plots that you specify as plot-requests.
Plot Options
You can specify the following plot-options in parentheses after a plot-request:
CL= specifies the type of confidence limits to display. You can specify the CL= plot-option when you specify the following plot-requests: ODDSRATIOPLOT, RELRISKPLOT, and RISKDIFFPLOT.
For odds ratio plots (ODDSRATIOPLOT), the available confidence limit types include exact, score, and Wald, which you can request by specifying CL=EXACT, CL=SCORE, and CL=WALD, respectively. For more information, see the description of the OR(CL=) option and the section Odds Ratio and Relative Risks for 2 x 2 Tables. The default is CL=WALD. When you specify CL=EXACT to display exact confidence limits, you must also request computation of exact confidence limits by specifying the OR option in the EXACT statement. When you specify CL=SCORE, you must also request computation of score confidence limits by specifying the OR(CL=SCORE) option in the TABLES statement.
For relative risk plots (RELRISKPLOT), the available confidence limit types include exact, score, and Wald, which you can request by specifying CL=EXACT, CL=SCORE, and CL=WALD, respectively. For more information, see the description of the RELRISK(CL=) option and the section Relative Risks. The default is CL=WALD. When you specify CL=EXACT to display exact confidence limits, you must also request computation of exact confidence limits by specifying the RELRISK option in the EXACT statement. When you specify CL=SCORE, you must also request computation of score confidence limits by specifying the RELRISK(CL=SCORE) option in the TABLES statement.
For risk difference plots (RISKDIFFPLOT), the available confidence limit types include the following: Agresti-Caffo (CL=AC), exact (CL=EXACT), Hauck-Anderson (CL=HA), Miettinen-Nurminen (score) (CL=MN), Newcombe (CL=NEWCOMBE), and Wald (CL=WALD). For more information, see the description of the RISKDIFF(CL=) option and the section Risk Difference Confidence Limits. If you specify CL=EXACT to display exact confidence limits in the plot, you must also request computation of exact confidence limits by specifying the RISKDIFF option in the EXACT statement.
CLDISPLAY= controls the appearance of the confidence limit error bars. You can specify the CLDISPLAY= plot-option when you specify the following plot-requests: KAPPAPLOT, ODDSRATIOPLOT, RELRISKPLOT, RISKDIFFPLOT, and WTKAPPAPLOT.
The default is CLDISPLAY=SERIF, which displays the confidence limits as lines with serifs. CLDISPLAY=LINE displays the confidence limits as plain lines without serifs. The CLDISPLAY=SERIFARROW and CLDISPLAY=LINEARROW plot-options display arrowheads on any error bars that are clipped by the RANGE= plot-option; if an entire error bar is cut from the plot, the plot displays an arrowhead that points toward the statistic.
CLDISPLAY=BAR displays the confidence limits as bars. By default, the width of the bars equals the size of the marker for the estimate. You can control the width of the bars and the size of the marker by specifying the value of width as a percentage of the distance between bars. The bar might disappear when the value of width is very small.
COLUMN= specifies the table column to use to compute the risks (proportions) for the relative risk plot (RELRISKPLOT) and the risk difference plot (RISKDIFFPLOT). If you specify COLUMN=1, the plot displays the column 1 relative risks or the column 1 risk differences. Similarly, if you specify COLUMN=2, the plot displays the column 2 relative risks or risk differences.
For relative risk plots, the default is COLUMN=1. For risk difference plots, the default is COLUMN=1 if you request computation of both column 1 and column 2 risk differences by specifying the RISKDIFF option. If you request computation of only the column 1 (or column 2) risk differences by specifying the RISKDIFF(COLUMN=1) (or RISKDIFF(COLUMN=2)) option, by default the risk difference plot displays the risk differences for the column that you specify.
COMMON= controls the display of the common (overall) statistic in plots that display stratum (two-way table) statistics for multiway tables. You can specify the COMMON= plot-option when you specify the following plot-requests: KAPPAPLOT, ODDSRATIOPLOT, RELRISKPLOT, RISKDIFFPLOT, and WTKAPPAPLOT.
COMMON=NO suppresses display of the common statistic and its confidence limits. By default, COMMON=YES, which displays the common statistic and its confidence limits when these values are available. For more information, see the descriptions of the plot-requests.
EXACT requests display of exact confidence limits instead of asymptotic confidence limits. You can specify the EXACT plot-option when you specify the following plot-requests: ODDSRATIOPLOT, RELRISKPLOT, and RISKDIFFPLOT. The EXACT plot-option is equivalent to the CL=EXACT plot-option.
When you specify the EXACT plot-option, you must also request computation of exact confidence limits by specifying the appropriate statistic-option in the EXACT statement.
GROUPBY= specifies the primary grouping for two-way frequency plots, which you can request by specifying the FREQPLOT plot-request. The default is GROUPBY=COLUMN, which groups graph cells first by column variable and displays row variable levels within column variable levels. You can specify GROUPBY=ROW to group first by row variable. In two-way and multiway table requests, the column variable is the last variable specified and forms the columns of the crosstabulation table. The row variable is the next-to-last variable specified and forms the rows of the table.
By default for a bar chart that is displayed in the TWOWAY=STACKED layout, bars correspond to the column variable levels, and row levels are displayed (stacked) within each column bar. By default for a bar chart that is displayed in the TWOWAY=CLUSTER layout, bars are first grouped by column variable levels, and row levels are displayed as adjacent bars within each column-level group. You can reverse the default row and column variable grouping by specifying GROUPBY=ROW.
LOGBASE= applies to the odds ratio plot (ODDSRATIOPLOT) and the relative risk plot (RELRISKPLOT). This plot-option displays the odds ratio or relative risk axis on the log scale that you specify (base 2, E, or 10).
LEGEND= applies to the agreement plot (AGREEPLOT). LEGEND=NO suppresses the legend that identifies the areas of exact and partial agreement. The default is LEGEND=YES.
NOSTAT applies to the deviation plot (DEVIATIONPLOT). NOSTAT suppresses the chi-square p-value that the deviation plot displays by default.
NPANELPOS=n divides the plot into multiple panels that display at most |n| statistics or sections.
If n is positive, the number of statistics or sections per panel is balanced; if n is negative, the number of statistics per panel is not balanced. For example, suppose you want to display 21 odds ratios. NPANELPOS=20 displays two panels, the first with 11 odds ratios and the second with 10 odds ratios; NPANELPOS=–20 displays 20 odds ratios in the first panel but only 1 odds ratio in the second panel. This plot-option is available for all plots except mosaic plots and one-way weighted frequency plots.
For two-way frequency plots (FREQPLOT), NPANELPOS=n requests that panels display at most |n| sections, where sections correspond to row or column variable levels, depending on the type of plot and the grouping. By default, n=4 and each panel includes at most four sections. This plot-option applies to two-way plots that are displayed in the TWOWAY=GROUPVERTICAL or TWOWAY=GROUPHORIZONTAL layout. The NPANELPOS= plot-option does not apply to the TWOWAY=CLUSTER and TWOWAY=STACKED layouts, which are always displayed in a single panel.
For plots that display statistics along with confidence limits, NPANELPOS=n requests that panels display at most |n| statistics. By default, n=0 and all statistics are displayed in a single panel. This plot-option applies to the following plots: KAPPAPLOT, ODDSRATIOPLOT, RELRISKPLOT, RISKDIFFPLOT, and WTKAPPAPLOT.
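The panel arithmetic in the 21-odds-ratio example can be reproduced with a short DATA step. This is only a sketch of the documented behavior: the balancing rule used here (spread the statistics as evenly as possible over the smallest number of panels) is an assumption that happens to match the documented 11/10 and 20/1 splits, not a statement of how PROC FREQ is implemented.

```sas
/* Sketch only: reproduces the documented splits for 21 statistics. */
data _null_;
   total = 21;                          /* statistics to display */
   do n = 20, -20;                      /* NPANELPOS=20 and NPANELPOS=-20 */
      panels = ceil(total / abs(n));    /* smallest number of panels needed */
      do panel = 1 to panels;
         if n > 0 then
            /* balanced (assumed rule): spread statistics evenly across panels */
            size = floor(total / panels) + (panel <= mod(total, panels));
         else
            /* unbalanced: fill each panel to |n| before starting the next */
            size = min(abs(n), total - (panel - 1) * abs(n));
         put n= panel= size=;
      end;
   end;
run;
```

The log lists panel sizes of 11 and 10 for n=20, and 20 and 1 for n=-20.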
ORDER= displays the two-way table (strata) statistics in order of the statistic value. You can specify the ORDER= plot-option when you specify the following plot-requests: KAPPAPLOT, ODDSRATIOPLOT, RELRISKPLOT, RISKDIFFPLOT, and WTKAPPAPLOT.
If you specify ORDER=ASCENDING or ORDER=DESCENDING, the plot displays the statistics in ascending or descending order, respectively. By default, the order of the statistics in the plot matches the order that the two-way table strata appear in the multiway table display.
ORIENT= controls the orientation of the plot. You can specify the ORIENT= plot-option when you specify the following plot-requests: CUMFREQPLOT, DEVIATIONPLOT, and FREQPLOT.
ORIENT=HORIZONTAL places the variable levels on the Y axis and the frequencies, percentages, or statistic values on the X axis. ORIENT=VERTICAL places the variable levels on the X axis. The default orientation is ORIENT=VERTICAL for bar charts (TYPE=BARCHART) and ORIENT=HORIZONTAL for dot plots (TYPE=DOTPLOT).
PARTIAL= controls the display of partial agreement in the agreement plot (AGREEPLOT). PARTIAL=NO suppresses the display of partial agreement. When you specify PARTIAL=NO, the agreement plot displays only exact agreement. Exact agreement includes the diagonal cells of the square table, where the row and column variable levels are the same. Partial agreement includes the adjacent off-diagonal table cells, where the row and column values are within one level of exact agreement. The default is PARTIAL=YES.
RANGE= specifies the range of values to display. You can specify the RANGE= plot-option when you specify the following plot-requests: KAPPAPLOT, ODDSRATIOPLOT, RELRISKPLOT, RISKDIFFPLOT, and WTKAPPAPLOT.
If you specify RANGE=CLIP, the confidence limits are clipped and the display range is determined by the minimum and maximum values of the statistics. By default, the display range includes all confidence limits.
SCALE= specifies the scale of the frequencies to display. This plot-option is available for frequency plots (FREQPLOT) and cumulative frequency plots (CUMFREQPLOT).
The default is SCALE=FREQ, which displays unscaled frequencies. SCALE=PERCENT displays percentages (relative frequencies) of the total frequency. SCALE=LOG displays log (base 10) frequencies. SCALE=SQRT displays square roots of the frequencies, producing a plot known as a rootogram.
SCALE=GROUPPERCENT is available for two-way frequency plots. This option displays the row or column percentages instead of the overall percentages (of the table frequency). By default (or when you specify the GROUPBY=COLUMN plot-option), SCALE=GROUPPERCENT displays the column percentages. If you specify the GROUPBY=ROW plot-option, the primary grouping of graph cells is by row variable level and the plot displays row percentages. For more information, see the description of the GROUPBY= plot-option.
SHOWSCALE= controls the display of the cumulative frequency scale on the right side of the agreement plot (AGREEPLOT). SHOWSCALE=NO suppresses the display of the scale. The default is SHOWSCALE=YES.
STATS displays statistic values in the plot. For the following plot-requests, the STATS plot-option displays the statistics and their confidence limits on the right side of the plot: KAPPAPLOT, ODDSRATIOPLOT, RELRISKPLOT, RISKDIFFPLOT, and WTKAPPAPLOT.
For the agreement plot (AGREEPLOT), the STATS plot-option displays the values of the kappa statistic, the weighted kappa statistic, the B_n measure (Bangdiwala and Bryan, 1987), and the sample size. PROC FREQ stores these statistics in an ODS table named BnMeasure, which is not displayed. For more information, see the section ODS Table Names.
If you do not request the STATS plot-option, these plots do not display the statistic values.
TWOWAY= specifies the layout for two-way frequency plots.
All TWOWAY= layouts are available for bar charts (TYPE=BARCHART). All TWOWAY= layouts except TWOWAY=CLUSTER are available for dot plots (TYPE=DOTPLOT). The ORIENT= and GROUPBY= plot-options are available for all TWOWAY= layouts.
The default two-way layout is TWOWAY=GROUPVERTICAL, which produces a grouped plot that has a vertical common baseline. By default for bar charts (TYPE=BARCHART, ORIENT=VERTICAL), the X axis displays column variable levels, and the Y axis displays frequencies. The plot includes a vertical (Y-axis) block for each row variable level. The relative positions of the graph cells in this plot layout are the same as the relative positions of the table cells in the crosstabulation table. You can reverse the default row and column grouping by specifying the GROUPBY=ROW plot-option.
The TWOWAY=GROUPHORIZONTAL layout produces a grouped plot that has a horizontal common baseline. By default (GROUPBY=COLUMN), the plot displays a block on the X axis for each column variable level. Within each column-level block, the plot displays row variable levels.
The TWOWAY=STACKED layout produces stacked displays of frequencies. By default (GROUPBY=COLUMN) in a stacked bar chart, the bars correspond to column variable levels, and row levels are stacked within each column level. By default in a stacked dot plot, the dotted lines correspond to column levels, and cell frequencies are plotted as data dots on the corresponding column line. The dot color identifies the row level.
The TWOWAY=CLUSTER layout, which is available only for bar charts, displays groups of adjacent bars. By default, the primary grouping is by column variable level, and row levels are displayed within each column level.
You can reverse the default row and column grouping in any layout by specifying the GROUPBY=ROW plot-option. The default is GROUPBY=COLUMN, which groups first by column variable.
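To make the interaction of TWOWAY=, GROUPBY=, SCALE=, and ORIENT= concrete, the following sketch requests two layouts of the same two-way table. It assumes the SASHELP.CARS sample data set (variables Origin and DriveTrain) is available; substitute your own data set and variables.

```sas
/* Assumption: SASHELP.CARS is used only for illustration. */
ods graphics on;

proc freq data=sashelp.cars;
   /* Stacked horizontal bars, grouped first by the row variable (Origin), */
   /* so SCALE=GROUPPERCENT displays row percentages.                      */
   tables Origin*DriveTrain /
          plots(only)=freqplot(twoway=stacked groupby=row
                               scale=grouppercent orient=horizontal);
   /* Clustered bars with the default grouping by column variable (DriveTrain). */
   tables Origin*DriveTrain /
          plots(only)=freqplot(twoway=cluster type=barchart);
run;
```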
TYPE= specifies the plot type (format) of the frequency (FREQPLOT), cumulative frequency (CUMFREQPLOT), and deviation plots (DEVIATIONPLOT). TYPE=BARCHART produces a bar chart and TYPE=DOTPLOT produces a dot plot. The default is TYPE=BARCHART.
PRINTKWTS displays the agreement weights that PROC FREQ uses to compute the weighted kappa coefficient. Agreement weights reflect the relative agreement between pairs of variable levels. By default, PROC FREQ uses the Cicchetti-Allison form of agreement weights. If you specify the AGREE(WT=FC) option, the procedure uses the Fleiss-Cohen form of agreement weights. For more information, see the section Weighted Kappa Coefficient.
This option has no effect unless you also specify the AGREE option to compute the weighted kappa coefficient. The PRINTKWTS option is equivalent to the AGREE(PRINTKWTS) option.
RELRISK requests relative risk measures for 2 x 2 tables. These measures include the odds ratio, the column 1 relative risk, and the column 2 relative risk. For more information, see the section Odds Ratio and Relative Risks for 2 x 2 Tables. By default, PROC FREQ displays the relative risk measures and their asymptotic Wald confidence limits in the "Odds Ratio and Relative Risks" table. You can also obtain this table by specifying the MEASURES option, which produces other measures of association in addition to the relative risks.
When you specify confidence limit types in the CL= relrisk-option, PROC FREQ displays the "Relative Risk Confidence Limits" table; PROC FREQ does not display the "Odds Ratio and Relative Risks" table unless you also specify the PRINTALL relrisk-option. You can request the "Odds Ratio Confidence Limits" table by specifying the OR(CL=) option.
You can specify the following relrisk-options:
CL= specifies confidence limit types for the relative risk. You can specify one or more types of confidence limits. When you specify only one type, you can omit the parentheses around the request. When you specify the CL= relrisk-option, PROC FREQ displays the confidence limits in the "Relative Risk Confidence Limits" table.
The ALPHA= option determines the level of the confidence limits that the CL= relrisk-option provides. By default, ALPHA=0.05, which produces 95% confidence limits for the relative risk.
You can specify the following types:
EXACT displays exact unconditional confidence limits for the relative risk in the "Relative Risk Confidence Limits" table. You must also request computation of the exact confidence limits by specifying the RELRISK option in the EXACT statement. For more information, see the section Exact Unconditional Confidence Limits for the Relative Risk.
SCORE requests score confidence limits for the relative risk. For more information, see the section Score Confidence Limits for the Relative Risk. If you specify CL=SCORE(CORRECT=NO), PROC FREQ provides the uncorrected form of the confidence limits.
WALD requests asymptotic Wald confidence limits for the relative risk. For more information, see the section Relative Risks.
If you do not specify the CL= relrisk-option, the RELRISK option displays Wald confidence limits for the odds ratio, column 1 relative risk, and column 2 relative risk in the "Odds Ratio and Relative Risks" table.
COLUMN= specifies the table column for which to compute the relative risk confidence limits (which you request by specifying the CL= relrisk-option). By default, COLUMN=1, which displays confidence limits for the column 1 relative risk in the "Relative Risk Confidence Limits" table. This option has no effect on the "Odds Ratio and Relative Risks" table, which displays both column 1 and column 2 relative risks.
PRINTALL displays the "Odds Ratio and Relative Risks" table when you specify the CL= relrisk-option. (By default, PROC FREQ does not display this table when you specify the CL= relrisk-option.)
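The relrisk-options above can be combined in one request. The sketch below asks for score and Wald limits for the column 2 relative risk and keeps the "Odds Ratio and Relative Risks" table by adding PRINTALL. The TRIAL2X2 data set and its counts are hypothetical.

```sas
/* Hypothetical single 2 x 2 table: Treatment (rows) by Response (columns). */
data trial2x2;
   input Treatment $ Response $ Count;
   datalines;
Active  Yes 29
Active  No  11
Placebo Yes 18
Placebo No  22
;

proc freq data=trial2x2 order=data;
   weight Count;
   /* ALPHA=0.1 produces 90% limits; COLUMN=2 selects the column 2 relative risk. */
   tables Treatment*Response / relrisk(cl=(score wald) column=2 printall)
          alpha=0.1;
run;
```

Without PRINTALL, only the "Relative Risk Confidence Limits" table would be displayed for this request.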
RISKDIFF requests risks (binomial proportions) and risk differences for 2 x 2 tables. By default, this option provides the row 1 risk, row 2 risk, total (overall) risk, and risk difference (row 1 – row 2), together with their asymptotic standard errors and Wald confidence limits; by default, this option also provides exact (Clopper-Pearson) confidence limits for the row 1, row 2, and total risks. You can request exact unconditional confidence limits for the risk difference by specifying the RISKDIFF option in the EXACT statement. For more information, see the section Risks and Risk Differences. PROC FREQ displays these results in the column 1 and column 2 "Risk Estimates" tables.
You can specify riskdiff-options in parentheses after the RISKDIFF option to request tests and additional confidence limits for the risk difference, in addition to estimates of the common risk difference for multiway tables. Table 40.18 summarizes the riskdiff-options.
The CL= riskdiff-option requests confidence limits for the risk difference. Available confidence limit types include Agresti-Caffo, exact unconditional, Hauck-Anderson, Miettinen-Nurminen (score), Newcombe, and Wald. Continuity-corrected Newcombe and Wald confidence limits are also available. You can request more than one type of confidence limits in the same analysis. PROC FREQ displays the confidence limits in the "Proportion (Risk) Difference Confidence Limits" table.
The CL=EXACT riskdiff-option displays exact unconditional confidence limits in the "Proportion (Risk) Difference Confidence Limits" table. When you specify CL=EXACT, you must also request computation of the exact confidence limits by specifying the RISKDIFF option in the EXACT statement.
The EQUIV, NONINF, and SUP riskdiff-options request tests of equivalence, noninferiority, and superiority, respectively, for the risk difference. Available test methods include Farrington-Manning (score), Hauck-Anderson, and Newcombe (hybrid-score), in addition to the Wald test.
As part of the noninferiority, superiority, and equivalence analyses, PROC FREQ provides null-based equivalence limits that have a confidence coefficient of 100(1 – 2α)% (Schuirmann, 1999). The ALPHA= option determines the confidence level; by default, ALPHA=0.05, which produces 90% equivalence limits for these analyses. For more information, see the sections Noninferiority Tests and Equivalence Tests.
Table 40.18: RISKDIFF (Proportion Difference) Options
Option | Description |
---|---|
COLUMN= | Specifies the risk column |
COMMON | Requests common risk difference |
CORRECT | Requests continuity correction |
NORISKS | Suppresses default risk tables |
Request Confidence Limits | |
CL=AC | Requests Agresti-Caffo confidence limits |
CL=EXACT | Displays exact confidence limits |
CL=HA | Requests Hauck-Anderson confidence limits |
CL=MN | Requests Miettinen-Nurminen confidence limits |
CL=NEWCOMBE | Requests Newcombe confidence limits |
CL=WALD | Requests Wald confidence limits |
Request Tests | |
EQUAL | Requests an equality test |
EQUIV | Requests an equivalence test |
MARGIN= | Specifies the test margin |
METHOD= | Specifies the test method |
NONINF | Requests a noninferiority test |
SUP | Requests a superiority test |
VAR= | Specifies the test variance |
You can specify the following riskdiff-options in parentheses after the RISKDIFF option:
CL= requests confidence limits for the risk difference. You can specify one or more types of confidence limits. When you specify only one type, you can omit the parentheses around the request. PROC FREQ displays the confidence limits in the "Proportion (Risk) Difference Confidence Limits" table.
The ALPHA= option determines the level of the confidence limits. By default, ALPHA=0.05, which produces 95% confidence limits for the risk difference.
You can specify the CL= riskdiff-option with or without requests for risk difference tests. The confidence limits that CL= produces do not depend on the tests that you request and do not use the value of the test margin (which you can specify in the MARGIN= riskdiff-option).
You can control the risk column for the confidence limits by specifying the COLUMN= riskdiff-option. If you do not specify COLUMN=, by default PROC FREQ provides confidence limits for the column 1 risk difference.
You can specify the following types:
AC requests Agresti-Caffo confidence limits for the risk difference. For more information, see the subsection "Agresti-Caffo Confidence Limits" in the section Risk Difference Confidence Limits.
EXACT displays exact unconditional confidence limits for the risk difference in the "Proportion (Risk) Difference Confidence Limits" table. You must also request computation of the exact confidence limits by specifying the RISKDIFF option in the EXACT statement.
PROC FREQ computes the confidence limits by inverting two separate one-sided exact tests (tail method). By default, the tests are based on the unstandardized risk difference. If you specify the RISKDIFF(METHOD=SCORE) option in the EXACT statement, the tests are based on the score statistic. For more information, see the RISKDIFF option in the EXACT statement and the section Exact Unconditional Confidence Limits for the Risk Difference.
By default, PROC FREQ also displays these exact confidence limits in the "Risk Estimates" table. You can suppress this table by specifying the NORISKS riskdiff-option.
HA requests Hauck-Anderson confidence limits for the risk difference. For more information, see the subsection "Hauck-Anderson Confidence Limits" in the section Risk Difference Confidence Limits.
MN requests Miettinen-Nurminen (score) confidence limits for the risk difference. For more information, see the subsection "Miettinen-Nurminen (Score) Confidence Limits" in the section Risk Difference Confidence Limits. By default, the Miettinen-Nurminen confidence limits include a bias correction factor (Miettinen and Nurminen, 1985; Newcombe and Nurminen, 2011). If you specify CL=MN(CORRECT=NO), PROC FREQ provides the uncorrected form of the confidence limits (Mee, 1984).
NEWCOMBE requests Newcombe hybrid-score confidence limits for the risk difference. If you specify CL=NEWCOMBE(CORRECT) or the CORRECT riskdiff-option, the Newcombe confidence limits include a continuity correction. For more information, see the subsection "Newcombe Confidence Limits" in the section Risk Difference Confidence Limits.
WALD requests Wald confidence limits for the risk difference. If you specify CL=WALD(CORRECT) or the CORRECT riskdiff-option, the Wald confidence limits include a continuity correction. For more information, see the subsection "Wald Confidence Limits" in the section Risk Difference Confidence Limits.
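Several confidence limit types can be requested in a single CL= list. The following sketch, using a hypothetical TRIAL2X2 data set, requests Agresti-Caffo, uncorrected Miettinen-Nurminen, continuity-corrected Newcombe, and Wald limits, and suppresses the default "Risk Estimates" tables with NORISKS.

```sas
/* Hypothetical single 2 x 2 table: Treatment (rows) by Response (columns). */
data trial2x2;
   input Treatment $ Response $ Count;
   datalines;
Active  Yes 29
Active  No  11
Placebo Yes 18
Placebo No  22
;

proc freq data=trial2x2 order=data;
   weight Count;
   /* Limits are for the column 1 risk difference because COLUMN= is omitted. */
   tables Treatment*Response /
          riskdiff(cl=(ac mn(correct=no) newcombe(correct) wald) norisks);
run;
```

All requested limits appear together in the "Proportion (Risk) Difference Confidence Limits" table.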
COLUMN= specifies the table column for which to compute the risk difference tests (EQUAL, EQUIV, NONINF, and SUP) and the risk difference confidence limits (which you request by specifying the CL= riskdiff-option). By default, COLUMN=1.
This option has no effect on the "Risk Estimates" table, which is produced for both column 1 and column 2. You can suppress the "Risk Estimates" table by specifying the NORISKS riskdiff-option.
COMMON requests estimates of the common (overall) risk difference for multiway 2 x 2 tables. PROC FREQ produces Mantel-Haenszel and summary score estimates for the common risk difference, together with their confidence limits. If you specify the RISKDIFF(CL=NEWCOMBE) option, PROC FREQ also provides Newcombe confidence limits for the common risk difference. For more information, see the section Common Risk Difference.
If you do not specify the COLUMN= riskdiff-option, PROC FREQ provides the common risk difference for column 1 by default. If you specify COLUMN=2, PROC FREQ provides the common risk difference for column 2. COLUMN=BOTH does not apply to the common risk difference.
includes a continuity correction in the Wald confidence limits, Wald tests, and Newcombe confidence limits. For more information, see the section Risks and Risk Differences.
requests a test of the null hypothesis that the risk difference equals zero. This option provides an asymptotic Wald test of equality. If you specify the CORRECT riskdiff-option, the Wald test includes a continuity correction. If you specify the VAR=NULL riskdiff-option, the test uses the null-based variance instead of the sample variance. For more information, see the section Equality Test.
requests a test of equivalence for the risk difference. For more information, see the section Equivalence Tests. You can specify the test method in the METHOD= riskdiff-option, and you can specify the margins in the MARGIN= riskdiff-option. By default, METHOD=WALD and MARGIN=0.2.
specifies the margin for the noninferiority, superiority, and equivalence tests, which you request by specifying the NONINF, SUP, and EQUIV riskdiff-options, respectively. By default, MARGIN=0.2.
For noninferiority and superiority tests, specify a single positive value in the MARGIN= option. You can specify value in proportion form as a number between 0 and 1, or in percentage form as a number between 1 and 100, which PROC FREQ converts to a proportion. PROC FREQ treats the value 1 as 1%.
For an equivalence test, you can specify a single MARGIN= value, or you can specify both lower and upper values. A single value must be a positive number, as described previously; PROC FREQ uses –value as the lower margin and value as the upper margin for the test. If you specify both lower and upper values, you can specify them in proportion form as numbers between –1 and 1, or in percentage form as numbers between –100 and 100, which PROC FREQ converts to proportions. The value of lower must be less than the value of upper.
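The margin forms described above might be specified as in the following sketch. The data set Trial, its variables, and the margin values are hypothetical choices for illustration.

```sas
/* Hypothetical 2x2 data; the counts are illustrative only */
data Trial;
   input Treatment $ Response $ Count;
   datalines;
Active  Yes 40
Active  No  10
Placebo Yes 30
Placebo No  20
;

proc freq data=Trial order=data;
   /* Noninferiority margin in proportion form */
   tables Treatment*Response / riskdiff(noninf margin=0.1);
   /* The same margin in percentage form (10 is converted to 0.10) */
   tables Treatment*Response / riskdiff(noninf margin=10);
   /* Equivalence with a single value: margins are -0.1 and 0.1 */
   tables Treatment*Response / riskdiff(equiv margin=0.1);
   /* Equivalence with separate lower and upper margins */
   tables Treatment*Response / riskdiff(equiv margin=(-0.1, 0.15));
   weight Count;
run;
```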
specifies the method for the noninferiority, superiority, and equivalence analyses, which you request by specifying the NONINF, SUP, and EQUIV riskdiff-options, respectively. By default, METHOD=WALD.
You can specify the following methods:
requests Farrington-Manning (score) tests and confidence limits for the equivalence, noninferiority, and superiority analyses. For more information, see the subsection "Farrington-Manning (Score) Test" in the section Noninferiority Tests.
requests Hauck-Anderson tests and confidence limits for the equivalence, noninferiority, and superiority analyses. For more information, see the subsection "Hauck-Anderson Test" in the section Noninferiority Tests.
requests Newcombe (hybrid-score) confidence limits for the equivalence, noninferiority, and superiority analyses. If you specify the CORRECT riskdiff-option, the Newcombe confidence limits include a continuity correction. For more information, see the subsection "Newcombe Noninferiority Analysis" in the section Noninferiority Tests.
requests Wald tests and confidence limits for the equivalence, noninferiority, and superiority analyses. If you specify the CORRECT riskdiff-option, the Wald tests and confidence limits include a continuity correction. If you specify the VAR=NULL riskdiff-option, the tests use the null (test-based) variance instead of the sample variance. For more information, see the subsection "Wald Test" in the section Noninferiority Tests.
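For example, the following sketch selects a non-default method for a noninferiority analysis and for an equivalence analysis. The data set Trial, its variables, and the margin of 0.1 are hypothetical.

```sas
/* Hypothetical 2x2 data; the counts are illustrative only */
data Trial;
   input Treatment $ Response $ Count;
   datalines;
Active  Yes 40
Active  No  10
Placebo Yes 30
Placebo No  20
;

proc freq data=Trial order=data;
   /* Farrington-Manning (score) noninferiority test */
   tables Treatment*Response / riskdiff(noninf method=fm margin=0.1);
   /* Hauck-Anderson equivalence test */
   tables Treatment*Response / riskdiff(equiv method=ha margin=0.1);
   weight Count;
run;
```

If METHOD= is omitted, both requests use the Wald method by default.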
requests a test of noninferiority for the risk difference. For more information, see the section Noninferiority Tests. You can specify the test method in the METHOD= riskdiff-option, and you can specify the margin in the MARGIN= riskdiff-option. By default, METHOD=WALD and MARGIN=0.2.
suppresses display of the "Risk Estimates" tables, which the RISKDIFF option produces by default for column 1 and column 2. The "Risk Estimates" tables contain the risks and risk differences, together with their asymptotic standard errors, Wald confidence limits, and exact confidence limits.
requests a test of superiority for the risk difference. For more information, see the section Superiority Test. You can specify the test method in the METHOD= riskdiff-option, and you can specify the margin in the MARGIN= riskdiff-option. By default, METHOD=WALD and MARGIN=0.2.
specifies the type of variance to use in the Wald tests of noninferiority, superiority, equivalence, and equality. If you specify VAR=SAMPLE, PROC FREQ uses the sample variance. If you specify VAR=NULL, PROC FREQ uses a test-based variance that is computed by using the null hypothesis value of the risk difference. For more information, see the sections Equality Test and Noninferiority Tests. The default is VAR=SAMPLE.
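A short sketch of the VAR= riskdiff-option with Wald tests follows. The data set Trial, its variables, and the margin are hypothetical.

```sas
/* Hypothetical 2x2 data; the counts are illustrative only */
data Trial;
   input Treatment $ Response $ Count;
   datalines;
Active  Yes 40
Active  No  10
Placebo Yes 30
Placebo No  20
;

proc freq data=Trial order=data;
   /* Wald equality test that uses the null-based variance */
   tables Treatment*Response / riskdiff(equal var=null);
   /* Wald noninferiority test (default method), null-based variance */
   tables Treatment*Response / riskdiff(noninf margin=0.1 var=null);
   weight Count;
run;
```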
specifies the type of row and column scores that PROC FREQ uses to compute the following statistics: Mantel-Haenszel chi-square, Pearson correlation, Cochran-Armitage test for trend, weighted kappa coefficient, and Cochran-Mantel-Haenszel statistics. The value of type can be one of the following:
MODRIDIT
RANK
RIDIT
TABLE
See the section Scores for descriptions of these score types.
If you do not specify the SCORES= option, PROC FREQ uses SCORES=TABLE by default. For character variables, the row and column TABLE scores are the row and column numbers. That is, the TABLE score is 1 for row 1, 2 for row 2, and so on. For numeric variables, the row and column TABLE scores equal the variable values. See the section Scores for details. Using MODRIDIT, RANK, or RIDIT scores yields nonparametric analyses.
You can use the SCOROUT option to display the row and column scores.
displays the row and column scores that PROC FREQ uses to compute score-based tests and statistics. You can specify the score type with the SCORES= option. See the section Scores for details.
The scores are computed and displayed only when PROC FREQ computes statistics for two-way tables. You can use ODS to store the scores in an output data set. See the section ODS Table Names for more information.
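The following sketch requests rank scores and displays them. The data set DoseResp, its variables, and the counts are hypothetical. The CHISQ option is included so that PROC FREQ computes a score-based statistic (the Mantel-Haenszel chi-square) for the two-way table.

```sas
/* Hypothetical ordinal data: three dose levels by a binary response */
data DoseResp;
   input Dose Response $ Count;
   datalines;
1 Yes  5
1 No  20
2 Yes 10
2 No  15
3 Yes 16
3 No   9
;

proc freq data=DoseResp order=data;
   /* SCORES=RANK replaces the default TABLE scores; */
   /* SCOROUT displays the row and column scores     */
   tables Dose*Response / chisq scores=rank scorout;
   weight Count;
run;
```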
reports all possible combinations of the variable values for an n-way table when n > 1, even if a combination does not occur in the data. The SPARSE option applies only to crosstabulation tables displayed in LIST format and to the OUT= output data set. If you do not use the LIST or OUT= option, the SPARSE option has no effect.
When you specify the SPARSE and LIST options, PROC FREQ displays all combinations of variable values in the table listing, including those with a frequency count of zero. By default, without the SPARSE option, PROC FREQ does not display zero-frequency levels in LIST output. When you use the SPARSE and OUT= options, PROC FREQ includes empty crosstabulation table cells in the output data set. By default, PROC FREQ does not include zero-frequency table cells in the output data set.
See the section Missing Values for more information.
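The effect of SPARSE can be seen with data in which one combination of levels never occurs. In the following sketch, the data set Visits, its variables, and the output data set name Cells are hypothetical; the South/Poor combination is deliberately absent from the input.

```sas
/* Hypothetical data: the South/Poor combination does not occur */
data Visits;
   input Site $ Outcome $ Count;
   datalines;
North Good 12
North Poor  3
South Good  9
;

proc freq data=Visits;
   /* LIST output that includes the zero-frequency combination */
   tables Site*Outcome / list sparse;
   /* Output data set that includes the empty table cell */
   tables Site*Outcome / sparse out=Cells noprint;
   weight Count;
run;

proc print data=Cells;
run;
```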
displays the percentage of the total multiway table frequency in crosstabulation tables for n-way tables, where n > 2. By default, PROC FREQ displays the percentage of the individual two-way table frequency but does not display the percentage of the total frequency for multiway crosstabulation tables. See the section Two-Way and Multiway Tables for more information.
The percentage of total multiway table frequency is displayed by default when you specify the LIST option. It is also provided by default in the PERCENT variable in the OUT= output data set.
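A minimal sketch of a TOTPCT request for a three-way table follows. The data set MultiCenter, its variables, and the counts are hypothetical.

```sas
/* Hypothetical stratified data: two centers, each a 2x2 table */
data MultiCenter;
   input Center $ Treatment $ Response $ Count;
   datalines;
A Active  Yes 20
A Active  No   5
A Placebo Yes 15
A Placebo No  10
B Active  Yes 18
B Active  No   7
B Placebo Yes 12
B Placebo No  13
;

proc freq data=MultiCenter order=data;
   /* Adds the percentage of the total three-way frequency */
   /* to each crosstabulation table cell                   */
   tables Center*Treatment*Response / totpct;
   weight Count;
run;
```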
requests the Cochran-Armitage test for trend. The table must be 2 × C or R × 2 to compute the trend test. See the section Cochran-Armitage Test for Trend for details. To request exact p-values for the trend test, specify the TREND option in the EXACT statement. See the section Exact Statistics for more information.
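As an illustration, the following sketch requests the trend test for a 3 × 2 table, together with exact p-values. The data set DoseResp, its variables, and the counts are hypothetical.

```sas
/* Hypothetical data: three ordered dose levels by a binary response */
data DoseResp;
   input Dose Response $ Count;
   datalines;
1 Yes  5
1 No  20
2 Yes 10
2 No  15
3 Yes 16
3 No   9
;

proc freq data=DoseResp order=data;
   /* R x 2 table (R = 3), so the trend test can be computed */
   tables Dose*Response / trend;
   /* Exact p-values for the trend test */
   exact trend;
   weight Count;
run;
```

Because Dose is numeric, the default TABLE scores used by the test equal the Dose values.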