This example shows how to use PROC IRT to fit an item response theory (IRT) model by using all the default settings. In this example, 100 subjects each respond to 10 items. The responses are binary: 1 indicates a correct response and 0 indicates an incorrect response.
The following DATA step creates the SAS data set IrtBinary:
data IrtBinary;
   input item1-item10 @@;
   datalines;
1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1
0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 1 0 1
1 1 1 1 1 1 1 1 1 1

   ... more lines ...

1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1
;
The following statements fit an IRT model:
proc irt data=IrtBinary;
   var item1-item10;
run;
The ODS GRAPHICS ON statement, when you submit it before the PROC IRT step, enables ODS Graphics so that the procedure can display plots, such as the item characteristic curve plot. For more information about ODS, see Chapter 21: Statistical Graphics Using ODS.
The PROC IRT statement invokes the procedure, and the DATA= option specifies the input data set IrtBinary. The VAR statement names the variables to be used in the model. As the syntax in this example shows, fitting an IRT model can be very simple when you use the default settings. These default settings are chosen to reflect common practice.
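Because the plots mentioned earlier depend on ODS Graphics being enabled, the following variation is a minimal sketch of how you might request item characteristic curves explicitly. It assumes that the PLOTS=ICC option is available in your release of PROC IRT; the rest of the step is unchanged from the example.

```sas
/* Sketch only: enable ODS Graphics, then request ICC plots.   */
/* PLOTS=ICC is assumed to be supported by your SAS release.   */
ods graphics on;

proc irt data=IrtBinary plots=icc;
   var item1-item10;
run;

ods graphics off;
```

The tabular output should be the same as that of the default step; only the graphical output is added.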
Some of the important default settings follow:
The number of factors is 1.
The two-parameter model is assumed for binary variables, and the graded response model is assumed for ordinal variables.
The link function is the logistic link.
The estimation method is based on marginal likelihood.
The optimization method is the quasi-Newton algorithm.
The quadrature method is adaptive Gauss-Hermite quadrature, in which the number of quadrature points per dimension is determined adaptively.
As a result, the preceding statements fit two-parameter logistic (2PL) models for all the variables that are listed in the VAR statement.
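To make the defaults in the preceding list concrete, the following sketch spells several of them out as explicit options. It is an illustration under assumptions rather than a verified equivalent: the option names LINK=, NFACTOR=, TECHNIQUE=, and the RESFUNC= option in the MODEL statement are taken to be available in your release, and the labels in the resulting tables might differ from those of the default run.

```sas
/* Sketch: the default analysis with several defaults written out.  */
/* Assumed options: LINK=, NFACTOR=, TECHNIQUE=, and MODEL/RESFUNC=. */
proc irt data=IrtBinary
         link=logit          /* logistic link                        */
         nfactor=1           /* one factor                           */
         technique=quanew;   /* quasi-Newton optimization            */
   var item1-item10;
   model item1-item10 / resfunc=twop;   /* two-parameter model       */
run;
```

Writing the defaults explicitly can help when you later change a single setting, because each choice is then visible in the program.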
The first table that PROC IRT produces is the “Modeling Information” table, as shown in Figure 51.1. This table displays basic information about the analysis, such as the name of the input data set, link function, number of items and factors, number of observations, and estimation method.
Figure 51.1: Model Information
Modeling Information | |
---|---|
Data Set | WORK.IRTBINARY |
Link Function | Logit |
Response Model | Graded Response Model |
Number of Items | 10 |
Number of Factors | 1 |
Number of Observations Read | 100 |
Number of Observations Used | 100 |
Estimation Method | Marginal Maximum Likelihood |
The “Item Information” table, shown in Figure 51.2, is displayed by default and can be used to check the item-level information. In this case, all 10 variables have two levels, and the raw values for these two levels are 0 and 1, respectively.
Figure 51.2: Item Information
Item Information | ||
---|---|---|
Item | Levels | Values |
item1 | 2 | 0 1 |
item2 | 2 | 0 1 |
item3 | 2 | 0 1 |
item4 | 2 | 0 1 |
item5 | 2 | 0 1 |
item6 | 2 | 0 1 |
item7 | 2 | 0 1 |
item8 | 2 | 0 1 |
item9 | 2 | 0 1 |
item10 | 2 | 0 1 |
The eigenvalues of polychoric correlations are also computed by default and are shown in Figure 51.3. You can use the information from these eigenvalues to assess a reasonable range for the number of factors.
Figure 51.3: Eigenvalues of Polychoric Correlation
Eigenvalues of the Polychoric Correlation Matrix | ||||
---|---|---|---|---|
| | Eigenvalue | Difference | Proportion | Cumulative |
1 | 4.71105177 | 3.51149453 | 0.4711 | 0.4711 |
2 | 1.19955723 | 0.14183502 | 0.1200 | 0.5911 |
3 | 1.05772221 | 0.26577735 | 0.1058 | 0.6968 |
4 | 0.79194486 | 0.07204549 | 0.0792 | 0.7760 |
5 | 0.71989938 | 0.17782491 | 0.0720 | 0.8480 |
6 | 0.54207446 | 0.12713664 | 0.0542 | 0.9022 |
7 | 0.41493782 | 0.10631770 | 0.0415 | 0.9437 |
8 | 0.30862012 | 0.12256183 | 0.0309 | 0.9746 |
9 | 0.18605829 | 0.11792444 | 0.0186 | 0.9932 |
10 | 0.06813385 | | 0.0068 | 1.0000 |
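The columns of this table are related by simple arithmetic. The following worked example uses the standard fact that the eigenvalues of a correlation matrix sum to its dimension (here 10, the number of items). The symbol $\lambda_k$ for the $k$th eigenvalue is introduced here only for illustration.

```latex
\begin{align*}
\text{Difference}_1 &= \lambda_1 - \lambda_2 = 4.71105177 - 1.19955723 \approx 3.5115 \\
\text{Proportion}_k &= \frac{\lambda_k}{\sum_{j=1}^{10} \lambda_j} = \frac{\lambda_k}{10},
  \qquad \text{Proportion}_1 = \frac{4.71105177}{10} \approx 0.4711 \\
\text{Cumulative}_2 &= \text{Proportion}_1 + \text{Proportion}_2 \approx 0.4711 + 0.1200 = 0.5911
\end{align*}
```

The first eigenvalue therefore accounts for about 47% of the total, and the next two are only slightly greater than 1.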
Next, the “Optimization Information” table, shown in Figure 51.4, lists the optimization technique, numeric quadrature method, and number of quadrature points per dimension.
Figure 51.4: Optimization Information
Optimization Information | |
---|---|
Optimization Technique | Quasi-Newton |
Likelihood Approximation | Adaptive Gauss-Hermite Quadrature |
Number of Quadrature Points | 19 |
Number of Free Parameters | 20 |
Because the estimation of IRT models can be slow, the “Iteration History” table, shown in Figure 51.5, is also included by default. It is updated after each iteration. For each iteration, the table displays the current iteration number, the number of function evaluations, the objective function value, the change in the objective function value, and the maximum gradient value. You can use this information to monitor the estimation status of the model. You can turn off the display of the “Iteration History” table by specifying the NOITPRINT option in the PROC IRT statement.
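For example, the following step is a minimal sketch of the same analysis with the iteration history suppressed. Only the NOITPRINT option is added to the statements shown earlier.

```sas
/* Same default analysis, without the "Iteration History" table */
proc irt data=IrtBinary noitprint;
   var item1-item10;
run;
```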
Figure 51.5: Iteration History
Iteration History | ||||
---|---|---|---|---|
Iteration | Evaluations | Objective Function | Change | Max Gradient |
0 | 2 | 5.53317407 | 5.53317407 | 0.055021 |
1 | 4 | 5.43362050 | -0.09955356 | 0.017514 |
2 | 6 | 5.41516532 | -0.01845518 | 0.017074 |
3 | 8 | 5.40358346 | -0.01158186 | 0.007844 |
4 | 10 | 5.40165205 | -0.00193141 | 0.007266 |
5 | 12 | 5.40105183 | -0.00060022 | 0.003494 |
6 | 15 | 5.40087699 | -0.00017484 | 0.001484 |
7 | 18 | 5.40081315 | -0.00006385 | 0.000862 |
8 | 21 | 5.40078736 | -0.00002579 | 0.000806 |
9 | 24 | 5.40077714 | -0.00001021 | 0.000441 |
10 | 27 | 5.40077058 | -0.00000656 | 0.000141 |
11 | 30 | 5.40076985 | -0.00000073 | 0.000157 |
12 | 32 | 5.40076956 | -0.00000030 | 0.000202 |
13 | 34 | 5.40076913 | -0.00000042 | 0.000033 |
14 | 37 | 5.40076910 | -0.00000003 | 0.000024 |
15 | 40 | 5.40076909 | -0.00000001 | 0.00001 |
Following the “Iteration History” table is the convergence status table, shown in Figure 51.6. It shows whether the optimization algorithm converged successfully.
Next is the “Model Fit Statistics” table, shown in Figure 51.7, which includes the log likelihood, Akaike’s information criterion (AIC), the Bayesian information criterion (BIC), Pearson’s chi-square, and the likelihood ratio.
Figure 51.7: Fit Statistics
Model Fit Statistics | |
---|---|
Log Likelihood | -540.0769091 |
AIC (Smaller is Better) | 1120.1538182 |
BIC (Smaller is Better) | 1172.257222 |
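The AIC and BIC values in this table can be reproduced from the log likelihood by using the usual definitions. In the following worked check, $k = 20$ is the number of free parameters from Figure 51.4 and $N = 100$ is the number of observations used from Figure 51.1; the symbols $k$, $N$, and $\log L$ are introduced here for illustration.

```latex
\begin{align*}
-2 \log L &= -2 \times (-540.0769091) = 1080.1538182 \\
\text{AIC} &= -2 \log L + 2k = 1080.1538182 + 40 = 1120.1538182 \\
\text{BIC} &= -2 \log L + k \ln N \approx 1080.1538182 + 20 \times 4.6051702 \approx 1172.2572
\end{align*}
```

The final objective function value in Figure 51.5 (5.40076909) also matches $-\log L / N = 540.0769091 / 100$, which suggests that the objective function is the negative log likelihood scaled by the number of observations.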
Likelihood Ratio | 300.3475528 |
Finally, the “Item Parameter Estimates” table, shown in Figure 51.8, includes parameter estimates, standard errors, and p-values. Parameters are organized and displayed within each item. The items are listed in the order of their appearance in the modeling statements.
Figure 51.8: Parameter Estimates
Item Parameter Estimates | ||||
---|---|---|---|---|
Item | Parameter | Estimate | Standard Error | Pr > \|t\| |
item1 | Threshold | -1.92246 | 0.53399 | 0.0002 |
| | Slope | 2.22071 | 0.68479 | 0.0006 |
item2 | Threshold | -2.37769 | 0.64464 | 0.0001 |
| | Slope | 2.33987 | 0.76209 | 0.0011 |
item3 | Threshold | -1.99304 | 0.54014 | 0.0001 |
| | Slope | 2.18731 | 0.68382 | 0.0007 |
item4 | Threshold | -1.73061 | 0.45433 | <.0001 |
| | Slope | 1.87329 | 0.57549 | 0.0006 |
item5 | Threshold | -1.46288 | 0.35236 | <.0001 |
| | Slope | 1.33942 | 0.42772 | 0.0009 |
item6 | Threshold | -0.57445 | 0.26843 | 0.0162 |
| | Slope | 1.17667 | 0.37105 | 0.0008 |
item7 | Threshold | -0.58404 | 0.25017 | 0.0098 |
| | Slope | 0.94542 | 0.32647 | 0.0019 |
item8 | Threshold | -0.48604 | 0.24737 | 0.0247 |
| | Slope | 0.95660 | 0.32981 | 0.0019 |
item9 | Threshold | -0.45940 | 0.25892 | 0.0380 |
| | Slope | 1.11730 | 0.35663 | 0.0009 |
item10 | Threshold | -0.65574 | 0.26191 | 0.0061 |
| | Slope | 1.05212 | 0.34962 | 0.0013 |
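To see what these estimates imply, the following DATA step is a small sketch that evaluates the fitted response curve for item1 at a few values of the latent factor. It assumes the slope-threshold parameterization in which the logit of the probability of a correct response equals slope × theta − threshold; check the parameterization that your release documents before relying on the numbers. The data set name IccItem1 and the variable names are hypothetical.

```sas
/* Sketch: probability of a correct response to item1 under the      */
/* assumed parameterization logit(P) = slope*theta - threshold.      */
data IccItem1;
   slope     =  2.22071;   /* item1 Slope from Figure 51.8           */
   threshold = -1.92246;   /* item1 Threshold from Figure 51.8       */
   do theta = -3 to 3 by 1;
      prob = 1 / (1 + exp(-(slope*theta - threshold)));
      output;
   end;
run;

proc print data=IccItem1 noobs;
   var theta prob;
run;
```

Under this assumption, a negative threshold with a positive slope means that a subject at theta = 0 is more likely than not to answer the item correctly, which is consistent with the predominance of 1s in the data shown earlier.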