The IRT Procedure

Getting Started: IRT Procedure

This example shows how you can use all default settings in PROC IRT to fit an item response model. In this example, there are 100 subjects, and each subject responds to 10 items. These 10 items have binary responses: 1 indicates correct and 0 indicates incorrect.

The following DATA step creates the SAS data set IrtBinary:

data IrtBinary;
   input item1-item10 @@;
   datalines;
1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1
0 0 0 0 0 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1

   ... more lines ...   

1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 
; 

The following statements fit an IRT model:

proc irt data=IrtBinary;
   var item1-item10;
run;

The PROC IRT statement invokes the procedure, and the DATA= option specifies the input data set IrtBinary. The VAR statement names the variables to be used in the model. As you can see from the syntax in this example, fitting an IRT model can be very simple when you use the default settings. These default settings are chosen to reflect setups that are common in practice. Some of the important default settings follow:

  • The number of factors is 1.

  • The two-parameter logistic model is assumed for binary variables, and the graded response model is assumed for ordinal variables.

  • The link function is the logit link.

  • The estimation method is marginal maximum likelihood.

  • The optimization method is the quasi-Newton algorithm.

  • The quadrature method is adaptive Gauss-Hermite quadrature, in which the number of quadrature points per dimension is determined adaptively.

As a result, the preceding statements fit two-parameter logistic (2PL) models for all the variables that are listed in the VAR statement.
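To make these defaults concrete, the following sketch spells them out explicitly. It assumes the keywords LOGIT and TWOP for the logit link and the two-parameter model; it should reproduce the same fit as the two statements shown above.

/* Sketch: the default settings written out explicitly              */
/* (LOGIT, TWOP, and NFACTOR=1 assumed to match the defaults above) */
proc irt data=IrtBinary link=logit resfunc=twop nfactor=1;
   var item1-item10;
run;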

The first table that PROC IRT produces is the "Modeling Information" table, as shown in Figure 53.1. This table displays basic information about the analysis, such as the name of the input data set, the link function, the number of items and factors, the number of observations, and the estimation method. You can change the link function by using the LINK= option in the PROC IRT statement. You can change the response model for all the items by using the RESFUNC= option in the PROC IRT statement. You can specify different response functions or models for different sets of variables by including a MODEL statement. If you want to do multidimensional exploratory analysis, you can simply change the number of factors by using the NFACTOR= option in the PROC IRT statement. For confirmatory analysis, you can use the FACTOR statement to specify the confirmatory factor pattern; the number of factors is implicitly defined by the number of distinct factor names that you specify in the FACTOR statement.
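For example, the following sketch (the item groupings and factor names are hypothetical, and the RASCH keyword is assumed for a one-parameter model) uses a MODEL statement to assign different response functions to different sets of items and a FACTOR statement to define a two-factor confirmatory pattern:

/* Hypothetical groupings: mixed response functions via the MODEL statement */
proc irt data=IrtBinary;
   model item1-item5  / resfunc=rasch,
         item6-item10 / resfunc=twop;
run;

/* Hypothetical two-factor confirmatory pattern via the FACTOR statement */
proc irt data=IrtBinary;
   var item1-item10;
   factor Trait1 -> item1-item5,
          Trait2 -> item6-item10;
run;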

Figure 53.1: Model Information

The IRT Procedure

Modeling Information
Data Set WORK.IRTBINARY
Link Function Logit
Response Model Two Parameter Model
Number of Items 10
Number of Factors 1
Number of Observations Read 100
Number of Observations Used 100
Estimation Method Marginal Maximum Likelihood



The "Item Information" table, shown in Figure 53.2, is displayed by default and can be used to check the item-level information. In this case, each of the 10 variables has two levels, and the raw values for these two levels are 0 and 1, respectively.

Figure 53.2: Item Information

Item Information
Item Levels Values
item1 2 0 1
item2 2 0 1
item3 2 0 1
item4 2 0 1
item5 2 0 1
item6 2 0 1
item7 2 0 1
item8 2 0 1
item9 2 0 1
item10 2 0 1



The eigenvalues of polychoric correlations are also computed by default and are shown in Figure 53.3. You can use the information from these eigenvalues to assess a reasonable range for the number of factors. For this example, you can observe that the first eigenvalue accounts for almost 50% of the variance, which suggests that there is only one dominant eigenvalue and that a unidimensional model is reasonable for this example. To produce the polychoric correlation table, you specify the POLYCHORIC option in the PROC IRT statement.
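A minimal sketch that adds this option to the default analysis:

/* Request the polychoric correlation table                      */
/* (the eigenvalue table is displayed by default in either case) */
proc irt data=IrtBinary polychoric;
   var item1-item10;
run;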

Figure 53.3: Eigenvalues of Polychoric Correlation

Eigenvalues of the Polychoric Correlation Matrix
  Eigenvalue Difference Proportion Cumulative
1 4.71105177 3.51149453 0.4711 0.4711
2 1.19955723 0.14183502 0.1200 0.5911
3 1.05772221 0.26577735 0.1058 0.6968
4 0.79194486 0.07204549 0.0792 0.7760
5 0.71989938 0.17782491 0.0720 0.8480
6 0.54207446 0.12713664 0.0542 0.9022
7 0.41493782 0.10631770 0.0415 0.9437
8 0.30862012 0.12256183 0.0309 0.9746
9 0.18605829 0.11792444 0.0186 0.9932
10 0.06813385   0.0068 1.0000



Next, the "Optimization Information" table, shown in Figure 53.4, lists the optimization technique, the numeric quadrature method, and the number of quadrature points per dimension. If you want to use the expectation-maximization (EM) technique, specify TECHNIQUE=EM in the PROC IRT statement. If you specify the NOAD option in the PROC IRT statement, PROC IRT uses the nonadaptive Gauss-Hermite quadrature to approximate the likelihood. You can change the number of quadrature points by specifying the QPOINTS= option in the PROC IRT statement.
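For example, the following sketch requests nonadaptive quadrature with a fixed number of points (the value 21 is an arbitrary illustrative choice); specifying TECHNIQUE=EM instead would switch from quasi-Newton optimization to the EM algorithm:

/* Nonadaptive Gauss-Hermite quadrature with 21 points per dimension */
/* (21 is an arbitrary illustrative value)                           */
proc irt data=IrtBinary noad qpoints=21;
   var item1-item10;
run;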

Figure 53.4: Optimization Information

Optimization Information
Optimization Technique Quasi-Newton
Likelihood Approximation Adaptive Gauss-Hermite Quadrature
Number of Quadrature Points 11
Number of Free Parameters 20



Figure 53.5 shows the "Iteration History" table. For each iteration, the table displays the current iteration number, the number of function evaluations, the objective function value, the change in the objective function value, and the maximum gradient value. You can use this information to monitor the estimation status of the model. You can turn off the display of the "Iteration History" table by specifying the NOITPRINT option in the PROC IRT statement.

Following the "Iteration History" table is the convergence status table, shown in Figure 53.6. It shows whether the optimization algorithm converged successfully. Make sure that the optimization converged successfully before you try to interpret the estimation results.
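If you do not need to monitor the iterations, the following minimal sketch suppresses the table:

/* Suppress the "Iteration History" table */
proc irt data=IrtBinary noitprint;
   var item1-item10;
run;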

Figure 53.5: Iteration History

Iteration History
Iteration Evaluations Objective Function Change Max Gradient
0 2 5.53319820 5.53319820 0.055036
1 4 5.43364722 -0.09955098 0.017483
2 6 5.41529856 -0.01834866 0.017061
3 8 5.40395721 -0.01134135 0.007767
4 10 5.40207771 -0.00187951 0.007106
5 12 5.40147004 -0.00060767 0.003499
6 15 5.40129723 -0.00017281 0.001441
7 18 5.40123492 -0.00006231 0.000867
8 21 5.40120879 -0.00002613 0.000798
9 24 5.40119848 -0.00001031 0.000448
10 27 5.40119212 -0.00000636 0.000145
11 30 5.40119134 -0.00000077 0.000159
12 33 5.40119081 -0.00000054 0.000081
13 36 5.40119060 -0.00000021 0.000045
14 39 5.40119057 -0.00000003 0.00002
15 42 5.40119057 -0.00000001 8.572E-6



Figure 53.6: Convergence Status

Convergence criterion (GCONV=.000000010) satisfied.



Next is the "Model Fit Statistics" table, shown in Figure 53.7, which includes the log likelihood, Akaike’s information criterion (AIC), and the Bayesian information criterion (BIC). If all the response patterns are observed, Pearson’s chi-square and likelihood ratio chi-square statistics are also included in this table. Because some of the response patterns in this example are not observed, these two statistics do not appear in the table.

Figure 53.7: Fit Statistics

Model Fit Statistics
Log Likelihood -540.1190565
AIC (Smaller is Better) 1120.238113
BIC (Smaller is Better) 1172.3415168
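As a quick check, the AIC and BIC values follow from the usual definitions, using the log likelihood above, p = 20 free parameters (Figure 53.4), and n = 100 observations:

AIC = -2 log L + 2p      = -2(-540.119) + 2(20)      = 1120.238
BIC = -2 log L + p ln(n) = -2(-540.119) + 20 ln(100) = 1172.342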



Finally, the "Item Parameter Estimates" table, shown in Figure 53.8, includes parameter estimates, standard errors, and p-values. Parameters are organized and displayed within each item. The items are listed in the order of their appearance in the modeling statements. For each item, there are two parameters: difficulty and slope. Difficulty parameters measure the difficulties of the items. As the value of the difficulty parameter increases, the item becomes more difficult. In Figure 53.8, you can observe that all the difficulty parameters are less than 0, which suggests that all the items in this example are relatively easy. The slope parameter values for this example range from 0.94 to 2.33, suggesting that all the items are adequate measures of the latent trait.

Figure 53.8: Parameter Estimates

Item Parameter Estimates
Item Parameter Estimate Standard Error Pr > |t|
item1 Difficulty -0.87121 0.20083 <.0001
  Slope 2.20624 0.67941 0.0006
item2 Difficulty -1.02199 0.21318 <.0001
  Slope 2.32649 0.76037 0.0011
item3 Difficulty -0.91668 0.20857 <.0001
  Slope 2.17452 0.68330 0.0007
item4 Difficulty -0.92919 0.22707 <.0001
  Slope 1.86354 0.57110 0.0006
item5 Difficulty -1.09791 0.30594 0.0002
  Slope 1.33344 0.42511 0.0009
item6 Difficulty -0.49151 0.24385 0.0219
  Slope 1.17140 0.36940 0.0008
item7 Difficulty -0.62129 0.30189 0.0198
  Slope 0.94209 0.32563 0.0019
item8 Difficulty -0.51111 0.28477 0.0363
  Slope 0.95367 0.32914 0.0019
item9 Difficulty -0.41404 0.24477 0.0454
  Slope 1.11314 0.35530 0.0009
item10 Difficulty -0.62670 0.27982 0.0126
  Slope 1.04867 0.34845 0.0013
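These labels correspond to the standard slope-difficulty parameterization of the two-parameter logistic model, in which the probability that a subject with latent trait value theta answers item j correctly is

P(correct) = 1 / (1 + exp(-a_j(theta - b_j)))

where a_j is the slope and b_j is the difficulty. For example, using the item1 estimates (slope 2.21, difficulty -0.87), a subject at the average trait level (theta = 0) answers item1 correctly with probability 1 / (1 + exp(-2.21 * 0.87)), or about 0.87.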