The IRT Procedure

Example 53.1 Unidimensional IRT Models

This example shows you the features that PROC IRT provides for unidimensional analysis. The data set comes from the 1978 Quality of American Life Survey. The survey was administered to a sample of all U.S. residents aged 18 years and older in 1978. In this survey, subjects were asked to rate their satisfaction with many different aspects of their lives. This example selects eight items. These items are designed to measure people’s satisfaction in the following areas on a seven-point scale: community, neighborhood, dwelling unit, life in the United States, amount of education received, own health, job, and how spare time is spent. For illustration purposes, the first five items are dichotomized and the last three items are collapsed into three levels.

The following DATA step creates the data set IrtUni.

data IrtUni;
   input item1-item8 @@;
   datalines;
1 0 0 0 1 1 2 1 1 1 1 1 1 3 3 3 0 1 0 0 1 1 1 1 1 0 0 1 0 1 2 3 0 0 0
0 0 1 1 1 1 0 0 1 0 1 3 3 0 0 0 0 0 1 1 3 0 0 1 0 0 1 2 2 0 1 0 0 1 1

   ... more lines ...   

3 3 0 1 0 0 1 2 2 1 
; 

Because all the items are designed to measure subjects’ satisfaction in different aspects of their lives, it is reasonable to start with a unidimensional IRT model. The following statements fit such a model by using several user-specified options:

ods graphics on;
proc irt data=IrtUni link=probit pinitial itemfit plots=ICC;
   var item1-item8;
   model item1-item4/resfunc=twop, item5-item8/resfunc=graded;
run;
ods graphics off;

The ODS GRAPHICS ON statement invokes the ODS Graphics environment and displays the plots, such as the item characteristic curve plot. For more information about ODS Graphics, see Chapter 21: Statistical Graphics Using ODS.

The first option is the LINK= option, which specifies that the link function be the probit link. Next, you request initial parameter estimates by using the PINITIAL option. Item fit statistics are displayed using the ITEMFIT option. In the PROC IRT statement, you can use the PLOTS option to request different plots. In this example, you request item characteristic curves by using the PLOTS=ICC option.

In this example, you use the MODEL statement to specify different response models for different items. The specifications in the MODEL statement suggest that the first four items, item1 to item4, are fitted using the two-parameter model, whereas the last four items, item5 to item8, are fitted using the graded response model.

Output 53.1.1 displays two tables. From the "Modeling Information" table, you can observe that the link function has changed from the default LOGIT link to the specified PROBIT link. The "Item Information" table shows that item1 to item5 each have two levels and item6 to item8 each have three levels. The last column shows the raw values of these different levels.

Output 53.1.1: Basic Information

The IRT Procedure

Modeling Information
Data Set WORK.IRTUNI
Link Function Probit
Number of Items 8
Number of Factors 1
Number of Observations Read 500
Number of Observations Used 500
Estimation Method Marginal Maximum Likelihood

Item Information
Response
Model
Item Levels Values
TwoP item1 2 0 1
  item2 2 0 1
  item3 2 0 1
  item4 2 0 1
Graded item5 2 0 1
  item6 3 1 2 3
  item7 3 1 2 3
  item8 3 1 2 3



PROC IRT produces the "Eigenvalues of the Polychoric Correlation Matrix" table in Output 53.1.2 by default. You can use these eigenvalues to assess the dimension of latent factors. For this example, the fact that only the first eigenvalue is greater than 1 suggests that a one-factor model for the items is reasonable.

Output 53.1.2: Eigenvalues of Polychoric Correlations

The IRT Procedure

Eigenvalues of the Polychoric Correlation Matrix
  Eigenvalue Difference Proportion Cumulative
1 3.11870486 2.12497677 0.3898 0.3898
2 0.99372809 0.10025986 0.1242 0.5141
3 0.89346823 0.03116998 0.1117 0.6257
4 0.86229826 0.10670185 0.1078 0.7335
5 0.75559640 0.17795713 0.0944 0.8280
6 0.57763928 0.10080017 0.0722 0.9002
7 0.47683911 0.15511333 0.0596 0.9598
8 0.32172578   0.0402 1.0000



The PINITIAL option in the PROC IRT statement displays the "Initial Item Parameter Estimates" table, shown in Output 53.1.3.

Output 53.1.3: Initial Parameter Estimates

The IRT Procedure

Initial Item Parameter Estimates
Response
Model
Item Parameter Estimate
TwoP item1 Difficulty 0.26428
    Slope 1.05346
  item2 Difficulty 0.58640
    Slope 0.93973
  item3 Difficulty 0.44607
    Slope 0.82826
  item4 Difficulty 0.50157
    Slope 0.50906
Graded item5 Threshold 1 -0.86792
    Slope 0.41380
  item6 Threshold 1 -0.59512
    Threshold 2 2.00678
    Slope 0.36063
  item7 Threshold 1 -0.90743
    Threshold 2 0.69335
    Slope 0.64191
  item8 Threshold 1 -1.18209
    Threshold 2 0.26959
    Slope 0.67591



Output 53.1.4 includes tables that are related to the optimization. The "Optimization Information" table shows that the log likelihood is approximated by using seven adaptive Gauss-Hermite quadrature points and then maximized by using the quasi-Newton algorithm. The number of free parameters in this example is 19. The "Iteration History" table shows the number of function evaluations, the objective function (–$\log $ likelihood divided by number of subjects) values, the objective function change, and the maximum gradient for each iteration. This information is very useful in monitoring the optimization status. Output 53.1.4 shows the convergence status at the bottom. The optimization converges according to the GCONV=0.00000001 criterion.

Output 53.1.4: Optimization Information

The IRT Procedure

Optimization Information
Optimization Technique Quasi-Newton
Likelihood Approximation Adaptive Gauss-Hermite Quadrature
Number of Quadrature Points 9
Number of Free Parameters 19

Iteration History
Iteration Evaluations Objective
Function
Change Max
Gradient
0 2 6.19423730 6.19423730 0.015501
1 5 6.19269781 -0.00153950 0.005787
2 8 6.19256587 -0.00013194 0.003813
3 10 6.19249856 -0.00006732 0.003278
4 12 6.19245371 -0.00004485 0.004635
5 15 6.19243639 -0.00001732 0.001284
6 18 6.19242942 -0.00000696 0.00049
7 21 6.19242884 -0.00000058 0.000191
8 24 6.19242871 -0.00000013 0.000104
9 27 6.19242867 -0.00000004 0.000051
10 30 6.19242867 -0.00000001 0.000011

Convergence criterion (GCONV=.000000010) satisfied.



Output 53.1.5 displays the model fit and item fit statistics. Note that the item fit statistics apply only to the binary items. That is why these fit statistics are missing for item6 to item8.

Output 53.1.5: Fit Statistics

The IRT Procedure

Model Fit Statistics
Log Likelihood -3096.214333
AIC (Smaller is Better) 6230.4286657
BIC (Smaller is Better) 6310.5062196

Item Fit Statistics
Response
Model
Item DF Pearson
Chi-Square
Pr > P ChiSq LR
Chi-Square
Pr > LR ChiSq
TwoP item1 8 34.16545 <.0001 49.39743 <.0001
  item2 8 30.34646 0.0002 37.52865 <.0001
  item3 8 27.54546 0.0006 36.34505 <.0001
  item4 8 22.75981 0.0037 26.13409 0.0010
Graded item5 8 18.32369 0.0189 19.68488 0.0116
  item6 . . . . .
  item7 . . . . .
  item8 . . . . .



The last table for this example is the "Item Parameter Estimates " table in Output 53.1.6. This table contains parameter estimates, standard errors, and p-values. These p-values suggest that all the parameters are significantly different from zero.

Output 53.1.6: Parameter Estimates

The IRT Procedure

Item Parameter Estimates
Response
Model
Item Parameter Estimate Standard
Error
Pr > |t|
TwoP item1 Difficulty 0.27342 0.08301 0.0005
    Slope 0.98381 0.14145 <.0001
  item2 Difficulty 0.60268 0.10047 <.0001
    Slope 0.90009 0.13112 <.0001
  item3 Difficulty 0.46110 0.10061 <.0001
    Slope 0.79521 0.11392 <.0001
  item4 Difficulty 0.50686 0.14410 0.0002
    Slope 0.50431 0.08567 <.0001
Graded item5 Threshold -0.79750 0.18708 <.0001
    Slope 0.45385 0.08238 <.0001
  item6 Threshold 1 -0.59139 0.19858 0.0014
    Threshold 2 2.02708 0.39593 <.0001
    Slope 0.35770 0.06777 <.0001
  item7 Threshold 1 -0.82130 0.12753 <.0001
    Threshold 2 0.64441 0.11431 <.0001
    Slope 0.72674 0.09312 <.0001
  item8 Threshold 1 -1.08124 0.14132 <.0001
    Threshold 2 0.25166 0.09535 0.0042
    Slope 0.76383 0.09753 <.0001



Item characteristic curves (ICC) are also produced in this example. By default, these ICC plots are displayed in panels. To display an individual ICC plot for each item, use the UNPACK suboption in the PLOTS= option in the PROC IRT statement.

Output 53.1.7: ICC Plots

ICC Plots


Now, suppose your research hypothesis includes some equality constraints on the model parameters—for example, the slopes for the first four items are equal. Such equality constraints can be specified easily by using the EQUALITY statement. In the following example, the slope parameters of the first four items are equal:

proc irt data=IrtUni;
   var item1-item8;
   model item1-item4/resfunc=twop, item5-item8/resfunc=graded;
   equality item1-item4/parm=[slope];
run;

To estimate the factor score for each subject and add these scores to the original data set, you can use the OUT= option in the PROC IRT statement. PROC IRT provides three factor score estimation methods: maximum likelihood (ML), maximum a posteriori (MAP), and expected a posteriori (EAP). You can choose an estimation method by using the SCOREMETHOD= option in the PROC IRT statement. The default method is maximum a posteriori. In the following, factor scores along with the original data are saved to a SAS data set called IrtUniFscore:

proc irt data=IrtUni out=IrtUniFscore;
   var item1-item8;
   model item1-item4/resfunc=twop,
         item5-item8/resfunc=graded;
   equality item1-item4/parm=[slope];
run;

Sometimes you might find it useful to sort the items based on the estimated difficulty or slope parameters. You can do this by outputting the ODS tables for the estimates into data sets and then sorting the items by using PROC SORT. A simulated data set is used to show the steps.

The following DATA step creates the data set IrtSimu:

data IrtSimu;
   input item1-item25 @@;
   datalines;
1 1 1 0 1 1 0 0 1 1 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1 1 1 1 1 1 0 1 1 0 0
0 0 0 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0 0 1 1 0 0 0 0 0 0
0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 1 0 0 0 1 0 1 1 0 1 1 0 0 1 1 0
1 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 0 1 0 1 1

   ... more lines ...   

1 1 0 1 1 1 1 1 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 1 1 0 0 1 0 1 0 1 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 1 1 1 1 1 
; 

First, you build the model and output the parameter estimates table into a SAS data set by using the ODS OUTPUT statement:

proc irt data=IrtSimu link=probit;
   var item1-item25;
   ods output ParameterEstimates=ParmEst;
run;

Output 53.1.8 shows the "Item Parameter Estimates" table. Notice that the difficulty and slope parameters are in the same column. The reason for this is to avoid having an extremely wide table when each item has a lot of parameters.

Output 53.1.8: Basic Information

The IRT Procedure

Item Parameter Estimates
Item Parameter Estimate Standard
Error
Pr > |t|
item1 Difficulty -1.32543 0.09769 <.0001
  Slope 1.44263 0.16091 <.0001
item2 Difficulty -0.99702 0.07431 <.0001
  Slope 1.82221 0.19006 <.0001
item3 Difficulty -1.24965 0.08962 <.0001
  Slope 1.58762 0.17495 <.0001
item4 Difficulty -1.09577 0.07727 <.0001
  Slope 1.86821 0.20452 <.0001
item5 Difficulty -1.07857 0.07784 <.0001
  Slope 1.78392 0.19080 <.0001
item6 Difficulty -0.95057 0.09377 <.0001
  Slope 1.04203 0.10284 <.0001
item7 Difficulty -0.65085 0.06918 <.0001
  Slope 1.45413 0.13473 <.0001
item8 Difficulty -0.76372 0.07583 <.0001
  Slope 1.30444 0.12230 <.0001
item9 Difficulty -0.72281 0.07030 <.0001
  Slope 1.50740 0.14242 <.0001
item10 Difficulty -0.50750 0.06091 <.0001
  Slope 1.82387 0.17155 <.0001
item11 Difficulty -0.01360 0.06433 0.4163
  Slope 1.26234 0.11287 <.0001
item12 Difficulty 0.04013 0.05537 0.2343
  Slope 2.02257 0.20099 <.0001
item13 Difficulty 0.16034 0.06845 0.0096
  Slope 1.13120 0.10199 <.0001
item14 Difficulty 0.01069 0.05625 0.4246
  Slope 1.89082 0.18128 <.0001
item15 Difficulty 0.07152 0.07328 0.1645
  Slope 0.96388 0.09050 <.0001
item16 Difficulty -0.81412 0.07905 <.0001
  Slope 1.25375 0.11885 <.0001
item17 Difficulty -0.92046 0.09289 <.0001
  Slope 1.02930 0.10125 <.0001
item18 Difficulty -0.59407 0.06606 <.0001
  Slope 1.58443 0.14868 <.0001
item19 Difficulty -0.97598 0.09743 <.0001
  Slope 0.97979 0.09761 <.0001
item20 Difficulty -0.48859 0.05960 <.0001
  Slope 1.95710 0.18827 <.0001
item21 Difficulty -0.60655 0.06819 <.0001
  Slope 1.45324 0.13427 <.0001
item22 Difficulty -0.51263 0.06188 <.0001
  Slope 1.74472 0.16251 <.0001
item23 Difficulty -0.90928 0.08451 <.0001
  Slope 1.20281 0.11623 <.0001
item24 Difficulty -0.56515 0.06295 <.0001
  Slope 1.74437 0.16384 <.0001
item25 Difficulty -0.58905 0.06718 <.0001
  Slope 1.48957 0.13785 <.0001



Output 53.1.9: The Difficulty Parameter SAS Data Set

Obs Item Difficulty
1 item1 -1.32543
2 item2 -0.99702
3 item3 -1.24965
4 item4 -1.09577
5 item5 -1.07857
6 item6 -0.95057
7 item7 -0.65085
8 item8 -0.76372
9 item9 -0.72281
10 item10 -0.50750
11 item11 -0.01360
12 item12 0.04013
13 item13 0.16034
14 item14 0.01069
15 item15 0.07152
16 item16 -0.81412
17 item17 -0.92046
18 item18 -0.59407
19 item19 -0.97598
20 item20 -0.48859
21 item21 -0.60655
22 item22 -0.51263
23 item23 -0.90928
24 item24 -0.56515
25 item25 -0.58905



Then you save the estimates of slopes and difficulties in the data set ParmEst and create two separate data sets to store the difficulty and slope parameters:

data Diffs(keep=Item Difficulty);
   set ParmEst;
   Difficulty = Estimate;
   if (Parameter = "Difficulty") then output;
run;
proc print data=Diffs;
run;
data Slopes(keep=Item Slope);
   set ParmEst;
   Slope = Estimate;
   if (Parameter = "Slope") then output;
run;
proc print data=Slopes;
run;

The two SAS data sets are shown in Output 53.1.9 and Output 53.1.10.

Output 53.1.10: The Slope Parameter SAS Data Set

Obs Item Slope
1 item1 1.44263
2 item2 1.82221
3 item3 1.58762
4 item4 1.86821
5 item5 1.78392
6 item6 1.04203
7 item7 1.45413
8 item8 1.30444
9 item9 1.50740
10 item10 1.82387
11 item11 1.26234
12 item12 2.02257
13 item13 1.13120
14 item14 1.89082
15 item15 0.96388
16 item16 1.25375
17 item17 1.02930
18 item18 1.58443
19 item19 0.97979
20 item20 1.95710
21 item21 1.45324
22 item22 1.74472
23 item23 1.20281
24 item24 1.74437
25 item25 1.48957



Now you can use PROC SORT to sort the items by either difficulty or slope as follows:

proc sort data=Diffs;
   by Difficulty;
run;
proc print data=Diffs;
run;
proc sort data=Slopes;
   by Slope;
run;
proc print data=Slopes;
run;

Output 53.1.11 and Output 53.1.12 show the sorted data sets.

Output 53.1.11: Items Sorted by Difficulty

Obs Item Difficulty
1 item1 -1.32543
2 item3 -1.24965
3 item4 -1.09577
4 item5 -1.07857
5 item2 -0.99702
6 item19 -0.97598
7 item6 -0.95057
8 item17 -0.92046
9 item23 -0.90928
10 item16 -0.81412
11 item8 -0.76372
12 item9 -0.72281
13 item7 -0.65085
14 item21 -0.60655
15 item18 -0.59407
16 item25 -0.58905
17 item24 -0.56515
18 item22 -0.51263
19 item10 -0.50750
20 item20 -0.48859
21 item11 -0.01360
22 item14 0.01069
23 item12 0.04013
24 item15 0.07152
25 item13 0.16034



Output 53.1.12: Items Sorted by Slope

Obs Item Slope
1 item15 0.96388
2 item19 0.97979
3 item17 1.02930
4 item6 1.04203
5 item13 1.13120
6 item23 1.20281
7 item16 1.25375
8 item11 1.26234
9 item8 1.30444
10 item1 1.44263
11 item21 1.45324
12 item7 1.45413
13 item25 1.48957
14 item9 1.50740
15 item18 1.58443
16 item3 1.58762
17 item24 1.74437
18 item22 1.74472
19 item5 1.78392
20 item2 1.82221
21 item10 1.82387
22 item4 1.86821
23 item14 1.89082
24 item20 1.95710
25 item12 2.02257



Notice that the sorting does not work correctly if any of the items have more than one threshold (ordinal response) or slope (multidimensional model).

Now, suppose you want to group the items into subgroups based on their difficulty parameters and then sort the items in each subgroup by their slope parameters. First, you need to merge the two data sets, Diffs and Slopes, into one data set. Then, you add another variable, called DiffLevel, to indicate the subgroups. The following statements show these steps:

proc sort data=Slopes;
   by Item;
run;
proc sort data=Diffs;
   by Item;
run;
data ItemEst;
   merge Diffs Slopes;
   by Item;
   if Difficulty < -1.0 then DiffLevel = 1;
   else if Difficulty < 0 then DiffLevel = 2;
   else if Difficulty < 1 then DiffLevel = 3;
   else DiffLevel = 4;
run;
proc print data=ItemEst;
run;

Output 53.1.13 shows the merged data set.

Output 53.1.13: The Merged SAS Data Set

Obs Item Difficulty Slope DiffLevel
1 item1 -1.32543 1.44263 1
2 item10 -0.50750 1.82387 2
3 item11 -0.01360 1.26234 2
4 item12 0.04013 2.02257 3
5 item13 0.16034 1.13120 3
6 item14 0.01069 1.89082 3
7 item15 0.07152 0.96388 3
8 item16 -0.81412 1.25375 2
9 item17 -0.92046 1.02930 2
10 item18 -0.59407 1.58443 2
11 item19 -0.97598 0.97979 2
12 item2 -0.99702 1.82221 2
13 item20 -0.48859 1.95710 2
14 item21 -0.60655 1.45324 2
15 item22 -0.51263 1.74472 2
16 item23 -0.90928 1.20281 2
17 item24 -0.56515 1.74437 2
18 item25 -0.58905 1.48957 2
19 item3 -1.24965 1.58762 1
20 item4 -1.09577 1.86821 1
21 item5 -1.07857 1.78392 1
22 item6 -0.95057 1.04203 2
23 item7 -0.65085 1.45413 2
24 item8 -0.76372 1.30444 2
25 item9 -0.72281 1.50740 2



Then, you can sort the items by slope within each difficulty group as follows:

proc sort data=ItemEst;
   by difflevel slope;
run;
proc print data=ItemEst;
run;

Output 53.1.14 shows the data set after sorting.

Output 53.1.14: Item Sorted by Slope within Each Difficulty Group

Obs Item Difficulty Slope DiffLevel
1 item1 -1.32543 1.44263 1
2 item3 -1.24965 1.58762 1
3 item5 -1.07857 1.78392 1
4 item4 -1.09577 1.86821 1
5 item19 -0.97598 0.97979 2
6 item17 -0.92046 1.02930 2
7 item6 -0.95057 1.04203 2
8 item23 -0.90928 1.20281 2
9 item16 -0.81412 1.25375 2
10 item11 -0.01360 1.26234 2
11 item8 -0.76372 1.30444 2
12 item21 -0.60655 1.45324 2
13 item7 -0.65085 1.45413 2
14 item25 -0.58905 1.48957 2
15 item9 -0.72281 1.50740 2
16 item18 -0.59407 1.58443 2
17 item24 -0.56515 1.74437 2
18 item22 -0.51263 1.74472 2
19 item2 -0.99702 1.82221 2
20 item10 -0.50750 1.82387 2
21 item20 -0.48859 1.95710 2
22 item15 0.07152 0.96388 3
23 item13 0.16034 1.13120 3
24 item14 0.01069 1.89082 3
25 item12 0.04013 2.02257 3