Section 9.4 of Generalized Latent Variable Modeling

Generalized Latent Variable Modeling by Skrondal and Rabe-Hesketh
Section 9.4: Arithmetic Reasoning

One- and two-parameter logistic item response models and MIMIC models

The data set used in this section is mislevy.dat. Below we assume that this has been saved in the current directory.

The do-file is mislevy.do.

The programs we use are gllamm and gllapred. You can find and download them by issuing the commands findit gllamm and findit gllapred. For more information see http://www.gllamm.org.
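As an alternative to findit, the programs can usually be installed from the SSC archive. The sketch below assumes an internet connection and that the SSC gllamm package bundles gllapred; if the second which command fails, install gllapred through findit instead.

```stata
* Install gllamm from the SSC archive (needs internet access)
ssc install gllamm, replace
* Confirm that both programs are now on the ado-path
which gllamm
which gllapred
```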

Read and prepare the data

insheet using mislevy.dat, clear
list, clean
       y1   y2   y3   y4   cwm   cwf   cbm   cbf  
  1.    0    0    0    0    23    20    27    29  
  2.    0    0    0    1     5     8     5     8  
  3.    0    0    1    0    12    14    15     7  
  4.    0    0    1    1     2     2     3     3  
  5.    0    1    0    0    16    20    16    14  
  6.    0    1    0    1     3     5     5     5  
  7.    0    1    1    0     6    11     4     6  
  8.    0    1    1    1     1     7     3     0  
  9.    1    0    0    0    22    23    15    14  
 10.    1    0    0    1     6     8    10    10  
 11.    1    0    1    0     7     9     8    11  
 12.    1    0    1    1    19     6     1     2  
 13.    1    1    0    0    21    18     7    19  
 14.    1    1    0    1    11    15     9     5  
 15.    1    1    1    0    23    20    10     8  
 16.    1    1    1    1    86    42     2     4

Stack variables cwm, cwf, cbm and cbf into a single frequency variable wt2 and create dummies w for white and m for male


gen i=_n
reshape long cw cb,i(i) j(male) string
replace i=_n
reshape long c, i(i) j(white) string
drop i

encode white, gen(w)
encode male, gen(m)
replace w=w-1
replace m=m-1
 
rename c wt2
list in 1/10, clean nolab

       white   male   y1   y2   y3   y4   wt2   w   m  
  1.       b      f    0    0    0    0    29   0   0  
  2.       w      f    0    0    0    0    20   1   0  
  3.       b      m    0    0    0    0    27   0   1  
  4.       w      m    0    0    0    0    23   1   1  
  5.       b      f    0    0    0    1     8   0   0  
  6.       w      f    0    0    0    1     8   1   0  
  7.       b      m    0    0    0    1     5   0   1  
  8.       w      m    0    0    0    1     5   1   1  
  9.       b      f    0    0    1    0     7   0   0  
 10.       w      f    0    0    1    0    14   1   0
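A quick sanity check after the two reshapes can catch a mis-stacked column. This is a minimal sketch, not part of the original do-file: the total of 776 is the sum of the four frequency columns in the raw listing, and it equals the number of level 2 units that gllamm reports further down.

```stata
* wt2 should still sum to the number of respondents
quietly summarize wt2
display "Total respondents: " r(sum)
assert r(sum) == 776
* Each response pattern should appear exactly once per race-by-sex group
isid y1 y2 y3 y4 w m
assert _N == 64
```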

Calculate tot, the sizes of the four groups defined by w and m


egen tot = sum(wt2), by(w m)
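The group sizes in tot can be verified against the column totals of the raw frequency table. The four numbers below were obtained by adding up the columns cwm, cwf, cbm and cbf in the listing at the top of the page; the check itself is an illustrative addition.

```stata
* Group sizes implied by the raw frequency columns
* w==1, m==1 corresponds to column cwm
assert tot == 263 if w == 1 & m == 1
* w==1, m==0 corresponds to column cwf
assert tot == 228 if w == 1 & m == 0
* w==0, m==1 corresponds to column cbm
assert tot == 140 if w == 0 & m == 1
* w==0, m==0 corresponds to column cbf
assert tot == 145 if w == 0 & m == 0
```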

Stack responses y1 to y4 into a single vector and create variable item


gen patt=_n
reshape long y, i(patt) j(item)
list in 1/8, clean nolab 

       patt   item   white   male   y   wt2   w   m  
  1.      1      1       b      f   0    29   0   0  
  2.      1      2       b      f   0    29   0   0  
  3.      1      3       b      f   0    29   0   0  
  4.      1      4       b      f   0    29   0   0  
  5.      2      1       w      f   0    20   1   0  
  6.      2      2       w      f   0    20   1   0  
  7.      2      3       w      f   0    20   1   0  
  8.      2      4       w      f   0    20   1   0

Create dummy variables d1 to d4 for items 1 to 4

qui tab item, gen(d)

Estimate the one-parameter logistic IRT model (Table 9.5)


gllamm y d1 d2 d3 d4, i(patt) l(logit) f(binom) weight(wt) nocons adapt


number of level 1 units = 3104
number of level 2 units = 776
 
Condition Number = 1.6838733
 
gllamm model
 
log likelihood = -2004.9379
 
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   .5775969   .0969974     5.95   0.000     .3874854    .7677083
          d2 |   .2382793   .0950415     2.51   0.012     .0520014    .4245572
          d3 |  -.2247582   .0949752    -2.37   0.018    -.4109062   -.0386102
          d4 |  -.5938583   .0971079    -6.12   0.000    -.7841863   -.4035303
------------------------------------------------------------------------------
 
 
Variances and covariances of random effects
------------------------------------------------------------------------------

 
***level 2 (patt)
 
    var(1): 1.6285398 (.20840709)
------------------------------------------------------------------------------
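For orientation, the model fitted above can be written as follows. The symbols are chosen here for illustration and may not match the book's notation: i indexes items, j indexes persons, beta_i is the coefficient of dummy d_i, eta_j is ability, and psi is the variance reported as var(1).

```latex
\operatorname{logit}\Pr(y_{ij}=1 \mid \eta_j) = \beta_i + \eta_j,
\qquad \eta_j \sim N(0,\psi)
```

With this sign convention a larger beta_i means an easier item, so item 1 is estimated to be the easiest and item 4 the hardest.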

Plot the item characteristic curves (Figure 9.5)


matrix list e(b)

e(b)[1,5]
             y:          y:          y:          y:      patt1:
            d1          d2          d3          d4       _cons
y1   .57759686   .23827933  -.22475821  -.59385829   1.2761425

twoway (function y=1/(1+exp(-[y]d1 -x*[patt1]_cons)), range(-2.5 2.5))                   /*
*/     (function y=1/(1+exp(-[y]d2 -x*[patt1]_cons)), range(-2.5 2.5) clpatt(dot))       /*
*/     (function y=1/(1+exp(-[y]d3 -x*[patt1]_cons)), range(-2.5 2.5) clpatt(dash))      /*
*/     (function y=1/(1+exp(-[y]d4 -x*[patt1]_cons)), range(-2.5 2.5) clpatt(longdash)), /*
*/     legend( label(1 "Item 1") label(2 "Item 2") label(3 "Item 3") label(4 "Item 4") ) /*
*/     xtitle(Ability) ytitle(Probability of correct answer)
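In the plotting commands above, x is multiplied by [patt1]_cons. For a single random effect, gllamm stores the standard deviation rather than the variance in e(b), so x is ability in standard deviation units. A minimal check, assuming it is run directly after the one-parameter fit (the scalar name sd1 is ours):

```stata
* Run directly after the one-parameter gllamm fit
scalar sd1 = [patt1]_b[_cons]
display "SD of ability: " sd1 "   variance: " sd1^2
* The squared value should reproduce var(1) = 1.6285398 from the output
assert abs(sd1^2 - 1.6285398) < 1e-5
```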

Estimate the two-parameter logistic IRT model (Table 9.5)


eq load: d1-d4
gllamm y d1-d4, i(patt) eqs(load) l(logit) f(binom) weight(wt) nocons adapt nip(12)  

number of level 1 units = 3104
number of level 2 units = 776
 
Condition Number = 5.4532607
 
gllamm model
 
log likelihood = -2002.7391
 
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   .6453275   .1206319     5.35   0.000     .4088932    .8817617
          d2 |   .2194106   .0890237     2.46   0.014     .0449274    .3938939
          d3 |  -.2156426   .0908816    -2.37   0.018    -.3937672    -.037518
          d4 |  -.6251801    .109345    -5.72   0.000    -.8394923   -.4108678
------------------------------------------------------------------------------
 
 
Variances and covariances of random effects
------------------------------------------------------------------------------

 
***level 2 (patt)
 
    var(1): 2.6007398 (.87041805)
 
    loadings for random effect 1
    d1: 1 (fixed)
    d2: .64650121 (.15490565)
    d3: .69484866 (.16864938)
    d4: .89729467 (.21828051)
 
------------------------------------------------------------------------------

The variance and factor loading estimates differ a little from those in Table 9.5.
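The one-parameter model is the two-parameter model with all loadings fixed at 1, so the two fits can be compared with a likelihood ratio test on 3 degrees of freedom. This comparison is not part of the original do-file; the sketch simply plugs in the two log likelihoods printed above, giving a statistic of about 4.40.

```stata
* Log likelihoods copied from the two gllamm outputs above
scalar ll_1pl = -2004.9379
scalar ll_2pl = -2002.7391
scalar lr = 2*(ll_2pl - ll_1pl)
* Three extra parameters: the free loadings for items 2 to 4
display "LR chi2(3) = " %6.3f lr "   p = " %5.3f chi2tail(3, lr)
```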

Plot item characteristic curves (Figure 9.5)


matrix list e(b)

e(b)[1,8]
             y:          y:          y:          y:    pat1_1l:    pat1_1l:    pat1_1l:
            d1          d2          d3          d4          d2          d3          d4
y1   .64532747   .21941063  -.21564261  -.62518007   .64650121   .69484866   .89729467

        pat1_1:
            d1
y1   1.6126809

twoway (function y=1/(1+exp(-[y]d1 -x*[pat1_1]d1)), range(-2.5 2.5))                               /*
*/     (function y=1/(1+exp(-[y]d2 -x*[pat1_1]d1*[pat1_1l]d2)), range(-2.5 2.5) clpatt(dot))       /*
*/     (function y=1/(1+exp(-[y]d3 -x*[pat1_1]d1*[pat1_1l]d3)), range(-2.5 2.5) clpatt(dash))      /*
*/     (function y=1/(1+exp(-[y]d4 -x*[pat1_1]d1*[pat1_1l]d4)), range(-2.5 2.5) clpatt(longdash)), /*
*/     legend( label(1 "Item 1") label(2 "Item 2") label(3 "Item 3") label(4 "Item 4") )           /*
*/     xtitle(Ability) ytitle(Probability of correct answer)

Estimate two-parameter IRT model with non-zero mean ability, setting the item difficulty of item 1 to zero (Table 9.6)


gen cons=1
eq load: d1-d4
eq f1: cons

gllamm y d2-d4, i(patt) eqs(load) l(logit) f(binom) weight(wt) /*
   */ geqs(f1) nocons adapt nip(12)  

number of level 1 units = 3104
number of level 2 units = 776
 
Condition Number = 6.9163564
 
gllamm model
 
log likelihood = -2002.7391
 
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d2 |   -.197791   .1174163    -1.68   0.092    -.4279228    .0323408
          d3 |  -.6640437   .1352743    -4.91   0.000    -.9291764    -.398911
          d4 |  -1.204219   .1846912    -6.52   0.000    -1.566207   -.8422309
------------------------------------------------------------------------------
 
 
Variances and covariances of random effects
------------------------------------------------------------------------------

 
***level 2 (patt)
 
    var(1): 2.6008217 (.87044376)
 
    loadings for random effect 1
    d1: 1 (fixed)
    d2: .64648882 (.15490305)
    d3: .69483564 (.1686465)
    d4: .89727244 (.21827359)
 
 
Regressions of latent variables on covariates
------------------------------------------------------------------------------

 
    random effect 1 has 1 covariates:
    cons: .64533407 (.12063317)
------------------------------------------------------------------------------

The estimates differ a little from those in Table 9.6.
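The log likelihood is the same as for the zero-mean two-parameter model (-2002.7391), which suggests that the two fits are reparametrizations of one another. Shifting ability by the item 1 intercept gives a mean ability equal to that intercept, and each remaining intercept becomes the old intercept minus its loading times the item 1 intercept. The sketch below checks this with numbers copied from the earlier output; the scalar name b1 is ours.

```stata
* Item 1 intercept from the zero-mean two-parameter output
scalar b1 = .6453275
* New intercept for item i = old intercept - loading_i * b1
display "d2: "  .2194106 - .64650121*b1
display "d3: " -.2156426 - .69484866*b1
display "d4: " -.6251801 - .89729467*b1
```

The three results should be close to the d2 to d4 coefficients reported above, and b1 itself is close to the cons coefficient in the latent regression.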

Empirical Bayes predictions: EAP ability scores


gllapred IRT, fac

(means and standard deviations will be stored in IRTm1 IRTs1)

Estimate a MIMIC model where ability depends on sex (dummy f), race (dummy b) and their interaction (Table 9.6)


gen f=1-m
gen b=1-w
gen b_f = b*f
eq f1: cons f b b_f
matrix a=e(b)
gllamm y d2-d4, i(patt) eqs(load) l(logit) f(binom) weight(wt) /*
  */ geqs(f1) from(a) nocons adapt nip(12)  

number of level 1 units = 3104
number of level 2 units = 776
 
Condition Number = 10.407055
 
gllamm model
 
log likelihood = -1956.2333
 
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d2 |  -.2114786   .1159544    -1.82   0.068    -.4387451     .015788
          d3 |  -.7145968   .1379683    -5.18   0.000    -.9850096   -.4441839
          d4 |  -1.159195   .1601593    -7.24   0.000    -1.473101   -.8452882
------------------------------------------------------------------------------
 
 
Variances and covariances of random effects
------------------------------------------------------------------------------

 
***level 2 (patt)
 
    var(1): 1.9689171 (.60516363)
 
    loadings for random effect 1
    d1: 1 (fixed)
    d2: .67658546 (.14380046)
    d3: .77264707 (.16674369)
    d4: .86240868 (.17342379)
 
 
Regressions of latent variables on covariates
------------------------------------------------------------------------------

 
    random effect 1 has 4 covariates:
    cons: 1.4345373 (.21553973)
    f: -.6200911 (.20526977)
    b: -1.6843558 (.31129702)
    b_f: .67057381 (.32116754)
------------------------------------------------------------------------------

The estimates differ a little from those in Table 9.6.
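The latent regression coefficients can be combined into an estimated mean ability for each of the four groups. The sketch below does this by hand with the estimates copied from the output above; the scalar names are ours, and the means are on the scale fixed by setting the intercept of item 1 to zero.

```stata
* Coefficients copied from the latent regression above
scalar g0  = 1.4345373
scalar gf  = -.6200911
scalar gb  = -1.6843558
scalar gbf = .67057381
* Mean ability = cons + f*gf + b*gb + b_f*gbf
display "f=0, b=0: " g0
display "f=1, b=0: " g0 + gf
display "f=0, b=1: " g0 + gb
display "f=1, b=1: " g0 + gf + gb + gbf
```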

Empirical Bayes predictions: EAP ability scores


gllapred MIMIC, fac

(means and standard deviations will be stored in MIMICm1 MIMICs1)

Look at ability scores for each response and covariate pattern (Table 9.7)


drop d1-d4
reshape wide y, i(patt) j(item)
sort y1-y4 b f
list y1-y4 b f IRTm1 MIMICm1, nolab clean

       y1   y2   y3   y4   b   f        IRTm1      MIMICm1  
  1.    0    0    0    0   0   0   -1.2242641   -.55473277  
  2.    0    0    0    0   0   1   -1.2242641   -.87608943  
  3.    0    0    0    0   1   0   -1.2242641   -1.4707461  
  4.    0    0    0    0   1   1   -1.2242641   -1.4410752  
  5.    0    0    0    1   0   0      -.14876    .26704248  
  6.    0    0    0    1   0   1      -.14876   -.02615521  
  7.    0    0    0    1   1   0      -.14876   -.54782301  
  8.    0    0    0    1   1   1      -.14876   -.52233467  
  9.    0    0    1    0   0   0   -.37675812    .18398721  
 10.    0    0    1    0   0   1   -.37675812   -.11085131  
 11.    0    0    1    0   1   0   -.37675812   -.63777142  
 12.    0    0    1    0   1   1   -.37675812    -.6119613  
 13.    0    0    1    1   0   0    .60364936    .97836482  
 14.    0    0    1    1   0   1    .60364936    .68778691  
 15.    0    0    1    1   1   0    .60364936    .19041653  
 16.    0    0    1    1   1   1    .60364936    .21416676  
 17.    0    1    0    0   0   0   -.43218061    .09469579  
 18.    0    1    0    0   0   1   -.43218061   -.20221901  
 19.    0    1    0    0   1   0   -.43218061   -.73534825  
 20.    0    1    0    0   1   1   -.43218061   -.70916483  
 21.    0    1    0    1   0   0    .55196233     .8893892  
 22.    0    1    0    1   0   1    .55196233    .59958378  
 23.    0    1    0    1   1   0    .55196233    .10115863  
 24.    0    1    0    1   1   1    .55196233    .12502831  
 25.    0    1    1    0   0   0    .33515561    .80656055  
 26.    0    1    1    0   0   1    .33515561    .51719789  
 27.    0    1    1    0   1   0    .33515561    .01729505  
 28.    0    1    1    0   1   1    .33515561     .0413004  
 29.    0    1    1    1   0   0    1.3035676    1.6228636  
 30.    0    1    1    1   0   1    1.3035676    1.3180372  
 31.    0    1    1    1   1   0    1.3035676    .81295145  
 32.    0    1    1    1   1   1    1.3035676    .83659008  
 33.    1    0    0    0   0   0    -.0351866     .3938274  
 34.    1    0    0    0   0   1    -.0351866    .10259226  
 35.    1    0    0    0   1   0    -.0351866   -.41203827  
 36.    1    0    0    0   1   1    -.0351866   -.38699316  
 37.    1    0    0    1   0   0    .93089004    1.1908602  
 38.    1    0    0    1   0   1    .93089004    .89722329  
 39.    1    0    0    1   1   0    .93089004    .40020559  
 40.    1    0    0    1   1   1    .93089004    .42377722  
 41.    1    0    1    0   0   0    .71352262    1.1065918  
 42.    1    0    1    0   0   1    .71352262    .81436966  
 43.    1    0    1    0   1   0    .71352262    .31757081  
 44.    1    0    1    0   1   1    .71352262    .34119563  
 45.    1    0    1    1   0   0    1.7045824    1.9484446  
 46.    1    0    1    1   0   1    1.7045824    1.6312163  
 47.    1    0    1    1   1   0    1.7045824     1.113084  
 48.    1    0    1    1   1   1    1.7045824    1.1371117  
 49.    1    1    0    0   0   0    .66179647    1.0169609  
 50.    1    1    0    0   0   1    .66179647    .72595346  
 51.    1    1    0    0   1   0    .66179647    .22887171  
 52.    1    1    0    0   1   1    .66179647    .25257845  
 53.    1    1    0    1   0   0    1.6485285    1.8501881  
 54.    1    1    0    1   0   1    1.6485285    1.5370316  
 55.    1    1    0    1   1   0    1.6485285    1.0234149  
 56.    1    1    0    1   1   1    1.6485285    1.0472972  
 57.    1    1    1    0   0   0    1.4181578    1.7596042  
 58.    1    1    1    0   0   1    1.4181578    1.4499547  
 59.    1    1    1    0   1   0    1.4181578     .9400666  
 60.    1    1    1    0   1   1    1.4181578    .96383584  
 61.    1    1    1    1   0   0     2.506591    2.6876937  
 62.    1    1    1    1   0   1     2.506591    2.3322937  
 63.    1    1    1    1   1   0     2.506591    1.7665629  
 64.    1    1    1    1   1   1     2.506591    1.7923462

The scores differ a little from those in Table 9.7.

References

Mislevy, R. J. (1985). Estimation of latent group effects. Journal of the American Statistical Association 80, 993-997.

Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, FL: Chapman & Hall/CRC.
