How do I create frequency weights to speed up gllamm?
Creating frequency weights for gllamm
Minjeong Jeon and Sophia Rabe-Hesketh, University of California, Berkeley
Using frequency weights is an easy and effective way to speed up gllamm.
For instance, if the data contain several identical level-2 units, gllamm
can run far faster with level-2 weights than without them.
Using frequency weights means that the data are in collapsed form.
Thus, creating frequency weights is the same as collapsing the data, and the
collapse command can be used to generate frequency weights.
Suppose there is a dataset that contains students nested within schools.
In this imaginary dataset, there are four variables:
schid stuid y sex
schid and stuid are school and student identifiers, respectively;
y and sex are a binary response variable and a binary explanatory variable for the students.
Many students in the same school will have the same response and sex;
therefore, we can collapse the data and create level-1 frequency weights using the commands:
generate cons = 1
collapse (count) wt1 = cons, by(schid sex y)
The collapse command creates the weight variable wt1
by counting the number of cases in each combination of schid, sex, and y.
wt1 represents the number of students who have the same response and sex in the same school.
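As a rough sketch of how the collapsed data might then be used, the following fits a random-intercept logistic regression with the level-1 weights. The choice of model (link, family, and covariate) is an assumption made for illustration and is not part of the FAQ. The weight() option takes the stub wt, and gllamm looks for variables named wt1, wt2, and so on for levels 1, 2, and so on.

```stata
* Sketch only: the logit random-intercept model is an illustrative assumption.
* Data are assumed to be collapsed already, with variables schid sex y wt1.
* weight(wt) makes gllamm use wt1 as the level-1 frequency weight.
gllamm y sex, i(schid) link(logit) family(binomial) weight(wt)
```

Because wt1 counts students within a school, schid remains the level-2 identifier in the i() option.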
Level-2 frequency weights are most likely to be useful when the response
variable is binary and the number of level-1 units per level-2 unit
is small. Examples are longitudinal data or item responses in item response models.
The weights represent the number of level-2 units with the
same set of responses and covariate values for their level-1 units.
The data should be in wide form, with one row of data for each
level-2 unit and separate variables for each level-1 unit.
For example, consider a longitudinal dataset in wide form with variables
y1 y2 y3 y4 sex
y1 to y4 are the responses at time-points 1 to 4, and
sex is a binary explanatory variable.
Many subjects will have the same sex and the same responses at the four
time-points; therefore we can collapse the data and create level-2
frequency weights using the commands:
generate cons = 1
collapse (count) wt2 = cons, by(y1 y2 y3 y4 sex)
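One way to check that nothing was lost in collapsing is to confirm that the weights add up to the original number of subjects. The sketch below repeats the collapse step with that check wrapped around it. The local macro name nsubj is hypothetical, and the data are assumed to be in wide form with one row per subject before collapsing.

```stata
* Record the number of subjects before collapsing (one row per subject)
count
local nsubj = r(N)

* Collapse to one row per distinct response pattern and sex
generate cons = 1
collapse (count) wt2 = cons, by(y1 y2 y3 y4 sex)

* The level-2 weights should sum to the original number of subjects
quietly summarize wt2
assert r(sum) == `nsubj'
```

If the assert fails, the collapsed data do not represent the full sample and should not be passed to gllamm.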
Before we can run gllamm, we must reshape the data to long form:
generate pattern = _n
reshape long y, i(pattern) j(occasion)
pattern is now the new level-2 identifier that should
be used in the i() option of the gllamm command.
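A call on the reshaped data might look like the following sketch. The random-intercept logistic model is again an assumption chosen for illustration. Since only wt2 exists in this dataset, the expectation is that gllamm treats the level-1 weights as equal to 1; the gllamm help for weight() is the place to confirm this.

```stata
* Sketch only: the logit random-intercept model is an illustrative assumption.
* Data are assumed to be in long form with variables pattern occasion y sex wt2.
* weight(wt) makes gllamm use wt2 as the level-2 frequency weight.
gllamm y sex, i(pattern) link(logit) family(binomial) weight(wt)
```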
Examples and documentation
- Description of the weight() option on p.22-23,
and examples in Sections 3.2.2, 4.1.1-4.1.2, 8.3.1-8.3.2, 8.4, and 9.3
of Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2004). GLLAMM Manual.
U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 160.
- Skrondal, A. and Rabe-Hesketh, S. (2004).
Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Chapman & Hall/CRC.
- Exercises 10.3, 10.7, 14.5, and 16.11 and p.929-930 in
Rabe-Hesketh, S. and Skrondal, A. (2012).
Multilevel and Longitudinal Modeling Using Stata (Third Edition).
Volume II: Categorical Responses, Counts, and Survival.
College Station, TX: Stata Press.