www.gllamm.org

Which survey weights should I use?

Title    Choosing survey weights for gllamm
Author   Minjeong Jeon, University of California, Berkeley
Date     July 2012

The answer depends on what kind of sampling weights you have in your dataset. Suppose you have a dataset with two levels. We can think of the total weights (wij) as the product of the level-2 weights (wj) and the level-1 weights (wi|j):

wij = wj × wi|j
where wj is the inverse of the probability that the level-2 unit (or primary sampling unit) was selected, and wi|j is the inverse of the conditional probability that the level-1 unit was selected, given that the level-2 unit it belongs to was selected.

Use both the level-1 and level-2 weights if you have them. If only level-2 weights are available and the level-1 units were sampled within level-2 units with equal probabilities, use the level-2 weights only. Similarly, if only level-1 weights are available and the level-2 units were sampled with equal probabilities, use the level-1 weights only.
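To make the decomposition concrete, here is a minimal Python sketch. It is only an illustration: the selection probabilities, the unit labels "A" and "B", and the helper name inverse_probability_weight are all hypothetical and are not part of gllamm.

```python
# Illustrative only: hypothetical selection probabilities for a two-level design.

def inverse_probability_weight(prob):
    """Return the sampling weight 1/prob for a selection probability in (0, 1]."""
    if not 0 < prob <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / prob

# Probability that each level-2 unit was selected (hypothetical values).
p_level2 = {"A": 0.10, "B": 0.25}
# Conditional probability that a level-1 unit was selected, given that its
# level-2 unit was selected (hypothetical values).
p_level1_given_level2 = {"A": 0.50, "B": 0.20}

weights = {}
for unit in p_level2:
    w_j = inverse_probability_weight(p_level2[unit])                      # level-2 weight
    w_i_given_j = inverse_probability_weight(p_level1_given_level2[unit])  # level-1 weight
    weights[unit] = {
        "w_j": w_j,
        "w_i_given_j": w_i_given_j,
        "w_ij": w_j * w_i_given_j,  # total weight = product of the two
    }
```

In this made-up example both units end up with the same total weight even though their level-2 and level-1 weights differ, which shows that the level-specific weights cannot be recovered from the total weight alone.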

If you have level-1 weights, do not use them without rescaling them appropriately. How best to scale the weights is still an area of ongoing research; see, for example, Rabe-Hesketh and Skrondal (2006).
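As a rough illustration of what rescaling can look like, the Python sketch below applies two scalings that are often discussed in this literature to the level-1 weights of a single cluster: one makes the weights sum to the cluster sample size, the other makes them sum to an "effective" cluster sample size. The function names and the raw weights are hypothetical, and the sketch is not a recommendation of either scaling; which one (if any) is appropriate depends on the design and the model.

```python
# Illustrative only: two ways to rescale the level-1 weights within ONE cluster.

def scale_to_cluster_size(w):
    """Rescale so the weights sum to the cluster sample size n_j."""
    n = len(w)
    total = sum(w)
    return [x * n / total for x in w]

def scale_to_effective_size(w):
    """Rescale so the weights sum to (sum of w)^2 / (sum of w^2)."""
    total = sum(w)
    total_sq = sum(x * x for x in w)
    return [x * total / total_sq for x in w]

# Hypothetical raw level-1 weights for the units sampled in one cluster.
raw = [2.0, 2.0, 4.0]

scaled_size = scale_to_cluster_size(raw)         # sums to 3, the cluster sample size
scaled_effective = scale_to_effective_size(raw)  # sums to 64/24, the effective size
```

Both scalings multiply every weight in the cluster by the same constant, so the relative weights within the cluster are unchanged; only their sum differs.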

A problem arises when you have only total weights. In that case, do not supply the total weights to the pweight() option: pweight() accepts weights for the individual levels only. When level-specific weights are not available, one alternative is to use the design variables. By including the design variables as covariates in your regression model, you can still obtain consistent estimates. Another alternative is to forgo multilevel modeling and fit an ordinary regression model with the svy prefix command.

Examples and documentation

References