The data set used in this section is kenkel.dat. We thank the Journal of Applied Econometrics for making the data used in Kenkel and Terza (2001) available through the Journal of Applied Econometrics Data Archive.
The do-file is kenkel.do.
The programs used are gllamm and ssm.
Use the commands ssc describe gllamm and ssc describe ssm and follow the instructions to download the programs. For more information on gllamm see http://www.gllamm.org and for more information on ssm see http://www.gllamm.org/wrappers.html.
With ssm, the endogenous treatment model can be fitted in a single command:
ssm drinks advice black hieduc, s(advice = black hieduc hlthins regmed heart) /*
*/ adapt q(16) family(poiss) link(log)
The program used is gllamm. The first three models can also be estimated using Stata's own commands poisson, xtpoisson and probit, but we use gllamm throughout to make the syntax of the final model easier to understand.
Use the command ssc describe gllamm and follow instructions to download gllamm. For more information on gllamm see http://www.gllamm.org.
Read data and display first ten records:
insheet using kenkel.dat, clear
list in 1/10, clean
       drinks   advice   black   hlthins   regmed   heart   hieduc
  1.       60        0       0         1        1       0        1
  2.       84        0       0         1        1       0        0
  3.        4        0       0         1        1       0        1
  4.        0        1       0         1        0       0        1
  5.       12        0       0         1        1       0        0
  6.       12        0       0         1        1       0        0
  7.        5        0       0         1        1       0        1
  8.        7        0       0         1        1       0        0
  9.        0        0       0         1        0       0        1
 10.      1.5        1       0         1        1       1        1
The dependent variable for the analysis is the number of alcoholic beverages consumed in the last two weeks. This is calculated as the product of self-reported drinking frequency (the number of days in the past two weeks with any drinking) and drinking intensity (the average number of drinks on a day with any drinking). We round this to the nearest integer to obtain a proper count.
replace drinks=round(drinks,1)
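As a quick sanity check on the rounding step, the following sketch uses a few made-up values (they are not taken from kenkel.dat) and confirms that no fractional counts remain afterwards, which is what the Poisson family assumes.

```stata
* Toy values standing in for drinks; these numbers are illustrative only
clear
input drinks
60
1.5
0
7.4
end
replace drinks = round(drinks, 1)
* Stop with an error if any value is still fractional
assert drinks == floor(drinks)
list, clean
```

The same assert line can be run on the real data immediately after the replace command.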
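Collapse the data to one record per distinct combination of the variables and generate the frequency weight variable wt2, which counts how many people share each combination; this speeds up estimation. In the gllamm option weight(wt), wt is the stub of the weight variable names and the suffix gives the level, so wt2 supplies the level-2 frequency weights and the data are weighted appropriately.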
disp _N
2467
gen one=1
collapse (sum) wt2=one, by(black hlthins regmed heart hieduc drinks advice)
disp _N
737
gen id=_n
gen cons=1
list in 1/10, clean
       drinks   advice   black   hlthins   regmed   heart   hieduc   wt2   id   cons
  1.        0        0       0         0        0       0        0     7    1      1
  2.        0        1       0         0        0       0        0     3    2      1
  3.        1        0       0         0        0       0        0     1    3      1
  4.        1        1       0         0        0       0        0     1    4      1
  5.        2        0       0         0        0       0        0     2    5      1
  6.        2        1       0         0        0       0        0     3    6      1
  7.        3        0       0         0        0       0        0     1    7      1
  8.        4        0       0         0        0       0        0     4    8      1
  9.        5        0       0         0        0       0        0     1    9      1
 10.        6        0       0         0        0       0        0     1   10      1
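To see what the collapse step does, here is a minimal sketch on a toy data set with hypothetical values: six people who fall into three distinct patterns. It mirrors the commands above and then checks that the frequency weights add up to the original number of people.

```stata
* Toy data with hypothetical values: six people, three distinct patterns
clear
input drinks advice
0 0
0 0
0 1
3 0
3 0
3 0
end
gen one = 1
collapse (sum) wt2 = one, by(drinks advice)
list, clean
* The frequency weights must add up to the original number of people
quietly summarize wt2
assert r(sum) == 6
```

With the real data, the same check would compare r(sum) with 2467, the value of _N before collapsing.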
Poisson model for drinking for Table 14.7:
gllamm drinks advice cons hieduc black, i(id) weight(wt) family(poisson) link(log) /*
*/ nocons init
number of level 1 units = 2467
Condition Number = 3.8352842
gllamm model
log likelihood = -32939.148
------------------------------------------------------------------------------
drinks | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
advice | .473367 .010918 43.36 0.000 .4519681 .4947658
cons | 2.650541 .0084928 312.09 0.000 2.633896 2.667187
hieduc | -.1826093 .0107983 -16.91 0.000 -.2037736 -.1614451
black | -.3096866 .0168905 -18.33 0.000 -.3427914 -.2765818
------------------------------------------------------------------------------
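This model can also be fitted with Stata's own poisson command. The sketch below is one possible cross-check, assuming kenkel.dat is in the working directory. It reloads the uncollapsed data so that no weights are needed, and preserve/restore keeps the collapsed data in memory for the later steps. The estimates should agree with the gllamm output above up to numerical tolerance.

```stata
* Cross-check with Stata's own command on the uncollapsed data
preserve
insheet using kenkel.dat, clear
replace drinks = round(drinks, 1)
* Ordinary Poisson regression; _cons plays the role of cons in gllamm
poisson drinks advice hieduc black
restore
```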
Overdispersed Poisson model for drinking for Table 14.7 (iteration log not shown):
gllamm drinks advice cons hieduc black, i(id) weight(wt) family(poisson) link(log) /*
*/ nocons adapt nip(10)
number of level 1 units = 2467
number of level 2 units = 2467
Condition Number = 3.9794068
gllamm model
log likelihood = -8857.8425
------------------------------------------------------------------------------
drinks | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
advice | .5954294 .0815787 7.30 0.000 .435538 .7553207
cons | 1.429521 .0600222 23.82 0.000 1.31188 1.547162
hieduc | .0277592 .0740012 0.38 0.708 -.1172806 .172799
black | -.2835224 .1090915 -2.60 0.009 -.4973378 -.069707
------------------------------------------------------------------------------
Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (id)
var(1): 2.899832 (.1132476)
------------------------------------------------------------------------------
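A comparable model can be fitted with xtpoisson by treating each person as their own cluster and requesting a normally distributed random intercept. The sketch below assumes kenkel.dat is in the working directory; the id variable is created here only for this check. Because xtpoisson uses its own quadrature settings, small differences from the gllamm estimates are to be expected. Note that xtpoisson reports the random-intercept standard deviation (sigma_u), so its square is the quantity to compare with var(1).

```stata
* Cross-check with xtpoisson on the uncollapsed data
preserve
insheet using kenkel.dat, clear
replace drinks = round(drinks, 1)
* One cluster per person, matching the 2467 level-2 units reported by gllamm
gen id = _n
xtset id
* Random-intercept Poisson with a normally distributed intercept
xtpoisson drinks advice hieduc black, re normal
restore
```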
Probit model for advice for Table 14.7 (iteration log not shown):
gllamm advice cons hieduc black hlthins regmed heart, i(id) weight(wt) family(binomial) /*
*/ link(probit) nocons init
number of level 1 units = 2467
Condition Number = 6.4702064
gllamm model
log likelihood = -1419.9041
------------------------------------------------------------------------------
advice | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
cons | -.4785403 .0849039 -5.64 0.000 -.644949 -.3121317
hieduc | -.2520195 .0560241 -4.50 0.000 -.3618247 -.1422143
black | .3031406 .0780889 3.88 0.000 .1500891 .4561921
hlthins | -.2708712 .0704249 -3.85 0.000 -.4089013 -.132841
regmed | .1801328 .0738763 2.44 0.015 .0353379 .3249278
heart | .1661613 .0757854 2.19 0.028 .0176246 .314698
------------------------------------------------------------------------------
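The probit model for advice can likewise be checked with Stata's own probit command. As before, this is a sketch that assumes kenkel.dat is in the working directory and uses the uncollapsed data, so no weights are required.

```stata
* Cross-check with Stata's own probit on the uncollapsed data
preserve
insheet using kenkel.dat, clear
* Ordinary probit for receiving advice
probit advice hieduc black hlthins regmed heart
restore
```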
Prepare data for the endogenous treatment model for Table 14.7: stack drinking and advice into a single variable resp and create a variable type, equal to 1 for drinking and 2 for advice.
rename drinks resp1
gen resp2 = advice
reshape long resp, i(id) j(type)
(note: j = 1 2)
Data wide -> long
-----------------------------------------------------------------------------
Number of obs. 737 -> 1474
Number of variables 11 -> 11
j variable (2 values) -> type
xij variables:
resp1 resp2 -> resp
-----------------------------------------------------------------------------
Create dummies d1 for type=1 (drinking) and d2 for type=2 (advice).
tab type, gen(d)
sort id type
list in 1/10, clean
       id   type   advice   black   hlthins   regmed   heart   hieduc   wt2   cons   resp   d1   d2
  1.    1      1        0       0         0        0       0        0     7      1      0    1    0
  2.    1      2        0       0         0        0       0        0     7      1      0    0    1
  3.    2      1        1       0         0        0       0        0     3      1      0    1    0
  4.    2      2        1       0         0        0       0        0     3      1      1    0    1
  5.    3      1        0       0         0        0       0        0     1      1      1    1    0
  6.    3      2        0       0         0        0       0        0     1      1      0    0    1
  7.    4      1        1       0         0        0       0        0     1      1      1    1    0
  8.    4      2        1       0         0        0       0        0     1      1      1    0    1
  9.    5      1        0       0         0        0       0        0     2      1      2    1    0
 10.    5      2        0       0         0        0       0        0     2      1      0    0    1
Create interactions between d1 and covariates in the drinking model:
gen d1_advice = d1*advice
gen d1_hieduc = d1*hieduc
gen d1_black = d1*black
Create interactions between d2 and covariates in advice model (use foreach to save typing):
foreach var in hieduc black hlthins regmed heart {
gen d2_`var' = d2*`var'
}
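The stacking and interaction steps can be tried out on a small scale. The sketch below uses a two-person toy data set with hypothetical values and repeats the commands above for one covariate per equation. The assert lines check the property the joint model relies on: each interaction is zero in the rows belonging to the other response.

```stata
* Toy data with hypothetical values: two people
clear
input id drinks advice hieduc
1 0 0 1
2 4 1 0
end
* Stack the two responses, as in the steps above
rename drinks resp1
gen resp2 = advice
reshape long resp, i(id) j(type)
tab type, gen(d)
gen d1_advice = d1*advice
gen d2_hieduc = d2*hieduc
* Each interaction must vanish in the rows of the other response
assert d1_advice == 0 if type == 2
assert d2_hieduc == 0 if type == 1
list, clean
```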
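Endogenous treatment model for Table 14.7. The equation fac, defined with eq, makes the single random effect at the id level enter both the drinking (d1) and advice (d2) parts of the model, with a factor loading for each, one of which gllamm fixes at 1: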
eq fac: d1 d2
gllamm resp d1_advice d1 d1_hieduc d1_black d2 d2_hieduc d2_black d2_hlthins /*
*/ d2_regmed d2_heart, nocons i(id) weight(wt) family(poisson binom) /*
*/ link(log probit) fv(type) lv(type) eq(fac) adapt nip(15)
number of level 1 units = 4934
number of level 2 units = 2467
Condition Number = 23.071575
gllamm model
log likelihood = -10254.241
------------------------------------------------------------------------------
resp | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
d1_advice | -2.412288 .2308401 -10.45 0.000 -2.864727 -1.95985
d1 | 2.323782 .0932252 24.93 0.000 2.141064 2.5065
d1_hieduc | -.2842432 .098614 -2.88 0.004 -.477523 -.0909633
d1_black | .1103129 .1439681 0.77 0.444 -.1718595 .3924853
d2 | -1.117855 .1629308 -6.86 0.000 -1.437194 -.7985166
d2_hieduc | -.3988385 .104094 -3.83 0.000 -.6028591 -.1948179
d2_black | .6040407 .1528167 3.95 0.000 .3045254 .903556
d2_hlthins | -.3270483 .0957591 -3.42 0.001 -.5147327 -.1393639
d2_regmed | .3851485 .1000492 3.85 0.000 .1890557 .5812413
d2_heart | .5077518 .1105428 4.59 0.000 .2910919 .7244118
------------------------------------------------------------------------------
Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (id)
var(1): 2.4751449 (.69051946)
loadings for random effect 1
d2: 1 (fixed)
d1: 1.4303603 (.15172818)
------------------------------------------------------------------------------
There are some small discrepancies between these estimates and Table 14.7.
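One way to read the random-effects part of this output is through the error structure it implies. Under the model as specified here (a single normal random effect with variance var(1) loading on both responses, and a probit link whose own error has variance 1), the total residual variance in the advice equation is var(1) + 1. The sketch below is an interpretation aid under those assumptions and not a reproduction of Table 14.7. It computes the implied correlation between the unobserved terms of the two equations and rescales one advice coefficient to the metric of a probit with unit residual variance. The scalar names are made up for illustration.

```stata
* Estimates copied from the gllamm output above
scalar psi = 2.4751449
scalar lambda = 1.4303603
scalar b_d2_hieduc = -.3988385
* Correlation between lambda*eta (drinking) and eta + epsilon (advice)
scalar rho = lambda*psi / (sqrt(lambda^2*psi) * sqrt(psi + 1))
display "implied correlation: " rho
* Advice coefficient on the scale of a probit with unit total variance
display "rescaled d2_hieduc:  " b_d2_hieduc / sqrt(psi + 1)
```

The implied correlation comes out at roughly 0.84. Because the loading is positive it cancels from the expression, leaving the square root of psi/(psi + 1).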
Kenkel, D. S. and Terza, J. V. (2001). The effect of physician advice on alcohol consumption: Count regression with an endogenous treatment effect. Journal of Applied Econometrics 16, 165-184.
Miranda, A. and Rabe-Hesketh, S. (2005). Maximum likelihood estimation of endogenous switching and sample selection models for binary, count, and ordinal variables. Submitted for publication.
Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2002). Reliable estimation of generalised linear mixed models using adaptive quadrature. The Stata Journal 2, 1-21.
Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, FL: Chapman & Hall/CRC Press.