### Generalized Latent Variable Modeling by Skrondal and Rabe-Hesketh, Section 14.5: Physician advice and drinking (endogenous treatment model)

The data set used in this section is kenkel.dat. We thank the Journal of Applied Econometrics for making these data, which were used in Kenkel and Terza (2001), available through the Journal of Applied Econometrics Data Archive.

The do-file is kenkel.do.

#### I. Easy method for estimating the model using gllamm wrapper ssm

The programs used are gllamm and ssm.

```
ssm drinks advice black hieduc, s(advice = black hieduc hlthins regmed heart) /*

```

#### II. More difficult method for estimating the model using gllamm

The program used is gllamm. The first three models can also be estimated using Stata's own commands poisson, xtpoisson and probit, but we use gllamm throughout to make the syntax of the final model easier to understand.

Read data and display first ten records:
```
insheet using kenkel.dat, clear
list in 1/10, clean

       drinks   advice   black   hlthins   regmed   heart   hieduc
  1.       60        0       0         1        1       0        1
  2.       84        0       0         1        1       0        0
  3.        4        0       0         1        1       0        1
  4.        0        1       0         1        0       0        1
  5.       12        0       0         1        1       0        0
  6.       12        0       0         1        1       0        0
  7.        5        0       0         1        1       0        1
  8.        7        0       0         1        1       0        0
  9.        0        0       0         1        0       0        1
 10.      1.5        1       0         1        1       1        1
```

The dependent variable for the analysis is the number of alcoholic beverages consumed in the last two weeks. This is calculated as the product of self-reported drinking frequency (the number of days in the past two weeks with any drinking) and drinking intensity (the average number of drinks on a day with any drinking). We round this to the nearest integer to obtain a proper count.
```
replace drinks=round(drinks,1)

```

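Collapse the data to one record per unique combination of the variables and generate the frequency weight variable wt2 to speed up estimation. In gllamm, the option weight(wt) names the stub of the weight variables, and the suffix gives the level at which each weight applies, so wt2 is used as the level-2 frequency weight and the data are weighted appropriately.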
```
disp _N

2467

gen one=1
collapse (sum) wt2=one, by(black hlthins regmed heart hieduc drinks advice)
disp _N

737

gen id=_n
gen cons=1
list in 1/10, clean

       drinks   advice   black   hlthins   regmed   heart   hieduc   wt2   id   cons
  1.        0        0       0         0        0       0        0     7    1      1
  2.        0        1       0         0        0       0        0     3    2      1
  3.        1        0       0         0        0       0        0     1    3      1
  4.        1        1       0         0        0       0        0     1    4      1
  5.        2        0       0         0        0       0        0     2    5      1
  6.        2        1       0         0        0       0        0     3    6      1
  7.        3        0       0         0        0       0        0     1    7      1
  8.        4        0       0         0        0       0        0     4    8      1
  9.        5        0       0         0        0       0        0     1    9      1
 10.        6        0       0         0        0       0        0     1   10      1
```
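The collapsing step relies on the fact that fitting a model to unique data patterns with frequency weights gives the same estimates as fitting it to the full data. The following is a minimal sketch of that idea on made-up toy data, using Stata's own `poisson` command rather than gllamm. The variables `y`, `x`, `w` and the matrices `b_full`, `b_coll` are hypothetical names chosen for this illustration.

```stata
* Toy illustration with made-up data (not kenkel.dat).
* preserve/restore keeps whatever data are currently in memory intact.
preserve
clear
input y x
0 0
0 0
1 0
1 1
1 1
3 1
end

* Fit on the full (uncollapsed) toy data
quietly poisson y x
matrix b_full = e(b)

* Collapse to unique (y, x) patterns and count how often each occurs
gen one = 1
collapse (sum) w = one, by(y x)

* Refit on the collapsed data, weighting each pattern by its frequency
quietly poisson y x [fweight = w]
matrix b_coll = e(b)
restore

* Relative difference between the two coefficient vectors (should be near 0)
display mreldif(b_full, b_coll)
```

The same logic applies to the real data: 737 weighted patterns stand in for the 2467 original records.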

Poisson model for drinking for Table 14.7:
```
gllamm drinks advice cons hieduc black, i(id) weight(wt) family(poisson) /*
*/ nocons init

number of level 1 units = 2467

Condition Number = 3.8352842

gllamm model

log likelihood = -32939.148

------------------------------------------------------------------------------
      drinks |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      advice |    .473367    .010918    43.36   0.000     .4519681    .4947658
        cons |   2.650541   .0084928   312.09   0.000     2.633896    2.667187
      hieduc |  -.1826093   .0107983   -16.91   0.000    -.2037736   -.1614451
       black |  -.3096866   .0168905   -18.33   0.000    -.3427914   -.2765818
------------------------------------------------------------------------------
```

Overdispersed Poisson model for drinking for Table 14.7 (iteration log not shown):
```

number of level 1 units = 2467
number of level 2 units = 2467

Condition Number = 3.9794068

gllamm model

log likelihood = -8857.8425

------------------------------------------------------------------------------
      drinks |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      advice |   .5954294   .0815787     7.30   0.000      .435538    .7553207
        cons |   1.429521   .0600222    23.82   0.000      1.31188    1.547162
      hieduc |   .0277592   .0740012     0.38   0.708    -.1172806     .172799
       black |  -.2835224   .1090915    -2.60   0.009    -.4973378    -.069707
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (id)

var(1): 2.899832 (.1132476)
------------------------------------------------------------------------------
```
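For orientation, the overdispersed Poisson model fitted above can be written as follows. This is a sketch in our own notation (the symbols are not taken from the book), assuming the log link and a normally distributed random intercept, one per level-2 unit:

```latex
\begin{aligned}
y_i \mid \zeta_i &\sim \text{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \beta_1\,\text{advice}_i + \beta_2\,\text{hieduc}_i + \beta_3\,\text{black}_i + \zeta_i, \\
\zeta_i &\sim N(0, \psi).
\end{aligned}
```

Here `var(1)` in the output is the estimate of $\psi$ (about 2.90), and the ordinary Poisson model is the special case $\zeta_i = 0$. Because every level-2 unit contains a single level-1 unit, $\zeta_i$ acts purely as an overdispersion term.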

Probit model for advice for Table 14.7 (iteration log not shown):
```
gllamm advice cons hieduc black hlthins regmed heart, i(id) weight(wt) family(binomial) /*
*/ link(probit) nocons init

number of level 1 units = 2467

Condition Number = 6.4702064

gllamm model

log likelihood = -1419.9041

------------------------------------------------------------------------------
      advice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |  -.4785403   .0849039    -5.64   0.000     -.644949   -.3121317
      hieduc |  -.2520195   .0560241    -4.50   0.000    -.3618247   -.1422143
       black |   .3031406   .0780889     3.88   0.000     .1500891    .4561921
     hlthins |  -.2708712   .0704249    -3.85   0.000    -.4089013    -.132841
      regmed |   .1801328   .0738763     2.44   0.015     .0353379    .3249278
       heart |   .1661613   .0757854     2.19   0.028     .0176246     .314698
------------------------------------------------------------------------------
```
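As noted at the start of this part, the models without random effects can also be fitted with Stata's own commands. The following is a possible cross-check, assuming the collapsed data (one row per pattern, with the frequency weight wt2) are still in memory, that is, before the reshape step below. The estimates should be close to the gllamm output, though trailing digits may differ.

```stata
* Assumes the collapsed kenkel data with frequency weight wt2 are in memory.

* Poisson model for drinking (compare with the first gllamm model)
poisson drinks advice hieduc black [fweight = wt2]

* Probit model for advice (compare with the gllamm probit model)
probit advice hieduc black hlthins regmed heart [fweight = wt2]
```

With frequency weights, both commands should report 2467 observations, matching the number of level-1 units shown by gllamm.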

Prepare the data for the endogenous treatment model for Table 14.7:

Stack drinking and advice into a single variable resp, and create a variable type equal to 1 for drinking and 2 for advice:
```
rename drinks resp1
gen resp2 = advice
reshape long resp, i(id) j(type)

(note: j = 1 2)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      737   ->    1474
Number of variables                  11   ->      11
j variable (2 values)                     ->   type
xij variables:
                            resp1 resp2   ->   resp
-----------------------------------------------------------------------------
```

Create dummies d1 for type=1 (drinking) and d2 for type=2 (advice).
```
tab type, gen(d)
sort id type
list in 1/10, clean

       id   type   advice   black   hlthins   regmed   heart   hieduc   wt2   cons   resp   d1   d2
  1.    1      1        0       0         0        0       0        0     7      1      0    1    0
  2.    1      2        0       0         0        0       0        0     7      1      0    0    1
  3.    2      1        1       0         0        0       0        0     3      1      0    1    0
  4.    2      2        1       0         0        0       0        0     3      1      1    0    1
  5.    3      1        0       0         0        0       0        0     1      1      1    1    0
  6.    3      2        0       0         0        0       0        0     1      1      0    0    1
  7.    4      1        1       0         0        0       0        0     1      1      1    1    0
  8.    4      2        1       0         0        0       0        0     1      1      1    0    1
  9.    5      1        0       0         0        0       0        0     2      1      2    1    0
 10.    5      2        0       0         0        0       0        0     2      1      0    0    1
```
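The stacking and dummy-creation steps can be tried out on a tiny made-up data set. This is a sketch only: the two toy records are invented, `n_long` is a hypothetical scalar name, and the example assumes that the advice indicator is copied into a variable resp2 before reshaping so that advice itself stays available as a covariate.

```stata
* Toy illustration with made-up data (not kenkel.dat).
* preserve/restore keeps whatever data are currently in memory intact.
preserve
clear
input id drinks advice
1 0 0
2 3 1
end

* Response of type 1 is drinking, response of type 2 is advice
rename drinks resp1
gen resp2 = advice

* Stack the two responses: two rows per id, indexed by type
reshape long resp, i(id) j(type)

* d1 = 1 on drinking rows, d2 = 1 on advice rows
tab type, gen(d)

* Checks on the stacked layout
assert _N == 4
assert resp == advice if type == 2
assert d1 == (type == 1)
assert d2 == (type == 2)
list, clean
scalar n_long = _N
restore
```

Each id contributes one row per response type, which is why the number of observations doubles from 737 to 1474 in the real data.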

Create interactions between d1 and the covariates in the drinking model:
```
gen d1_advice = d1*advice
gen d1_hieduc = d1*hieduc
gen d1_black = d1*black

```

Create interactions between d2 and covariates in advice model (use foreach to save typing):
```
foreach var in hieduc black hlthins regmed heart {
    gen d2_`var' = d2*`var'
}

```

Endogenous treatment model for Table 14.7.
```
eq fac: d1 d2
gllamm resp d1_advice d1 d1_hieduc d1_black d2 d2_hieduc d2_black d2_hlthins /*
*/ d2_regmed d2_heart, nocons i(id) weight(wt) family(poisson binom)     /*

number of level 1 units = 4934
number of level 2 units = 2467

Condition Number = 23.071575

gllamm model

log likelihood = -10254.241

------------------------------------------------------------------------------
        resp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   d1_advice |  -2.412288   .2308401   -10.45   0.000    -2.864727    -1.95985
          d1 |   2.323782   .0932252    24.93   0.000     2.141064      2.5065
   d1_hieduc |  -.2842432    .098614    -2.88   0.004     -.477523   -.0909633
    d1_black |   .1103129   .1439681     0.77   0.444    -.1718595    .3924853
          d2 |  -1.117855   .1629308    -6.86   0.000    -1.437194   -.7985166
   d2_hieduc |  -.3988385    .104094    -3.83   0.000    -.6028591   -.1948179
    d2_black |   .6040407   .1528167     3.95   0.000     .3045254     .903556
  d2_hlthins |  -.3270483   .0957591    -3.42   0.001    -.5147327   -.1393639
   d2_regmed |   .3851485   .1000492     3.85   0.000     .1890557    .5812413
    d2_heart |   .5077518   .1105428     4.59   0.000     .2910919    .7244118
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (id)

var(1): 2.4751449 (.69051946)

d2: 1 (fixed)
d1: 1.4303603 (.15172818)

------------------------------------------------------------------------------
```
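Reading the output above, the fitted model can be sketched as a Poisson equation for drinking and a probit equation for advice that share one latent variable. The notation is our own (not taken from the book) and assumes a log link for drinking and a probit link for advice:

```latex
\begin{aligned}
\log \mu_i &= \beta_0 + \alpha\,\text{advice}_i + \beta_1\,\text{hieduc}_i + \beta_2\,\text{black}_i + \lambda\,\zeta_i, \\
\text{advice}_i^{*} &= \gamma_0 + \gamma_1\,\text{hieduc}_i + \gamma_2\,\text{black}_i + \gamma_3\,\text{hlthins}_i + \gamma_4\,\text{regmed}_i + \gamma_5\,\text{heart}_i + \zeta_i + \epsilon_i, \\
\text{advice}_i &= 1 \text{ if } \text{advice}_i^{*} > 0, \text{ and } 0 \text{ otherwise}, \\
\zeta_i &\sim N(0, \psi), \qquad \epsilon_i \sim N(0, 1).
\end{aligned}
```

In this notation the output gives $\alpha \approx -2.41$ (d1_advice), $\lambda \approx 1.43$ (the loading for d1, with the loading for d2 fixed at 1) and $\psi \approx 2.48$ (var(1)). The shared $\zeta_i$ lets unobserved factors affect both the chance of receiving advice and the amount of drinking. Once this dependence is modeled, the estimated advice coefficient changes sign relative to the two Poisson models above (0.47 and 0.60).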

There are some small discrepancies between these estimates and Table 14.7.

#### References

Kenkel, D. S. and Terza, J. V. (2001). The effect of physician advice on alcohol consumption: Count regression with an endogenous treatment effect. Journal of Applied Econometrics 16, 165-184.

Miranda, A. and Rabe-Hesketh, S. (2005). Maximum likelihood estimation of endogenous switching and sample selection models for binary, count, and ordinal variables. Submitted for publication.

Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2002). Reliable estimation of generalised linear mixed models using adaptive quadrature. The Stata Journal 2, 1-21.

Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, FL: Chapman & Hall/CRC Press.
