Equation (12) is a logit regression. Equation (13) is a linear regression that estimates the impact of the Xs on the square root of the level of spending on each service for individuals with positive spending on that service. We chose the square root transformation, rather than the more common logarithmic transformation, to deal with skewness in the distribution of spending because the smearing estimator for the square root model is less sensitive to heteroskedasticity than the smearing estimator for the log model.11 The difficulties of retransformation in the context of the two-part model have been treated in detail by Manning (1998) and Mullahy (1998). Those papers show how sensitive expected spending estimates can be to distributional properties such as heteroskedasticity.
The use of a transformation to account for skewness in the spending data necessitates use of the “smearing” estimator to retransform the predicted values of spending to the expected levels of spending consistent with the original distributions of spending (Duan et al., 1983). Since this application calls for predicting 1993 spending using 1992 data and coefficients from the two-part model of 1992 spending on 1991 right-hand-side variables, the smearing factor is taken from the error term of the 1991-1992 regressions. Since we use a square root transformation, the smearing factor is additive, as opposed to the multiplicative form that arises with the logarithmic transformation. The resulting empirical analysis consists of a set of 18 regressions for each of the two informational assumptions we make.
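The additive form of the smearing factor can be illustrated with a minimal sketch. It assumes a single regressor, an intercept, and made-up spending figures; the function names and the probability of positive spending are hypothetical and are not taken from the estimation reported here. Under the square root model, expected spending conditional on positive use is the squared linear prediction plus the mean squared residual from the estimation sample, and the two-part prediction multiplies this by the predicted probability of any spending.

```python
import math


def fit_ols(x, y):
    """Least-squares fit of y = a + b*x with one regressor; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def smearing_factor(x, y_sqrt, a, b):
    """Additive smearing factor: mean squared residual on the sqrt scale."""
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y_sqrt)]
    return sum(r * r for r in residuals) / len(residuals)


def expected_spending(p_positive, x_new, a, b, smear):
    """Two-part prediction: P(spending > 0) * E[spending | spending > 0]."""
    return p_positive * ((a + b * x_new) ** 2 + smear)


# Hypothetical estimation sample: users with positive spending only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
spend = [110.0, 380.0, 950.0, 1500.0, 2700.0]
y_sqrt = [math.sqrt(s) for s in spend]

a, b = fit_ols(x, y_sqrt)
smear = smearing_factor(x, y_sqrt, a, b)

# Naive retransformation squares the prediction and ignores the error term.
naive = [(a + b * xi) ** 2 for xi in x]
# Smeared retransformation adds the same constant to every prediction.
smeared = [v + smear for v in naive]

mean_actual = sum(spend) / len(spend)
mean_naive = sum(naive) / len(naive)
mean_smeared = sum(smeared) / len(smeared)

# Prediction for a new individual, assuming (hypothetically) that the
# first-part logit gave a 0.3 probability of positive spending.
prediction = expected_spending(0.3, 3.5, a, b, smear)
```

In this sketch the naive retransformation understates mean spending in the estimation sample, while the smeared predictions reproduce it, because least-squares residuals are orthogonal to the fitted values when an intercept is included. With a logarithmic model the correction would instead multiply the exponentiated prediction by the mean of the exponentiated residuals.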
Plan Enrollment: We assume that competing managed care plans are in a symmetric equilibrium, so that each plan enrolls a representative sample of the population. To estimate plan spending on each service, the $\sum_i n_i m_{is}$ numerator of (10), we simply use average spending in the sample.
We summarize the predictions of the 18 two-part models in Table 3 by reporting the correlations between actual and predicted service-specific spending levels. This correlation is negatively and monotonically related to the absolute prediction error of the spending model. As expected, correlations between actual and predicted spending are generally quite low for all services when consumers know only age- and sex-related information.
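As a minimal sketch of the summary statistic described above, the correlation between actual and predicted spending for one service can be computed as follows. The spending figures and the function name are hypothetical and serve only to illustrate the calculation; they are not drawn from Table 3.

```python
import math


def pearson_correlation(actual, predicted):
    """Pearson correlation between actual and predicted spending."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical data for one service: many zeros and a skewed right tail.
actual = [0.0, 0.0, 120.0, 0.0, 2400.0, 60.0]
predicted = [40.0, 55.0, 90.0, 35.0, 300.0, 70.0]

r = pearson_correlation(actual, predicted)
```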
The birth-related correlation between actual and predicted spending is, however, relatively large at 0.21. With prior use, the correlation between predicted and actual spending improves markedly for most services. For example, birth, mental health, hypertension and gastrointestinal conditions all have relatively high correlations between actual and predicted spending when consumers are assumed to know the prior level of service-specific spending (0.216, 0.306, 0.227 and 0.184, respectively).