Ecological Regression with Partial Identification

Wenxin Jiang, Gary King, Allen Schmaltz, and Martin A. Tanner

Abstract

Ecological inference (EI) is the process of learning about individual behavior from aggregate data. We relax assumptions by allowing for “linear contextual effects,” which previous work has regarded as plausible but avoided because of nonidentification, a problem we sidestep by deriving bounds instead of point estimates. In this way, we offer a conceptual framework to improve on the Duncan–Davis bound, derived more than 65 years ago. To study the effectiveness of our approach, we collect and analyze 8,430 $2\times 2$ EI datasets with known ground truth from several sources, bringing considerably more data to bear on the problem than the dozen or so datasets previously available in the literature for evaluating EI estimators. For the 88% of real datasets in our collection that fit a proposed rule, our approach reduces the width of the Duncan–Davis bound, on average, by about 44%, while still capturing the true district-level parameter about 99% of the time. The remaining 12% revert to the Duncan–Davis bound.
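
For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Duncan–Davis (deterministic) bound for a $2\times 2$ table. It is not the narrower bound proposed in the paper. The function names and the inputs `x` (fraction of a precinct in group 1), `t` (fraction of the precinct with the outcome), and `n` (precinct size) are illustrative assumptions, not notation taken from the article.

```python
from typing import Sequence, Tuple


def precinct_bounds(x: float, t: float) -> Tuple[float, float]:
    """Deterministic bounds on the group-1 outcome rate in one precinct.

    x: fraction of the precinct belonging to group 1, in (0, 1]
    t: fraction of the precinct with the outcome, in [0, 1]
    (Hypothetical names chosen for this sketch.)
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("x must be in (0, 1]")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    # The accounting identity t = b1 * x + b2 * (1 - x), with b2 in [0, 1],
    # pins b1 between these two values.
    lower = max(0.0, (t - (1.0 - x)) / x)
    upper = min(1.0, t / x)
    return lower, upper


def district_bounds(
    xs: Sequence[float], ts: Sequence[float], ns: Sequence[float]
) -> Tuple[float, float]:
    """Aggregate precinct bounds to the district level.

    Each precinct is weighted by its group-1 count, n * x.
    """
    weights = [n * x for n, x in zip(ns, xs)]
    total = sum(weights)
    per_precinct = [precinct_bounds(x, t) for x, t in zip(xs, ts)]
    lower = sum(w * lo for w, (lo, _) in zip(weights, per_precinct)) / total
    upper = sum(w * hi for w, (_, hi) in zip(weights, per_precinct)) / total
    return lower, upper
```

The width of the resulting interval, `upper - lower`, is the quantity the abstract reports as being reduced by about 44% on average for datasets that fit the proposed rule.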

Footnotes

Authors’ note: We thank the editor and anonymous reviewers for their helpful comments. This work was partially supported by the Taishan Scholar Construction Project to W.J. and by the Institute for Quantitative Social Science.

Contributing Editor: Jeff Gill

References

Achen, C. H., and Shively, W. P. 1995. Cross-Level Inference. Chicago: University of Chicago Press.
Altman, M., Gill, J., and McDonald, M. 2004. “A Comparison of the Numerical Properties of EI Methods.” In Ecological Inference: New Methodological Strategies, edited by King, G., Rosen, O., and Tanner, M. A., 383–409. New York: Cambridge University Press.
Centers for Disease Control and Prevention (CDC), National Center for Health Statistics. 2017. Underlying Cause of Death 1999–2016 on CDC WONDER Online Database. Data are from the Multiple Cause of Death Files, 1999–2016, as compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program. Accessed at http://wonder.cdc.gov/ucd-icd10.html (retrieved in 2017).
Chambers, R. L., and Steel, D. G. 2001. “Simple Methods for Ecological Inference in $2\times 2$ Tables.” Journal of the Royal Statistical Society, Series A 164(Part 1):175–192.
Chernozhukov, V., Hong, H., and Tamer, E. 2007. “Estimation and Confidence Regions for Parameter Sets in Econometric Models.” Econometrica 75:1243–1284.
Cho, W. K. T., and Manski, C. F. 2008. “Cross Level/Ecological Inference.” In Oxford Handbook of Political Methodology, edited by Box-Steffensmeier, J., Brady, H., and Collier, D., 530–569. Oxford, UK: Oxford University Press.
Duncan, O. D., and Davis, B. 1953. “An Alternative to Ecological Correlation.” American Sociological Review 18:665–666.
Freedman, D. A., Klein, S. P., Sacks, J., Smyth, C. A., and Everett, C. G. 1991. “Ecological Regression and Voting Rights.” Evaluation Review 15:659–817 (with discussion).
Goodman, L. 1953. “Ecological Regression and the Behavior of Individuals.” American Sociological Review 18:663–664.
Imai, K., Lu, Y., and Strauss, A. 2008. “Bayesian and Likelihood Inference for $2\times 2$ Ecological Tables: An Incomplete Data Approach.” Political Analysis 16(1):41–69.
Jiang, W., King, G., Schmaltz, A., and Tanner, M. A. 2019. “Replication Data for: Ecological Regression with Partial Identification,” https://doi.org/10.7910/DVN/W7KZVL, Harvard Dataverse, V1.
Jiang, W., King, G., Schmaltz, A., and Tanner, M. A. 2018. Ecological Regression with Partial Identification. Technical Report. https://arxiv.org/abs/1804.05803. Replication Data: https://doi.org/10.7910/DVN/8TB7GO, Harvard Dataverse, V1.
King, G. 1997. A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data. Princeton: Princeton University Press.
King, G., Rosen, O., and Tanner, M. A. 2004. Ecological Inference: New Methodological Strategies. New York: Cambridge University Press.
Liao, Y., and Jiang, W. 2010. “Bayesian Analysis in Moment Inequality Models.” The Annals of Statistics 38:275–316.
Office of the Registrar General & Census Commissioner, India. 2001. Census of India 2001. Accessed at https://data.gov.in (retrieved in 2017).
Owen, G., and Grofman, B. 1997. “Estimating the Likelihood of Fallacious Ecological Inference: Linear Ecological Regression in the Presence of Context Effects.” Political Geography 16:675–690.
Ruggles, S., Genadek, K., Goeken, R., Grover, J., and Sobek, M. 2017. Integrated Public Use Microdata Series: Version 7.0 [dataset]. Minneapolis, MN: University of Minnesota, 2017. https://doi.org/10.18128/D010.V7.0. Accessed at https://usa.ipums.org/usa/ (retrieved in 2018).
Wakefield, J. 2004. “Prior and Likelihood Choices in the Analysis of Ecological Data.” In Ecological Inference: New Methodological Strategies, edited by King, G., Rosen, O., and Tanner, M. A., 13–50. New York: Cambridge University Press.