Improving Predictions using Ensemble Bayesian Model Averaging

Jacob M. Montgomery; Florian M. Hollenbach; Michael D. Ward

doi:10.1093/pan/mps002

Published online by Cambridge University Press: 04 January 2017

Affiliations:
Jacob M. Montgomery: Department of Political Science, Washington University in St Louis, Campus Box 1063, One Brookings Drive, St Louis, MO 63130-4899
Florian M. Hollenbach: Department of Political Science, Duke University, Perkins Hall 326, Box 90204, Durham, NC 27707-4330
Michael D. Ward*: Department of Political Science, Duke University, Perkins Hall 326, Box 90204, Durham, NC 27707-4330
* Corresponding author. E-mail: michael.d.ward@duke.edu

Abstract

We present ensemble Bayesian model averaging (EBMA) and illustrate its ability to aid scholars in the social sciences to make more accurate forecasts of future events. In essence, EBMA improves prediction by pooling information from multiple forecast models to generate ensemble predictions similar to a weighted average of component forecasts. The weight assigned to each forecast is calibrated via its performance in some validation period. The aim is not to choose some “best” model, but rather to incorporate the insights and knowledge implicit in various forecasting efforts via statistical postprocessing. After presenting the method, we show that EBMA increases the accuracy of out-of-sample forecasts relative to component models in three applied examples: predicting the occurrence of insurgencies around the Pacific Rim, forecasting vote shares in U.S. presidential elections, and predicting the votes of U.S. Supreme Court Justices.
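The weighting idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes continuous outcomes, normal predictive densities centered on each component forecast, and a single shared variance, and it estimates the weights with an EM loop of the kind commonly used for ensemble BMA. The function names `fit_ebma_weights` and `ebma_predict` and the toy numbers are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(y, mean, sigma):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_ebma_weights(forecasts, outcomes, n_iter=500, tol=1e-10):
    """Estimate mixture weights from validation-period performance (EM sketch).

    forecasts: list of K lists, each holding T validation-period predictions.
    outcomes:  list of T observed values for the same validation period.
    Assumption: normal components with one shared standard deviation.
    """
    K, T = len(forecasts), len(outcomes)
    weights = [1.0 / K] * K
    # Start the shared sigma at the pooled root mean squared error.
    sse = sum((outcomes[t] - forecasts[k][t]) ** 2 for k in range(K) for t in range(T))
    sigma = max(math.sqrt(sse / (K * T)), 1e-6)

    for _ in range(n_iter):
        # E-step: how plausibly did each model "generate" each observed outcome?
        resp = []
        for t in range(T):
            dens = [weights[k] * normal_pdf(outcomes[t], forecasts[k][t], sigma)
                    for k in range(K)]
            total = sum(dens)
            # Fall back to the current weights if every density underflows to zero.
            resp.append([d / total for d in dens] if total > 0.0 else list(weights))

        # M-step: new weights are average responsibilities; sigma is re-estimated.
        new_weights = [sum(resp[t][k] for t in range(T)) / T for k in range(K)]
        var = sum(resp[t][k] * (outcomes[t] - forecasts[k][t]) ** 2
                  for t in range(T) for k in range(K)) / T
        sigma = max(math.sqrt(var), 1e-6)

        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return weights, sigma


def ebma_predict(weights, component_predictions):
    """Ensemble point prediction: weighted average of the component forecasts."""
    return sum(w * p for w, p in zip(weights, component_predictions))


# Toy validation period (made-up numbers): model A tracks the outcomes closely,
# model B does not.
observed = [50.0, 52.0, 48.0, 55.0, 47.0, 51.0]
model_a = [50.5, 51.6, 48.3, 54.5, 47.4, 50.8]
model_b = [45.0, 58.0, 52.0, 49.0, 53.0, 46.0]

weights, sigma = fit_ebma_weights([model_a, model_b], observed)
print("weights:", weights)
print("ensemble forecast:", ebma_predict(weights, [53.0, 49.0]))
```

In this sketch the model that performed better during validation receives most of the weight, but the weaker model is not discarded outright, which mirrors the abstract's point that the aim is to pool forecasts rather than to select a single "best" model.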

Type: Research Article
Information: Political Analysis, Volume 20, Issue 3, Summer 2012, pp. 271-291

DOI: https://doi.org/10.1093/pan/mps002
Copyright: Copyright © The Author 2012. Published by Oxford University Press on behalf of the Society for Political Methodology

Footnotes

Authors' note: For generously sharing their data and models with us, we thank Alan Abramowitz, James Campbell, Robert Erikson, Ray Fair, Douglas Hibbs, Michael Lewis-Beck, Andrew D. Martin, Kevin Quinn, Stephen Shellman, Charles Tien, and Christopher Wlezien. We especially want to thank Adrian Raftery and Brendan Nyhan for their encouragement and feedback as this project evolved. The editor and the reviewers of Political Analysis provided especially salient suggestions that substantially improved our research.

