
Looking for twins: how to build better counterfactuals with matching

Published online by Cambridge University Press: 09 February 2021

Stefano Costalli
Department of Political and Social Sciences, Università degli Studi di Firenze, Firenze, Italy
Fedra Negri*
Department of Social and Political Sciences, Università degli Studi di Milano, Milano, Italy
*Corresponding author. Email:


A primary challenge for researchers who use observational data is selection bias: because selection into treatment is not random, the units of analysis differ systematically across the treatment and control groups. This article encourages researchers to acknowledge this problem and discusses how and, more importantly, under which assumptions they may resort to statistical matching techniques to reduce the imbalance in the empirical distribution of pre-treatment observable variables between the treatment and control groups. To provide practical guidance, the article engages with the evaluation of the effectiveness of peacekeeping missions in the case of the Bosnian civil war, a research topic in which selection bias is a structural feature of the observational data researchers have to use, and shows how to apply Coarsened Exact Matching (CEM), the most widely used matching algorithm in the fields of Political Science and International Relations.
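To make the idea behind CEM concrete, the following is a minimal sketch in plain Python, not the authors' code and not the API of the `cem` software. It coarsens each covariate into bins, keeps only the strata that contain both treated and control units, and assigns the usual CEM weights (1 for treated units; for controls, the stratum's treated-to-control ratio rescaled by the overall matched control-to-treated ratio). The unit identifiers, covariate names (`population`, `prewar_violence`), values, and cutpoints are all hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def coarsen(value, cuts):
    """Return the index of the bin that `value` falls into, given sorted cutpoints."""
    return bisect_right(cuts, value)


def cem_weights(units, cutpoints):
    """Return {unit id: CEM weight} for matched units; unmatched units are omitted."""
    # Step 1: group units into strata defined by their coarsened covariates.
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for unit in units:
        signature = tuple(coarsen(unit[name], cuts) for name, cuts in cutpoints.items())
        group = "treated" if unit["treated"] else "control"
        strata[signature][group].append(unit)

    # Step 2: keep only strata with at least one treated and one control unit.
    matched = [g for g in strata.values() if g["treated"] and g["control"]]
    n_treated = sum(len(g["treated"]) for g in matched)
    n_control = sum(len(g["control"]) for g in matched)

    # Step 3: weight controls so each stratum mirrors the treated distribution.
    weights = {}
    for g in matched:
        control_weight = (len(g["treated"]) / len(g["control"])) * (n_control / n_treated)
        for unit in g["treated"]:
            weights[unit["id"]] = 1.0
        for unit in g["control"]:
            weights[unit["id"]] = control_weight
    return weights


# Hypothetical units (e.g. municipalities); all values are invented for illustration.
units = [
    {"id": "A", "treated": True, "population": 12_000, "prewar_violence": 3},
    {"id": "B", "treated": False, "population": 15_000, "prewar_violence": 4},
    {"id": "C", "treated": False, "population": 18_000, "prewar_violence": 2},
    {"id": "D", "treated": True, "population": 90_000, "prewar_violence": 40},
    {"id": "E", "treated": False, "population": 70_000, "prewar_violence": 35},
    {"id": "F", "treated": False, "population": 300_000, "prewar_violence": 5},
]
# Hypothetical cutpoints; in practice the coarsening should be substantively motivated.
cutpoints = {"population": [50_000, 200_000], "prewar_violence": [10, 30]}

weights = cem_weights(units, cutpoints)
print(weights)
```

In this toy example unit F has no treated counterpart in its stratum and is dropped, which illustrates the trade-off of exact matching on coarsened values: better balance at the cost of discarding unmatched observations. The resulting weights would then be passed to a weighted outcome analysis.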

Research Article
Copyright © The Author(s), 2021. Published by Cambridge University Press on behalf of the Società Italiana di Scienza Politica.



Supplementary material: Link

Costalli and Negri Dataset
