Fixed effects in rare events data: a penalized maximum likelihood solution

Scott J. Cook, Jude C. Hays and Robert J. Franzese


Most agree that models of binary time-series-cross-sectional data in political science often possess unobserved unit-level heterogeneity. Despite this, there is no clear consensus on how best to account for these potential unit effects, with many of the issues confronted seemingly misunderstood. For example, one oft-discussed concern with rare events data is the elimination of no-event units from the sample when estimating fixed effects models. Many argue that this is a reason to eschew fixed effects in favor of pooled or random effects models. We revisit this issue and clarify that the main concern with fixed effects models of rare events data is not inaccurate or inefficient coefficient estimation, but instead biased marginal effects. In short, only evaluating event-experiencing units gives an inaccurate estimate of the baseline risk, yielding inaccurate (often inflated) estimates of predictor effects. As a solution, we propose a penalized maximum likelihood fixed effects (PML-FE) estimator, which retains the complete sample by providing finite estimates of the fixed effects for each unit. We explore the small sample performance of PML-FE versus common alternatives via Monte Carlo simulations, evaluating the accuracy of both parameter and effects estimates. Finally, we illustrate our method with a model of civil war onset.
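To make the idea concrete, the following is a minimal sketch of a penalized-likelihood logit with unit dummies. It is not the authors' implementation: the abstract does not state the exact penalty, so a Firth-type (Jeffreys prior) penalty is assumed here, and the function name `firth_logit`, the toy data, and all variable names are hypothetical. The toy panel has three units, one of which never experiences an event; ordinary maximum likelihood would push that unit's fixed effect toward minus infinity, whereas the penalized fit keeps it finite, so the unit stays in the sample when effects are averaged.

```python
import numpy as np

def firth_logit(X, y, max_iter=200, tol=1e-8):
    """Logit fit with an assumed Firth-type (Jeffreys prior) penalty."""
    def pen_loglik(b):
        # Log-likelihood plus 0.5 * log det of the Fisher information.
        eta = X @ b
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        _, logdet = np.linalg.slogdet(X.T @ (X * w[:, None]))
        return np.sum(y * eta - np.logaddexp(0.0, eta)) + 0.5 * logdet

    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        info_inv = np.linalg.inv(X.T @ (X * w[:, None]))
        # Leverages h_i = w_i * x_i' I^{-1} x_i (diagonal of the hat matrix).
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Modified score: the usual score plus the penalty's gradient.
        step = info_inv @ (X.T @ (y - p + h * (0.5 - p)))
        # Step halving so the penalized log-likelihood does not decrease.
        current = pen_loglik(beta)
        for _ in range(30):
            if pen_loglik(beta + step) >= current:
                break
            step = step / 2.0
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical toy panel: 3 units, 8 periods each; unit C has no events.
x_unit = np.linspace(-1.75, 1.75, 8)
y = np.concatenate([
    [0, 0, 1, 0, 1, 1, 0, 1],   # unit A: 4 events
    [0, 0, 0, 0, 0, 1, 0, 0],   # unit B: 1 event
    [0, 0, 0, 0, 0, 0, 0, 0],   # unit C: no events
]).astype(float)
dummies = np.kron(np.eye(3), np.ones((8, 1)))      # one dummy per unit
X = np.column_stack([np.tile(x_unit, 3), dummies])  # columns: x, A, B, C

beta = firth_logit(X, y)
slope, fe_a, fe_b, fe_c = beta

# Average marginal effect of x, using the same coefficients throughout:
# over the full sample versus over event-experiencing units (A, B) only.
p_hat = 1.0 / (1.0 + np.exp(-X @ beta))
marginal = slope * p_hat * (1.0 - p_hat)
ame_all = marginal.mean()
ame_event_units = marginal[:16].mean()
print("coefficients:", np.round(beta, 3))
print("AME, all units:", round(ame_all, 4))
print("AME, event units only:", round(ame_event_units, 4))
```

In this sketch the no-event unit receives a finite but strongly negative fixed effect, so it contributes a low baseline risk to the average. Dropping it and averaging over units A and B alone gives a larger average marginal effect, which is the kind of inflation the abstract describes. The comparison holds the coefficients fixed purely for illustration.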


Supplementary materials

Cook et al. dataset
Cook et al. supplementary material 1 (PDF, 104 KB)

