Latent Class Modeling with Covariates: Two Improved Three-Step Approaches

Published online by Cambridge University Press:  04 January 2017

Jeroen K. Vermunt*
Department of Methodology and Statistics, Tilburg University, PO Box 90153, 5000 LE Tilburg, The Netherlands. e-mail:


Researchers using latent class (LC) analysis often proceed using the following three steps: (1) an LC model is built for a set of response variables, (2) subjects are assigned to LCs based on their posterior class membership probabilities, and (3) the association between the assigned class membership and external variables is investigated using simple cross-tabulations or multinomial logistic regression analysis. Bolck, Croon, and Hagenaars (2004) demonstrated that such a three-step approach underestimates the associations between covariates and class membership. They proposed resolving this problem by means of a specific correction method that involves modifying the third step. In this article, I extend the correction method of Bolck, Croon, and Hagenaars by showing that it involves maximizing a weighted log-likelihood function for clustered data. This conceptualization makes it possible to apply the method not only with categorical but also with continuous explanatory variables, to obtain correct tests using complex sampling variance estimation methods, and to implement it in standard software for logistic regression analysis. In addition, a new maximum likelihood (ML)-based correction method is proposed, which is more direct in the sense that it does not require analyzing weighted data. This new three-step ML method can be easily implemented in software for LC analysis. The reported simulation study shows that both correction methods perform very well in the sense that their parameter estimates and their SEs can be trusted, except for situations with very poorly separated classes. The main advantage of the ML method compared with the Bolck, Croon, and Hagenaars approach is that it is much more efficient and almost as efficient as one-step ML estimation.
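As a rough illustration of steps (2) and (3), the sketch below starts from a matrix of posterior class membership probabilities (assumed to come from an already fitted LC model), assigns each subject to the modal class, estimates the classification error matrix, and derives observation weights from its inverse, which is one common way to express the Bolck, Croon, and Hagenaars correction. The function names, the array layout, and the toy posterior values are assumptions made for this example and are not taken from the article.

```python
import numpy as np


def modal_assignment(post):
    """Step 2: one-hot matrix W, where W[i, s] = 1 if class s has the
    highest posterior probability for subject i."""
    n, t = post.shape
    w = np.zeros((n, t))
    w[np.arange(n), post.argmax(axis=1)] = 1.0
    return w


def classification_error_matrix(post, w):
    """Estimate D[t, s] = P(assigned class = s | true class = t)
    by averaging the assignments with the posteriors as weights."""
    return (post.T @ w) / post.sum(axis=0)[:, None]


def bch_weights(post):
    """Weights for the expanded step-3 data set: row i holds one weight
    per class, obtained by multiplying the assignments by inverse(D).
    Weights can be negative, which is expected for this correction."""
    w = modal_assignment(post)
    d = classification_error_matrix(post, w)
    return w @ np.linalg.inv(d)


# Toy posteriors for five subjects and two classes (made-up numbers).
post = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.1, 0.9],
    [0.6, 0.4],
])
weights = bch_weights(post)
print(weights)
```

In a step-3 analysis of this kind, each subject would contribute one record per class to a weighted multinomial logistic regression, with these weights attached. When classes are poorly separated, the matrix D approaches singularity and the weights become unstable, which is consistent with the caveat in the abstract about very poorly separated classes.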

Research Article
Copyright © The Author 2010. Published by Oxford University Press on behalf of the Society for Political Methodology 


Bandeen-Roche, Karen, Miglioretti, Diana L., Zeger, Scott L., and Rathouz, Paul J. 1997. Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association 92: 1375–86.
Blaydes, Lisa, and Linzer, Drew A. 2008. The political economy of women's support for fundamentalist Islam. World Politics 60: 579–609.
Bolck, Annabel, Croon, Marcel A., and Hagenaars, Jacques A. 2004. Estimating latent structure models with categorical variables: One-step versus three-step estimators. Political Analysis 12: 3–27.
Breen, Richard. 2000. Why is support for extreme parties underestimated by surveys? A latent class analysis. British Journal of Political Science 30: 375–82.
Chung, Hwan, Flaherty, Brian P., and Schafer, Joseph L. 2006. Latent class logistic regression: Application to marijuana use and attitudes among high school seniors. Journal of the Royal Statistical Society Series A: Statistics in Society 169: 723–43.
Clogg, Clifford C. 1981. New developments in latent structure analysis. In Factor analysis and measurement in sociological research, ed. Jackson, D. J. and Borgatta, E. F., 215–46. Beverly Hills, CA: Sage.
Collins, Linda M., and Wugalter, Stuart E. 1992. Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research 27: 131–57.
Croon, Marcel A. 2002. Using predicted latent scores in general latent structure models. In Latent variable and latent structure models, ed. Marcoulides, George A. and Moustaki, Irini, 195–224. Mahwah, NJ: Lawrence Erlbaum.
Dalton, Russell J. 2006. The two faces of citizenship. Democracy & Society 3: 21–3.
Dalton, Russell J. 2008. Citizenship norms and the expansion of political participation. Political Studies 56: 76–98.
Dayton, C. Mitchell, and Macready, Geoffrey B. 1988. Concomitant-variable latent-class models. Journal of the American Statistical Association 83: 173–8.
Dias, José G., and Vermunt, Jeroen K. 2008. A bootstrap-based aggregate classifier for model-based clustering. Computational Statistics 23: 643–59.
Edlund, Jonas. 2006. Trust in the capability of the welfare state and general welfare state support: Sweden 1997-2002. Acta Sociologica 49: 395–417.
Feick, Lawrence F. 1989. Latent class analysis of survey questions that include don't know responses. Public Opinion Quarterly 53: 525–47.
Galindo-Garre, Francisca, and Vermunt, Jeroen K. 2006. Avoiding boundary estimates in latent class analysis by Bayesian posterior mode estimation. Behaviormetrika 33: 43–59.
Garrett, Elisabeth S., and Zeger, Scott L. 2000. Latent class model diagnosis. Biometrics 56: 1055–67.
Garrett, Elisabeth S., Eaton, William W., and Zeger, Scott L. 2002. Methods for evaluating the performance of diagnostic tests in the absence of a gold standard: A latent class model approach. Statistics in Medicine 21: 1289–307.
Goodman, Leo A. 1974a. The analysis of systems of qualitative variables when some of the variables are unobservable: Part I: A modified latent structure approach. American Journal of Sociology 79: 1179–259.
Goodman, Leo A. 1974b. Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61: 215–31.
Goodman, Leo A. 2007. On the assignment of individuals to classes. Sociological Methodology 37: 1–22.
Haberman, Shelby J. 1979. Analysis of qualitative data, Vol. 2: New developments. New York: Academic Press.
Hagenaars, Jacques A. 1990. Categorical longitudinal data: Loglinear analysis of panel, trend and cohort data. Newbury Park, CA: Sage.
Hagenaars, Jacques A. 1993. Loglinear models with latent variables. Newbury Park, CA: Sage.
Hill, Jennifer L., and Kriesi, Hanspeter. 2001a. Classification by opinion-changing behavior: A mixture model approach. Political Analysis 9: 301–24.
Hill, Jennifer L., and Kriesi, Hanspeter. 2001b. An extension and test of Converse's ‘black-and-white’ model of response stability. American Political Science Review 95: 397–413.
Howard, Marc M., Gibson, James L., and Stolle, Dietlind. 2005. The U.S. Citizenship, Involvement, Democracy survey. Center for Democracy and Civil Society, Georgetown University.
Kamakura, Wagner A., Wedel, Michel, and Agrawal, Jagdish. 1994. Concomitant variable latent class models for the external analysis of choice data. International Journal of Marketing Research 11: 451–64.
Katz, Jonathan N., and Katz, Gabriel. 2009. Reassessing the link between voter heterogeneity and political accountability: A latent class regression model of economic voting. Paper presented at the 26th Annual Society for Political Methodology Summer Conference, July 23-25, 2009, Yale University.
Lazarsfeld, Paul F., and Henry, Neil W. 1968. Latent structure analysis. Boston, MA: Houghton Mifflin.
Linzer, Drew A. 2006. A comparative analysis of ideological constraint using latent class models. Paper presented at the annual meeting of the Midwest Political Science Association, Palmer House Hilton, Chicago, IL, April 20, 2006.
Lu, Irene R. R., and Thomas, D. Roland. 2008. Avoiding and correcting bias in score-based latent variable regression with discrete manifest items. Structural Equation Modeling 15: 462–90.
Magidson, Jay. 1981. Qualitative variance, entropy, and correlation ratios for nominal dependent variables. Social Science Research 10: 177–94.
Magidson, Jay, and Vermunt, Jeroen K. 2001. Latent class factor and cluster models, bi-plots and related graphical displays. Sociological Methodology 31: 223–64.
McCutcheon, Allan L. 1985. A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly 49: 474–88.
McCutcheon, Allan L. 1987. Latent class analysis. Newbury Park, CA: Sage.
McLachlan, Geoffrey J., and Peel, David. 2000. Finite mixture models. New York: Wiley.
Moors, Guy, and Vermunt, Jeroen K. 2007. Heterogeneity in postmaterialist value priorities. Evidence from a latent class discrete choice approach. European Sociological Review 23: 631–48.
Muthén, Linda K., and Muthén, Bengt O. 2004. Mplus 3.0: User's manual. Los Angeles, CA: Muthén and Muthén.
Patterson, Blossom H., Dayton, C. Mitchell, and Graubard, Barry I. 2002. Latent class analysis of complex sample survey data: Application to dietary data. Journal of the American Statistical Association 97: 721–8.
Rubin, Donald B. 1987. Multiple imputation for nonresponse in surveys. New York: Wiley.
Schafer, Joseph L. 1997. Analysis of incomplete multivariate data. London: Chapman & Hall.
Skinner, Chris J., Holt, Tim, and Smith, T. M. Fred. 1989. Analysis of complex surveys. New York: Wiley.
Simmons, Solon. 2008. Ascriptive justice: The prevalence, distribution, and consequences of political correctness in the academy. Forum 6: 8.
Skrondal, Anders, and Laake, Petter. 2001. Regression among factor scores. Psychometrika 66: 563–76.
Van den Hout, Ardo, and Van der Heijden, Peter G. M. 2004. The analysis of multivariate misclassified data, with special attention to randomized response data. Sociological Methods and Research 32: 310–36.
Van de Pol, Frank, and Langeheine, Rolf. 1990. Mixed Markov latent class models. Sociological Methodology 20: 213–47.
Van der Heijden, Peter G. M., Gilula, Zvi, and Van der Ark, L. Andries. 1999. An extended study into the relationship between correspondence analysis and latent class analysis. Sociological Methodology 29: 147–86.
Vermunt, Jeroen K. 1997. Log-linear models for event histories. Advanced quantitative techniques in the social sciences series. Thousand Oaks, CA: Sage.
Vermunt, Jeroen K. 2003. Multilevel latent class models. Sociological Methodology 33: 213–39.
Vermunt, Jeroen K. 2005. Mixed-effects logistic regression models for indirectly observed outcome variables. Multivariate Behavioral Research 40: 281–301.
Vermunt, Jeroen K. 2008. Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research 17: 33–51.
Vermunt, Jeroen K., Langeheine, Rolf, and Böckenholt, Ulf. 1999. Discrete-time discrete-state latent Markov models with time-constant and time-varying covariates. Journal of Educational and Behavioral Statistics 24: 178–205.
Vermunt, Jeroen K., and Magidson, Jay. 2004. Latent class analysis. In The Sage encyclopedia of social science research methods, ed. Lewis-Beck, Michael, Bryman, Alan, and Liao, Tim F., 549–53. Newbury Park, CA: Sage.
Vermunt, Jeroen K., and Magidson, Jay. 2005. Latent GOLD 4.0 user's guide. Belmont, MA: Statistical Innovations.
Vermunt, Jeroen K., and Magidson, Jay. 2008. LG-Syntax user's guide: Manual for Latent GOLD 4.5 syntax module. Belmont, MA: Statistical Innovations.
Yamaguchi, Kazuo. 2000. Multinomial logit latent-class regression models: An analysis of the predictors of gender-role attitudes among Japanese women. American Journal of Sociology 105: 1702–40.