
Investigating data-driven biological subtypes of psychiatric disorders using specification-curve analysis

  • Lian Beijers (a1), Hanna M. van Loo (a1), Jan-Willem Romeijn (a2), Femke Lamers (a3), Robert A. Schoevers (a1) (a4) and Klaas J. Wardenaar (a1)...



Cluster analyses have become popular tools for data-driven classification in biological psychiatric research. However, these analyses are known to be sensitive to the chosen methods and/or modelling options, which may hamper generalizability and replicability of findings. To gain more insight into this problem, we used Specification-Curve Analysis (SCA) to investigate the influence of methodological variation on biomarker-based cluster-analysis results.


Proteomics data (31 biomarkers) were used from patients (n = 688) and healthy controls (n = 426) in the Netherlands Study of Depression and Anxiety. In SCAs, consistency of results was evaluated across 1200 k-means and hierarchical clustering analyses, each with a unique combination of the clustering algorithm, fit-index, and distance metric. Next, SCAs were run in simulated datasets with varying cluster numbers and noise/outlier levels to evaluate the effect of data properties on SCA outcomes.


The real-data SCA showed no robust patterns of biological clustering in either the major depressive disorder (MDD) dataset or a combined MDD/healthy dataset. The simulation results showed that the correct number of clusters could be identified quite consistently across the 1200 model specifications, but that correct cluster identification became harder as the number of clusters and the noise levels increased.


SCA can provide useful insights into the presence of clusters in biomarker data. However, SCA is likely to show inconsistent results in real-world biomarker datasets that are complex and contain considerable levels of noise. Here, the number and nature of the observed clusters may depend strongly on the chosen model-specification, precluding conclusions about the existence of biological clusters among psychiatric patients.
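The specification-curve idea summarised above can be illustrated with a minimal Python sketch. It is not the authors' code: the study evaluated 1200 specifications on proteomics data, whereas this toy version crosses only three linkage rules, two distance metrics, and two fit indices (12 specifications) on simulated two-dimensional data. All function names, option lists, and data settings below are assumptions chosen for illustration, and only hierarchical clustering is shown.

```python
import itertools
import random
from collections import Counter


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def average(values):
    values = list(values)
    return sum(values) / len(values)


def agglomerate(points, k, dist, linkage):
    """Merge the two closest clusters until k remain; returns lists of indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in itertools.combinations(range(len(clusters)), 2):
            d = linkage(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # safe because a < b
    return clusters


def silhouette(points, clusters, dist):
    """Mean silhouette width; singleton clusters score 0."""
    scores = []
    for c in clusters:
        for i in c:
            if len(c) == 1:
                scores.append(0.0)
                continue
            a = sum(dist(points[i], points[j]) for j in c if j != i) / (len(c) - 1)
            b = min(average(dist(points[i], points[j]) for j in o)
                    for o in clusters if o is not c)
            scores.append((b - a) / max(a, b))
    return average(scores)


def dunn(points, clusters, dist):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    diameter = max(dist(points[i], points[j])
                   for c in clusters for i in c for j in c)
    separation = min(dist(points[i], points[j])
                     for a, b in itertools.combinations(clusters, 2)
                     for i in a for j in b)
    return separation / diameter


# Hypothetical option lists; the study's own grid is far larger.
METRICS = {"euclidean": euclidean, "manhattan": manhattan}
LINKAGES = {"single": min, "complete": max, "average": average}
INDICES = {"silhouette": silhouette, "dunn": dunn}


def specification_curve(points, k_range=range(2, 6)):
    """For every specification, record the cluster number its fit index prefers."""
    chosen = {}
    for (m, dist), (l, link) in itertools.product(METRICS.items(), LINKAGES.items()):
        solutions = {k: agglomerate(points, k, dist, link) for k in k_range}
        for name, index in INDICES.items():
            chosen[(l, m, name)] = max(
                k_range, key=lambda k: index(points, solutions[k], dist))
    return chosen


def make_blobs(centers, n_per_cluster, spread, seed):
    """Simulated data: Gaussian noise of size `spread` around each centre."""
    rng = random.Random(seed)
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_cluster)]


points = make_blobs([(0, 0), (10, 0), (0, 10)], n_per_cluster=10, spread=0.5, seed=1)
chosen = specification_curve(points)
tally = Counter(chosen.values())  # how many specifications select each k
print(tally)
```

When the tally concentrates on one cluster number, the specifications agree; when it spreads across several values, the conclusion depends on the analyst's choices. Raising `spread` or adding centres in this sketch is one way to explore how noise and cluster count affect that agreement.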


This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence, which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Corresponding author

Author for correspondence: Lian Beijers



