
Association between impaired glucose metabolism and long-term prognosis at the time of diagnosis of depression: Impaired glucose metabolism as a promising biomarker proposed through a machine-learning approach

Published online by Cambridge University Press:  03 February 2023

Dong Yun Lee
Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
Yong Hyuk Cho
Department of Psychiatry, Ajou University School of Medicine, Suwon, Korea; Department of Medical Sciences, Graduate School of Ajou University, Suwon, South Korea
Myoungsuk Kim
Data Science Team, Evidnet Co Ltd, Pangyo, Korea
Chang-Won Jeong
Medical Convergence Research Center, Wonkwang University, Iksan, Korea
Jae Myung Cha
Department of Gastroenterology, Gang Dong Kyung Hee University Hospital, Seoul, Korea
Geun Hui Won
Department of Psychiatry, Catholic University of Daegu School of Medicine, Daegu, Korea
Jai Sung Noh
Department of Psychiatry, Ajou University School of Medicine, Suwon, Korea
Sang Joon Son*
Department of Psychiatry, Ajou University School of Medicine, Suwon, Korea
Rae Woong Park*
Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
*Authors for correspondence: Sang Joon Son, Rae Woong Park, E-mails:;



Predicting the course of depression is necessary for personalized treatment. Impaired glucose metabolism (IGM) has been proposed as a promising biomarker of depression, but no consensus has been reached. This study aimed to predict IGM at the time of depression diagnosis and to examine the relationship between the predicted results and long-term prognosis.


Clinical data were extracted from four electronic health records in South Korea. The study population included patients with depression, and the outcome was IGM within 1 year. One database was used to develop the model using three algorithms. External validation was performed using the best algorithm across the three databases. The area under the curve (AUC) was calculated to determine the model’s performance. Kaplan–Meier and Cox survival analyses of the risk of hospitalization for depression as the long-term outcome were performed. A meta-analysis of the long-term outcome was performed across the four databases.


A prediction model was developed using the data of 3,668 people; least absolute shrinkage and selection operator (LASSO) logistic regression achieved an AUC of 0.781. In the external validation, the AUCs were 0.643, 0.610, and 0.515. Survival analysis and meta-analysis were performed using the predicted results; the hazard ratio for hospitalization for depression in patients predicted to have IGM was 1.20 (95% confidence interval [CI] 1.02–1.41, p = 0.027) at a 3-year follow-up.


We developed prediction models for IGM occurrence within 1 year. The predicted results were related to the long-term prognosis of depression, suggesting that IGM is a promising biomarker of the prognosis of depression.

Research Article
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence, which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
© The Author(s), 2023. Published by Cambridge University Press on behalf of the European Psychiatric Association


Depression severely restricts individual psychosocial functions and lowers the quality of life. Because of its chronicity, it leads to national problems such as increased suicide rates and medical expenses. The World Health Organization cited major depressive disorder as the third leading cause of the global burden of disease in 2008 and predicted that depression would rank first by 2030 [1]. Numerous factors such as biological markers and poor habits [Reference Lee, Park, Ryoo, Oh, Mansur, Alfonsi, Cha, Lee, McIntyre and Jung2] are linked to the onset and recovery of depression. Variable clinical patterns, unpredictable progression and prognosis, and insufficient therapeutic response make depression treatment challenging for clinicians. Remission rates with antidepressants are also low overall (~27% as per the STAR*D trial) [Reference Pigott3], and 20%–25% of patients with depression are at risk of chronic depression [Reference Penninx, Nolen, Lamers, Zitman, Smit, Spinhoven, Cuijpers, de Jong, van Marwijk, van der Meer and Verhaak4]. Thus, previous studies have tried to improve treatment outcomes of depression, and evidence has revealed that early intervention in depression is associated not only with better treatment response and long-term outcomes but also with slower disease progression [Reference Ghio, Gotelli, Marcenaro, Amore and Natta5–Reference Kupfer, Frank and Perel8]. These studies gradually focused on exploring variables that can predict prognosis in the early stages of depression and on achieving personalized treatment through targeted treatment strategies [Reference Liu, Li, Ma, Zhang, Sun, Luo and Zhang9–Reference Ju, Horien, Chen, Guo, Lu, Sun, Dong, Liu, Liu, Yan and Wang11].

Previous studies have suggested that measuring metabolic markers may be a promising way of predicting long-term clinical outcomes in depression [Reference Vogelzangs, Beekman, Boelhouwer, Bandinelli, Milaneschi, Ferrucci and Penninx12]. Recent studies have revealed relationships among depression, suicidal behavior, insulin resistance (IR), and impaired glucose metabolism (IGM), and evidence of their interactions is accumulating [Reference Watson, Simard, Henderson, Nutkiewicz, Lamers, Rasgon and Penninx13–Reference Bot, Pouwer, De Jonge, Tack, Geelhoed‐Duijvestijn and Snoek16]. Specifically, IGM is characterized by glucose metabolic disturbance and is defined as pre-diabetes mellitus (pre-DM) or diabetes mellitus (DM) [Reference King, Walker, Levy, Bottomley, Royston, Weich, Bellon-Saameno, Moreno, Švab, Rotar and Rifel17]. It can be measured based on hemoglobin A1C (HbA1c) levels in the blood and fasting blood sugar; thus, its clinical utility is high. Many studies have reported that IGM has a bidirectional association with depression. A study reported that reduced serotonin levels were associated with elevated blood glucose levels, IR, and depressed mood [Reference Wang, Patten, Sareen, Bolton, Schmitz and MacQueen18]. Some previous studies have found that higher glucose levels are associated with dysthymia and higher HbA1c concentrations with recurrent or psychotic depression [Reference Gold, Köhler-Forsberg, Moss-Morris, Mehnert, Miranda, Bullinger, Steptoe, Whooley and Otte19]. In addition, a study in adults with type 2 DM (T2DM) found that certain antidiabetic drugs were associated with a lower risk of depression [Reference Wium-Andersen, Osler, Jørgensen, Rungby and Wium-Andersen20]. Recently, a cross-sectional study correlated IR with depression severity as an endophenotype of depression [Reference Watson, Simard, Henderson, Nutkiewicz, Lamers, Rasgon and Penninx13].
Despite these studies, no consensus has yet been reached to the extent that the association between IGM and depression is applicable to clinical practice for establishing patient care strategies.

Machine-learning (ML)-based predictive models are becoming increasingly popular because they can combine large amounts of data into one model. For depression, conventional regression methods have limitations in prediction: beyond well-known demographic factors and factors related to typical treatment, comorbid physical diseases and polypharmacy are common [Reference Gold, Köhler-Forsberg, Moss-Morris, Mehnert, Miranda, Bullinger, Steptoe, Whooley and Otte19, Reference Correll, Detraux, De Lepeleire and De Hert21]. By contrast, ML-based methods have successfully predicted depression persistence, chronicity, severity [Reference Kessler, van Loo, Wardenaar, Bossarte, Brenner, Cai, Ebert, Hwang, Li, de Jonge and Nierenberg22], treatment response, and first and new onset of depressive episodes [Reference King, Walker, Levy, Bottomley, Royston, Weich, Bellon-Saameno, Moreno, Švab, Rotar and Rifel17, Reference Wang, Patten, Sareen, Bolton, Schmitz and MacQueen18, Reference Chekroud, Zotti, Shehzad, Gueorguieva, Johnson, Trivedi, Cannon, Krystal and Corlett23].

This study aimed to investigate whether IGM could be utilized as a biomarker that reflects the clinical severity and prognosis of depression. Initially, we attempted to develop a model that predicts IGM occurrence at the time of the first diagnosis of depression through an ML algorithm. Subsequently, using multicenter and longitudinal data, we intended to analyze and validate whether the IGM occurrence predicted by the model is related to the short-term and long-term prognosis of depression.


Data source

This study used data from approximately 6 million patients across the four electronic health record databases in South Korea: Ajou University School of Medicine (AUSOM), Daegu Catholic Medical Center (DCMC), Wonkwang University Hospital (WKUH), and Kyung Hee University Hospital at Gangdong (KHNMC) (Supplementary Material S1). The clinical data included diagnoses, observations, provider visits, procedures performed, and medications filled. The databases were formatted according to the Observational Medical Outcomes Partnership–Common Data Model version 5.3.1, maintained by the Observational Health Data Sciences and Informatics (OHDSI), and de-identified [Reference Makadia and Ryan24]. The database of AUSOM was used in model development, and the other three databases were used to validate the developed model. After the development and validation of the model, all databases were used in the survival analysis.

This study was approved by the Institutional Review Board of the Ajou University Hospital (AJOUIRB-MDB-2022-255). Informed consent was not required owing to the use of de-identified data. Access to DCMC, WKUH, and KHNMC databases during the external validation process was allowed under the IRB mutual recognition agreement (research-free zone agreement).

Study population and outcome

The study population included patients with a new depressive episode. The index date was defined as the date of the patient’s first diagnosis of depressive disorder. To verify that this was the first diagnosis, at least 1 year of observation before the index date was required. Within the 1-year observation period before the index date, relevant covariates on each patient were collected to predict their future diagnosis of IGM. Patients who were treated for depression, those who had antidepressant prescriptions, and those who had undergone psychiatric procedures after the index date were included. Patients were also required to have at least 1 year of follow-up after the index date. For the IGM prediction, patients who had at least one measurement of HbA1c or fasting glucose within 1 year after the index date were included. Patients with a diagnosis of bipolar disorder, schizophrenia, or psychosis on or before the index date were excluded. Patients with a previous history of DM, DM complications, or exposure to antidiabetic drugs were also excluded.

The primary outcome for the predictive models was IGM within 1 year after the index date. IGM was defined as pre-DM or T2DM and measured by HbA1c or fasting glucose. For IGM, HbA1c levels were defined as ≥5.6%, and fasting plasma glucose as ≥100 mg/dL [Reference Mansur, Rizzo, Santos, Asevedo, Cunha, Noto, Pedrini, Zeni, Cordeiro, McIntyre and Brietzke25]. All patients with depression were followed up for 1 year. If IGM occurred within this 1-year period, the observation was stopped on the day that the IGM diagnosis was coded. Thus, the predictive models were developed using the primary outcome. After that, patients were divided into “predicted to have IGM” and “predicted not to have IGM” groups through a predictive model at the time of the index date. Further details of the cohort definitions and code lists are presented in Supplementary Materials S2–S3.
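As an illustration only, the threshold rule described above can be sketched in Python. The study itself was run in R, so this is not the authors' code. The function name is hypothetical, and the choice to treat either measurement reaching its cutoff as sufficient is an assumption based on the phrase "HbA1c or fasting glucose".

```python
from typing import Optional

HBA1C_THRESHOLD_PCT = 5.6    # HbA1c cutoff stated in the text (%)
FPG_THRESHOLD_MG_DL = 100.0  # fasting plasma glucose cutoff stated in the text (mg/dL)


def meets_igm_definition(hba1c_pct: Optional[float] = None,
                         fasting_glucose_mg_dl: Optional[float] = None) -> Optional[bool]:
    """Apply the IGM thresholds; return None when neither value was measured."""
    if hba1c_pct is None and fasting_glucose_mg_dl is None:
        # No measurement available, so the rule cannot be evaluated.
        return None
    hba1c_high = hba1c_pct is not None and hba1c_pct >= HBA1C_THRESHOLD_PCT
    fpg_high = (fasting_glucose_mg_dl is not None
                and fasting_glucose_mg_dl >= FPG_THRESHOLD_MG_DL)
    # Assumption of this sketch: either measurement at or above its cutoff counts as IGM.
    return hba1c_high or fpg_high
```

Returning None for unmeasured patients mirrors the inclusion criterion that at least one HbA1c or fasting glucose measurement was required within 1 year after the index date.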

Model development

We used the patient-level prediction framework of the OHDSI to develop and validate the predictive models. This framework consisted of standardized model development and validation processes that require defining predictable problems and selecting the study population, outcomes, population settings, predictors, and statistical algorithms [Reference Reps, Schuemie, Suchard, Ryan and Rijnbeek26]. The predictive variables for model training were extracted and dichotomized for existence within short-term (30 days) and long-term (365 days) intervals before the index. The variables included patient age, sex, month of the index visit, diagnoses, drug exposures, and procedures. Through this process, 22,904 candidate variables were generated. The models were developed across multiple algorithms, including least absolute shrinkage and selection operator (LASSO)-penalized regression, random forest, and extreme gradient boosting (XGBoost) via threefold cross-validation. The algorithm with the best performance was selected for the final model according to the value of the area under the receiver operating characteristic curve (AUROC).
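The dichotomized covariates described above can be pictured with a small Python sketch. It is not the OHDSI framework's implementation: the function name, the feature-naming scheme, and the decision to count the index date itself as inside both windows are assumptions made here for illustration.

```python
from datetime import date
from typing import Dict, Iterable, Tuple

SHORT_TERM_DAYS = 30   # short-term lookback stated in the text
LONG_TERM_DAYS = 365   # long-term lookback stated in the text


def binary_covariates(index_date: date,
                      records: Iterable[Tuple[str, date]]) -> Dict[str, int]:
    """Return 0/1 features flagging concepts recorded in each lookback window.

    Only features equal to 1 are stored; an absent key means 0.
    """
    features: Dict[str, int] = {}
    for concept, record_date in records:
        days_before = (index_date - record_date).days
        if days_before < 0:
            continue  # recorded after the index date, so not a baseline covariate
        for label, window in (("short", SHORT_TERM_DAYS), ("long", LONG_TERM_DAYS)):
            if days_before <= window:
                features[f"{concept}_{label}_term"] = 1
    return features
```

Because every diagnosis, drug exposure, and procedure yields one feature per window, a sparse representation like this one is a natural fit for the roughly 22,904 candidate variables reported above.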

External validation

External validation was conducted to confirm the validity of the model’s performance using the databases of DCMC, WKUH, and KHNMC. Specifically, we evaluated the performance of the final model on the other databases under the same settings as in model development.

Follow-up and long-term outcome measurements

The patients were followed up for 3 years after the index date. During the follow-up, the risk of hospitalization for depression in patients who were predicted to have IGM was compared with that in patients who were predicted not to have IGM. Hospitalization for depression was defined as hospitalization caused by the exacerbation of depressive episodes. In addition, rehospitalization after discharge for the first diagnosis was considered [Reference Moncrieff, Crellin, Long, Cooper and Stockmann27]. To distinguish between existing hospitalization and rehospitalization, only hospitalization after at least a 2-week washout period was defined as an outcome. The outcomes were binarized into hospitalization and non-hospitalization based on the occurrences recorded in the databases.

Statistical analysis

Descriptive statistical analyses were performed. Baseline characteristics are presented as counts with proportions for categorical variables and as medians with interquartile ranges for continuous variables. The chi-square test was used to compare categorical variables between populations. Accuracy, AUROC, and area under the precision–recall curve (AUPRC) were calculated to evaluate the performance of the prediction models. We used the maximal Youden index to select the optimal cutoff value in the prediction model [Reference Fluss, Faraggi and Reiser28].
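For readers unfamiliar with these metrics, the following Python sketch computes the AUROC from pairwise comparisons and picks the cutoff that maximizes the Youden index, J = sensitivity + specificity − 1. It is a minimal illustration with hypothetical function names, not the code used in the study, and it assumes that a patient is predicted positive when the score is at or above the cutoff.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (cutoff, J) with the largest J; the lowest such cutoff wins ties."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = float("nan"), -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
        true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large cohorts.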

Moreover, we verified whether the group predicted by the final model was related to the actual IGM occurrence. The final model was used to estimate the probability of IGM for each patient in the internal validation dataset. Patients with a relatively high predicted probability were classified as “predicted to have IGM,” and the others as “predicted not to have IGM.” The Kaplan–Meier survival analysis and log-rank test were used to analyze the difference in the occurrence of IGM within 1 year after the index date in the group predicted to have IGM versus the group predicted not to have IGM.
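The Kaplan–Meier estimator used in this comparison multiplies, at each event time, the fraction of at-risk patients who remain event-free. A minimal Python sketch is given below for one group; the function name and the input layout are assumptions of this illustration, and the log-rank test is not implemented here.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time.

    times[i] is the follow-up time of patient i; events[i] is 1 if the
    outcome occurred at that time and 0 if the patient was censored.
    """
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve
```

Running the function separately on the "predicted to have IGM" and "predicted not to have IGM" groups would give the two curves that a log-rank test then compares.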

After model development and external validation, Kaplan–Meier and Cox survival analyses for the long-term outcomes were performed to assess the risk of hospitalization for depression in patients predicted by the final model to have IGM. Then, a meta-analysis was performed to calculate the summary hazard ratio (HR) estimates across the four databases.
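One common way to obtain a summary HR is inverse-variance pooling on the log scale, sketched below in Python. The text does not state whether a fixed-effect or random-effects model was used, so the fixed-effect form shown here is an assumption, as are the function name and the approach of recovering each standard error from a 95% CI.

```python
import math
from typing import Sequence, Tuple

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def pool_hazard_ratios(
    estimates: Sequence[Tuple[float, float, float]]
) -> Tuple[float, float, float]:
    """Fixed-effect pooling of (HR, CI lower, CI upper) triples.

    Returns the pooled HR with its 95% CI. Each standard error is
    recovered from the width of the reported CI on the log scale.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, lower, upper in estimates:
        se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
        weight = 1.0 / se ** 2  # more precise databases get more weight
        weighted_sum += weight * math.log(hr)
        total_weight += weight
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )
```

A random-effects model would add a between-database variance term to each weight, which widens the interval when the per-database estimates disagree.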

All p-values <0.05 were considered statistically significant. All analyses were conducted using R software version 3.6 (R Foundation for Statistical Computing, Vienna, Austria), OHDSI’s Health Analytics Data to Evidence Suite packages, and open-source statistical R packages.


Baseline characteristics

A total of 481 outcomes in 3,668 patients from AUSOM were used for model development, and for the external validation, 543 outcomes in a total of 5,716 patients (DCMC, n = 2,129; WKUH, n = 2,717; and KHNMC, n = 870) were used. Table 1 shows the baseline characteristics of the study population in AUSOM. The baseline characteristics of other databases are presented in Supplementary Tables S1–S3. Among the 3,668 patients with depression in the AUSOM database, 481 (13.1%) experienced IGM within 1 year after the diagnosis of depression. No significant differences were found between the groups in age, sex, medical history (except hypertension), or psychiatric history. The proportion of patients with hypertension was significantly lower in the group with IGM (p < 0.01). Middle-aged (40–59 years) and female patients predominated in the study population. Hypertension and anxiety disorder were frequent diagnoses (hypertension, 15.2% and 9.1%; anxiety disorder, 15.4% and 14.1%, respectively).

Table 1. Baseline characteristics for study population with or without IGM in AUSOM.

Note: $\chi^2_{(df)}$, chi-square value and degrees of freedom.

Abbreviations: AUSOM, Ajou University School of Medicine; IGM, impaired glucose metabolism.

* indicates statistical significance (p < 0.05).

Prediction models

Figure 1 shows the performance of the ML model in the internal validation set of AUSOM, including LASSO, random forest, and XGBoost. The best-performing model, selected by comparing the average AUROC from the threefold validation, was a logistic regression with LASSO. We defined LASSO as the final model, which showed an AUROC of 0.781 (95% CI 0.742–0.820) on the internal validation dataset. The accuracy and AUPRC of the final model were 0.667 and 0.338, respectively. The performance metrics are shown in Supplementary Table S4.

Figure 1. Receiver operating characteristic (ROC) curve of models predicting impaired glucose metabolism. (A) ROC curve for the models according to algorithms. (B) ROC curve for internal and external validations. The performance of the models using the area under the receiver operating characteristic curve is compared.

Table 2 shows the top 10 important predictors. The feature importance analysis showed that a normal range of blood glucose levels before depression diagnosis was the most important predictor across the three algorithms. Drug exposures such as antipsychotics were important predictors in the prediction models. Three models consistently considered the category of the blood test as important predictors. Unlike other models, the LR with LASSO model included the category of image test as a predictor. In Supplementary Table S5, the predictors that increase the IGM prediction risk and those that decrease the risk are indicated in red and blue, respectively.

Table 2. Top 10 important predictors of the prediction models for impaired glucose metabolism.

Note: The color in the table means the category of features (orange: laboratory test, green: image test, and blue: drug exposure).

Abbreviations: CT, computed tomography; LASSO, least absolute shrinkage and selection operator; LR, logistic regression; NSAID, non-steroidal anti-inflammatory drugs; SSRI, selective serotonin reuptake inhibitor; XGBoost, extreme gradient boosting.

External model validation

The final model was externally validated using the DCMC, WKUH, and KHNMC databases. In the external validation databases, patients experienced IGM at a rate of 8.8% (188/2,129) in DCMC, 12.0% (327/2,717) in WKUH, and 3.2% (28/870) in KHNMC. The external validation performance of the final model regarding AUROC was 0.643 at DCMC, 0.610 at WKUH, and 0.515 at KHNMC.
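The reported IGM rates and external validation totals can be reproduced from the counts given in this paper. The short Python check below uses only those published counts; the variable names are illustrative.

```python
# (IGM outcomes, patients) per database, as reported in the text
cohorts = {
    "AUSOM": (481, 3668),
    "DCMC": (188, 2129),
    "WKUH": (327, 2717),
    "KHNMC": (28, 870),
}

# IGM rate per database, in percent, rounded to one decimal place
rates = {name: round(100.0 * events / n, 1) for name, (events, n) in cohorts.items()}

# Totals over the three external validation databases
external = [cohorts[name] for name in ("DCMC", "WKUH", "KHNMC")]
external_events = sum(events for events, _ in external)
external_patients = sum(n for _, n in external)
```

The resulting rates range from 3.2% to 13.1%, a roughly fourfold spread in outcome prevalence across the four databases.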

Long-term outcomes of ML-predicted IGM

Figure 2 shows the clinical benefit of using the IGM prediction models. In the internal validation dataset of AUSOM, the group predicted to have IGM had a significantly higher occurrence of IGM within 1 year after the index date than the group predicted not to have IGM (log rank, p < 0.001). Furthermore, patients predicted to have IGM showed significantly worse long-term outcomes. In the overall cohort of AUSOM, survival analysis showed that hospitalization for depression occurred more frequently in patients who were predicted to have IGM during the 3-year follow-up (log rank, p = 0.002) (Figure 2).

Figure 2. Kaplan–Meier curves in the stratified survival analysis. (A) Impaired glucose metabolism in the internal validation dataset of AUSOM. (B) Long-term outcome for the 3-year follow-up in the overall cohort of AUSOM.

We further assessed long-term outcomes not only in AUSOM but also in external validation databases. The meta-analytic comparative effect estimates for the risk of hospitalization for depression are presented in Figure 3. The summary HR of risk of hospitalization for depression during the 3-year follow-up was 1.20 (95% CI 1.02–1.41, p = 0.027) for patients predicted to have IGM.
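As a consistency check, the reported p-value can be approximately recovered from the summary HR and its 95% CI under a normal approximation on the log scale. The Python sketch below is illustrative only; the function name is hypothetical, and the published estimate may have been obtained with a different procedure.

```python
import math


def z_and_p_from_hr_ci(hr: float, lower: float, upper: float,
                       z_crit: float = 1.959963984540054):
    """Return (z, two-sided p) implied by an HR and its 95% CI.

    Assumes log(HR) is approximately normal, so the CI half-width on
    the log scale equals z_crit times the standard error.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


z, p = z_and_p_from_hr_ci(1.20, 1.02, 1.41)
```

With the values reported above, this gives a z statistic of about 2.2 and a two-sided p-value of approximately 0.027, in line with the published figure.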

Figure 3. Risk of long-term outcome in 3 years in patients predicted by the machine-learning model to have IGM within 1 year.


We constructed a model to predict the occurrence of IGM within 1 year at the time of depression diagnosis using ML algorithms. By analyzing the longitudinal data of multiple institutions using this prediction model, we identified relationships between IGM prediction and the long-term prognosis of depression. Thus, IGM might be a promising biomarker associated with the prognosis of depression.

Despite being a common psychiatric disease, depression has a low treatment success rate because of the heterogeneity and difficulty in predicting its course [Reference Pigott3, Reference Kennedy and Giacobbe29]. Thus, clinicians desire to identify biomarkers that can reflect the severity or chronicity of depression. Several previous studies have shown a complex relationship between depression and IGM. Knol et al. [Reference Knol, Twisk, Beekman, Heine, Snoek and Pouwer30] reported in a meta-analysis a 37% increased risk of T2DM development in adults with depression compared with individuals without depression. Several possibilities have been suggested, and there are reports that hypothalamic–pituitary–adrenal axis abnormalities in patients with depression, hypercortisolemia, and immune system abnormalities, including chronic low-grade inflammations, influence the insulin effect [Reference Ghio, Gotelli, Marcenaro, Amore and Natta5, Reference Habert, Katzman, Oluboka, McIntyre, McIntosh, MacQueen, Khullar, Milev, Kjernisted and Chokka6, Reference Wang, Patten, Sareen, Bolton, Schmitz and MacQueen18]. Conversely, IGM including DM is related to the development or exacerbation of depression and the reactivity of antidepressants [Reference Gonzalez, Peyrot, McCarl, Collins, Serpa, Mimiaga and Safren31, Reference Bogner, Morales, de Vries and Cappola32]. The dysfunction of insulin receptors and subsequent signal cascade, which are related to IGM, has a direct effect on neural metabolism and the brain and is associated with depression by causing abnormalities in neurotransmitter metabolisms such as dopamine, serotonin, and norepinephrine [Reference Boucher, Kleinridders and Kahn33, Reference Leonard and Wegener34]. 
Moreover, some studies have revealed that the successful treatment of depression can correct insulin response, particularly with more serotonergic agents, such as selective serotonin reuptake inhibitors (SSRIs) [Reference Silva, Atlantis and Ismail35, Reference Holt, De Groot and Golden36]. However, a recent study reported that low doses of metformin, DPP4 inhibitors, GLP1 analogs, and especially SGLT2 inhibitors were associated with lower odds of depression than non-users of these medications [Reference Wium-Andersen, Osler, Jørgensen, Rungby and Wium-Andersen20]. In summary, bidirectional pathophysiological connections exist between depression and IGM. This connection means that depression and IGM are important factors not only in each other’s pathogenesis but also in each other’s successful treatment and prognosis.

Recently, a nationwide study revealed that glucose disturbance is associated with increased suicidal ideation and suicidal behavior in patients with depression [Reference Chen, Li, Lang, Li and Zhang37]. In another large-scale study, IR was proposed as a promising marker that reflects severity and chronicity in patients with depression [Reference Watson, Simard, Henderson, Nutkiewicz, Lamers, Rasgon and Penninx13]. These large-scale cross-sectional studies served as a prelude to investigating the relationship between IR and depression. Consequently, clinicians are paying attention to predicting IGM, including IR, in the early stages of diagnosis so that treatment strategies can be implemented with the long-term prognosis and treatment reactivity of patients with depression in mind. However, depression and IGM have a complex relationship, and predicting IGM in the early stages of depression is not easy; because conventional analysis has limitations, analysis using a large number of variables is needed.

Data-driven ML algorithms are in the spotlight as a breakthrough in the discovery of hidden predictors in addition to the known, clinically meaningful predictors selected by researchers [Reference Dinga, Marquand, Veltman, Beekman, Schoevers, van Hemert, Penninx and Schmaal10]. Therefore, in this study, an IGM prediction model was developed using a data-driven ML algorithm. Specifically, the data used in this study consisted of a large amount of tabular data, which was advantageous for the use of ML algorithms such as XGBoost, LR with LASSO, and random forest, similar to previous studies [Reference Sharma and Verbeke38]. The model using LR with LASSO showed the highest performance in this study (Figure 1).

Since this study developed an IGM prediction model through an ML algorithm rather than deep learning, understandable explanations for prediction were obtained. First, antipsychotics, including haloperidol, are commonly prescribed at the time of diagnosis of depression. Moreover, studies have reported that antipsychotics are related to an increase in blood sugar [Reference Holt39]. In addition, several studies have reported that benzodiazepines [Reference Zumoff and Hellman40], corticosteroids [Reference Hwang and Weiss41], and prescriptions for peptic ulcer, which are presumed to be proton pump inhibitors [Reference Czarniak, Ahmadizar, Hughes, Parsons, Kavousi, Ikram and Stricker42], are associated with increased blood sugar and DM. CT, C-reactive protein measurement, hepatitis B virus test, uric acid measurement, and nonsteroidal anti-inflammatory drug prescriptions were also observed as important predictors. Patients may have been tested for certain symptoms or prescribed drugs as an extension of the immune system dysfunction observed in depression and IGM. Kan et al. revealed that chronic low-grade inflammatory reactions in depression lead to apoptosis in pancreatic beta cells, which is related to IGM [Reference Kan, Silva, Golden, Rajala, Timonen, Stahl and Ismail43]. The increase in cytokine levels in patients with depression is linked to metabolic disturbance [Reference Leonard and Wegener34]. We suggest that the immunological vulnerability of patients at the time of diagnosis of depression was reflected in these predictors. By contrast, the most important predictor in all algorithms was “normal range of blood glucose level within 1 year before diagnosis.” Thus, if the blood glucose level was normal in the year before the diagnosis of depression, 1 year may not be long enough to observe progression to IGM.
Measuring the levels of cholesterol, triglyceride, urinalysis, creatinine, etc., through blood tests also had a negative relationship with predicting IGM. This can be interpreted in the same way as the results of previous studies that continuity of care had some benefits including prevention of chronic diseases including DM [Reference Koopman, Mainous, Baker, Gill and Gilbert44]. Finally, unlike other drugs, SSRIs showed a negative relationship with predicting IGM. Although this is controversial, SSRIs had a positive effect on blood sugar control among antidepressants [Reference Tharmaraja, Stahl, Hopkins, Persaud, Jones, Ismail and Moulton45, Reference Deuschle46].

Furthermore, survival analysis was conducted to determine whether the results of the IGM prediction model were related to the 3-year prognosis of depression. Through a meta-analysis of longitudinal data from four hospitals stored in Common Data Model (CDM) databases, we identified a difference in hospitalization caused by exacerbation of depressive episodes between the two groups defined by the IGM prediction model. This result can be interpreted in light of previous reports that, when depression or anxiety is accompanied by DM, disease burden and emotional distress increase because of poor metabolic control, low rates of blood glucose self-monitoring, and DM complications, which can predict inadequate response to depression treatment [Reference Holt, De Groot and Golden36, Reference Ducat, Philipson and Anderson47].

This study has several limitations. First, this study used data from Koreans only; thus, the results cannot be generalized. However, this study showed that the CDM developed through a distributed research network enables a more efficient meta-analysis than in the past without exposing private information. This suggests that a global meta-analysis is possible if the same CDM is established in various countries. Second, this study used longitudinal data, but it has the limitations of a retrospective study. To clarify the relationship between depression and IGM, a prospective study is required. Third, this study did not include social and environmental factors that would be related to depression and IGM in the model development. This is also a limitation of the psychiatric CDM. Thus, developing measurable environmental and sociological variables is necessary. Fourth, model performance was reduced in the external validation of the IGM prediction model. Model performance commonly decreases in external validation because of the varying characteristics of the enrolled participants, and it is difficult to control them all. Specifically, the external validation performance was low in the analysis of KHNMC compared with AUSOM. This result was assumed to be caused by the varying rates of IGM occurrence, i.e., 13.1% in AUSOM and 3.2% in KHNMC. Furthermore, since there was no overall difference in baseline characteristics between patients with and without IGM, predicting IGM may be difficult. Fifth, indirect indicators such as depression-related hospitalization were used to determine the relationship between the results of the IGM prediction model and the long-term prognosis of depression. However, several recent studies have derived meaningful results using operational definitions such as the one used in this study [Reference Moncrieff, Crellin, Long, Cooper and Stockmann27]. Sixth, we included only individuals who were assessed for IGM in the study population. Among patients with depression in the study database, those with IGM assessment had a higher rate of comorbidities than those without IGM assessment. This suggests that the results should be generalized with caution.

In summary, we developed an IGM prediction model at the time of depression diagnosis using an ML algorithm and found a relationship between the results of the IGM prediction model and the long-term prognosis of depression using longitudinal data. Thus, we suggest that IGM is likely to be a promising biomarker in predicting the prognosis of depression. Treatment strategies should be established to improve metabolic disturbance, including IGM, and the use of IGM as an evaluation index for lifestyle modification and increased treatment success rate may be expected. Therefore, a more customized and multidimensional approach to the evaluation and treatment of depression would be possible.

Supplementary Materials

To view supplementary material for this article, please visit

Data Availability Statement

CDM data are designed to support a distributed research network. Access to the data is therefore restricted to internal private networks, and the data are not publicly available.


The authors thank the participants and investigators for this study. The views expressed are those of the authors and not necessarily those of EvidNet.

Author Contributions

Conceptualization: D.Y.L., Y.H.C.; Data curation: D.Y.L., M.K.; Formal analysis: D.Y.L., M.K.; Funding acquisition: D.Y.L., R.W.P., S.J.S.; Methodology: D.Y.L.; Project administration: D.Y.L.; Resources: C.-W.J., J.M.C., G.H.W., J.S.N., S.J.S., R.W.P.; Writing—original draft: D.Y.L., Y.H.C.; Writing—review and editing: S.J.S., R.W.P.

Financial Support

This research was funded by the Bio Industrial Strategic Technology Development Program (20003883, 20005021) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea) and by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI) funded by the Ministry of Health & Welfare, Republic of Korea (Grant Numbers: HR16C0001, HR22C173401). It was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education [NRF-2020R1I1A1A01072208, 2019R1A5A2026045].

Conflicts of Interest

The authors declare none.

Ethical Standards

The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.


D.Y.L. and Y.H.C. contributed equally to this work and share first authorship. S.J.S. and R.W.P. contributed equally and should be considered co-corresponding authors.


World Health Organization. The global burden of disease: 2004 update. Geneva: World Health Organization; 2008.
Lee, JH, Park, SK, Ryoo, JH, Oh, CM, Mansur, RB, Alfonsi, JE, Cha, DS, Lee, Y, McIntyre, RS, Jung, JY. The association between insulin resistance and depression in the Korean general population. J Affect Disord. 2017;208:553–9.
Pigott, HE. The STAR*D trial: it is time to reexamine the clinical beliefs that guide the treatment of major depression. Can J Psychiatry. 2015;60(1):9–13.
Penninx, BW, Nolen, WA, Lamers, F, Zitman, FG, Smit, JH, Spinhoven, P, Cuijpers, P, de Jong, PJ, van Marwijk, HW, van der Meer, K, Verhaak, P. Two-year course of depressive and anxiety disorders: results from the Netherlands study of depression and anxiety (NESDA). J Affect Disord. 2011;133(1-2):76–85.
Ghio, L, Gotelli, S, Marcenaro, M, Amore, M, Natta, W. Duration of untreated illness and outcomes in unipolar depression: a systematic review and meta-analysis. J Affect Disord. 2014;152:45–51.
Habert, J, Katzman, MA, Oluboka, OJ, McIntyre, RS, McIntosh, D, MacQueen, GM, Khullar, A, Milev, RV, Kjernisted, KD, Chokka, PR. Functional recovery in major depressive disorder: focus on early optimized treatment. Prim Care Companion CNS Disord. 2016;18(5):24746.
Kautzky, A, Dold, M, Bartova, L, Spies, M, Kranz, GS, Souery, D, Montgomery, S, Mendlewicz, J, Zohar, J, Fabbri, C, Serretti, A. Clinical factors predicting treatment resistant depression: affirmative results from the European multicenter study. Acta Psychiatr Scand. 2019;139(1):78–88.
Kupfer, DJ, Frank, E, Perel, JM. The advantage of early treatment intervention in recurrent depression. Arch Gen Psychiatry. 1989;46:771–5.
Liu, X, Li, P, Ma, X, Zhang, J, Sun, X, Luo, X, Zhang, Y. Association between plasma levels of BDNF and GDNF and the diagnosis, treatment response in first-episode MDD. J Affect Disord. 2022;315:190–7.
Dinga, R, Marquand, AF, Veltman, DJ, Beekman, AT, Schoevers, RA, van Hemert, AM, Penninx, BW, Schmaal, L. Predicting the naturalistic course of depression from a wide range of clinical, psychological, and biological data: a machine learning approach. Transl Psychiatry. 2018;8(1):1–11.
Ju, Y, Horien, C, Chen, W, Guo, W, Lu, X, Sun, J, Dong, Q, Liu, B, Liu, J, Yan, D, Wang, M. Connectome-based models can predict early symptom improvement in major depressive disorder. J Affect Disord. 2020;273:442–52.
Vogelzangs, N, Beekman, AT, Boelhouwer, IG, Bandinelli, S, Milaneschi, Y, Ferrucci, L, Penninx, BW. Metabolic depression: a chronic depressive subtype? Findings from the InCHIANTI study of older persons. J Clin Psychiatry. 2011;72(5):11748.
Watson, KT, Simard, JF, Henderson, VW, Nutkiewicz, L, Lamers, F, Rasgon, N, Penninx, B. Association of insulin resistance with depression severity and remission status: defining a metabolic endophenotype of depression. JAMA Psychiat. 2021;78(4):439–41.
Ceretta, LB, Réus, GZ, Abelaira, HM, Jornada, LK, Schwalm, MT, Hoepers, NJ, Tomazzi, CD, Gulbis, KG, Ceretta, RA, Quevedo, J. Increased prevalence of mood disorders and suicidal ideation in type 2 diabetic patients. Acta Diabetol. 2012;49(1):227–34.
Han, SJ, Kim, HJ, Choi, YJ, Lee, KW, Kim, DJ. Increased risk of suicidal ideation in Korean adults with both diabetes and depression. Diabetes Res Clin Pract. 2013;101(3):e14–7.
Bot, M, Pouwer, F, De Jonge, P, Tack, CJ, Geelhoed‐Duijvestijn, PH, Snoek, FJ. Differential associations between depressive symptoms and glycaemic control in outpatients with diabetes. Diabet Med. 2013;30(3):e115–22.
King, M, Walker, C, Levy, G, Bottomley, C, Royston, P, Weich, S, Bellon-Saameno, JA, Moreno, B, Švab, I, Rotar, D, Rifel, J. Development and validation of an international risk prediction algorithm for episodes of major depression in general practice attendees: the PredictD study. Arch Gen Psychiatry. 2008;65(12):1368–76.
Wang, JL, Patten, S, Sareen, J, Bolton, J, Schmitz, N, MacQueen, G. Development and validation of a prediction algorithm for use by health professionals in prediction of recurrence of major depression. Depress Anxiety. 2014;31(5):451–7.
Gold, SM, Köhler-Forsberg, O, Moss-Morris, R, Mehnert, A, Miranda, JJ, Bullinger, M, Steptoe, A, Whooley, MA, Otte, C. Comorbid depression in medical diseases. Nat Rev Dis Primers. 2020;6(1):1–22.
Wium-Andersen, IK, Osler, M, Jørgensen, MB, Rungby, J, Wium-Andersen, MK. Diabetes, antidiabetic medications and risk of depression–a population-based cohort and nested case-control study. Psychoneuroendocrinology. 2022;140:105715.
Correll, CU, Detraux, J, De Lepeleire, J, De Hert, M. Effects of antipsychotics, antidepressants and mood stabilizers on risk for physical diseases in people with schizophrenia, depression and bipolar disorder. World Psychiatry. 2015;14(2):119–36.
Kessler, RC, van Loo, HM, Wardenaar, KJ, Bossarte, RM, Brenner, LA, Cai, T, Ebert, DD, Hwang, I, Li, J, de Jonge, P, Nierenberg, AA. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366–71.
Chekroud, AM, Zotti, RJ, Shehzad, Z, Gueorguieva, R, Johnson, MK, Trivedi, MH, Cannon, TD, Krystal, JH, Corlett, PR. Cross-trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry. 2016;3(3):243–50.
Makadia, R, Ryan, PB. Transforming the premier perspective hospital database into the observational medical outcomes partnership (OMOP) common data model. EGEMS (Washington, DC). 2014;2(1):1110.
Mansur, RB, Rizzo, LB, Santos, CM, Asevedo, E, Cunha, GR, Noto, MN, Pedrini, M, Zeni, M, Cordeiro, Q, McIntyre, RS, Brietzke, E. Impaired glucose metabolism moderates the course of illness in bipolar disorder. J Affect Disord. 2016;195:57–62.
Reps, JM, Schuemie, MJ, Suchard, MA, Ryan, PB, Rijnbeek, PR. Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data. J Am Med Inform Assoc. 2018;25(8):969–75.
Moncrieff, J, Crellin, NE, Long, MA, Cooper, RE, Stockmann, T. Definitions of relapse in trials comparing antipsychotic maintenance with discontinuation or reduction for schizophrenia spectrum disorders: a systematic review. Schizophr Res. 2020;225:47–54.
Fluss, R, Faraggi, D, Reiser, B. Estimation of the Youden index and its associated cutoff point. Biom J: J Math Meth Biosci. 2005;47(4):458–72.
Kennedy, SH, Giacobbe, P. Treatment resistant depression—advances in somatic therapies. Ann Clin Psychiatry. 2007;19(4):279–87.
Knol, MJ, Twisk, JW, Beekman, AT, Heine, RJ, Snoek, FJ, Pouwer, F. Depression as a risk factor for the onset of type 2 diabetes mellitus. A meta-analysis. Diabetologia. 2006;49(5):837–45.
Gonzalez, JS, Peyrot, M, McCarl, LA, Collins, EM, Serpa, L, Mimiaga, MJ, Safren, SA. Depression and diabetes treatment nonadherence: a meta-analysis. Diabetes Care. 2008;31(12):2398–403.
Bogner, HR, Morales, KH, de Vries, HF, Cappola, AR. Integrated management of type 2 diabetes mellitus and depression treatment to improve medication adherence: a randomized controlled trial. Ann Fam Med. 2012;10(1):15–22.
Boucher, J, Kleinridders, A, Kahn, CR. Insulin receptor signaling in normal and insulin-resistant states. Cold Spring Harb Perspect Biol. 2014;6(1):a009191.
Leonard, BE, Wegener, G. Inflammation, insulin resistance and neuroprogression in depression. Acta Neuropsychiatrica. 2020;32(1):1–9.
Silva, N, Atlantis, E, Ismail, K. A review of the association between depression and insulin resistance: pitfalls of secondary analyses or a promising new approach to prevention of type 2 diabetes? Curr Psychiatry Rep. 2012;14(1):8–14.
Holt, RI, De Groot, M, Golden, SH. Diabetes and depression. Curr Diab Rep. 2014;14(6):1–9.
Chen, SW, Li, X, Lang, X, Li, J, Zhang, XY. Metabolic parameters and thyroid hormones in relation to suicide attempts in patients with first-episode and drug-naive major depressive disorder with comorbid glucose disturbances: a large cross-sectional study. Eur Arch Psychiatry Clin Neurosci. s00406 2022:19.
Sharma, A, Verbeke, WJ. Improving diagnosis of depression with XGBOOST machine learning model and a large biomarkers Dutch dataset (n = 11,081). Front Big Data. 2020;3:15.
Holt, RI. Association between antipsychotic medication use and diabetes. Curr Diab Rep. 2019;19(10):1–10.
Zumoff, B, Hellman, L. Aggravation of diabetic hyperglycemia by chlordiazepoxide. JAMA. 1977;237(18):1960–1.
Hwang, JL, Weiss, RE. Steroid‐induced diabetes: a clinical and molecular approach to understanding and treatment. Diabetes Metab Res Rev. 2014;30(2):96–102.
Czarniak, P, Ahmadizar, F, Hughes, J, Parsons, R, Kavousi, M, Ikram, M, Stricker, BH. Proton pump inhibitors are associated with incident type 2 diabetes mellitus in a prospective population‐based cohort study. Br J Clin Pharmacol. 2022;88(6):2718–26.
Kan, C, Silva, N, Golden, SH, Rajala, U, Timonen, M, Stahl, D, Ismail, K. A systematic review and meta-analysis of the association between depression and insulin resistance. Diabetes Care. 2013;36(2):480–9.
Koopman, RJ, Mainous, AG III, Baker, R, Gill, JM, Gilbert, GE. Continuity of care and recognition of diabetes, hypertension, and hypercholesterolemia. Arch Intern Med. 2003;163(11):1357–61.
Tharmaraja, T, Stahl, D, Hopkins, CW, Persaud, SJ, Jones, PM, Ismail, K, Moulton, CD. The association between selective serotonin reuptake inhibitors and glycemia: a systematic review and meta-analysis of randomized controlled trials. Psychosom Med. 2019;81(7):570–83.
Deuschle, M. Effects of antidepressants on glucose metabolism and diabetes mellitus type 2 in adults. Curr Opin Psychiatry. 2013;26(1):60–5.
Ducat, L, Philipson, LH, Anderson, BJ. The mental health comorbidities of diabetes. JAMA. 2014;312(7):691–2.
Table 1. Baseline characteristics for study population with or without IGM in AUSOM.

Figure 1. Receiver operating characteristic (ROC) curve of models predicting impaired glucose metabolism. (A) ROC curve for the models according to algorithms. (B) ROC curve for internal and external validations. The performance of the models using the area under the receiver operating characteristic curve is compared.

Table 2. Top 10 important predictors of the prediction models for impaired glucose metabolism.

Figure 2. Kaplan–Meier curves in the stratified survival analysis. (A) Impaired glucose metabolism in the internal validation dataset of AUSOM. (B) Long-term outcome for the 3-year follow-up in the overall cohort of AUSOM.

Figure 3. Risk of long-term outcome in 3 years in patients predicted by the machine-learning model to have IGM within 1 year.
