With rapid progress being made around the world in the identification of individuals at clinical high risk (CHR) for psychosis, there is now hope that serious psychotic disorders such as schizophrenia can be mitigated or prevented with early intervention strategies (Solis, 2014). However, the current standard CHR criteria are widely viewed as limited, as only 30% or fewer of CHR individuals convert to full psychosis within 2 years (Fusar-Poli et al., 2012). Efforts are thus underway to refine risk identification strategies to increase their predictive power. While promising, these strategies have largely focused on group-level research data. The ultimate goal, however, is to identify more precise risk markers to guide evidence-based, personalized treatments similar to those in other branches of medicine, such as cardiovascular and cerebrovascular diseases (Insel, 2007; Osawa, Nakanishi, & Budoff, 2017; Ridker, Buring, Rifai, & Cook, 2007). Work is proceeding in a number of CHR research centers to refine and personalize risk prediction as well as to improve treatment (Cannon et al., 2016; Carrion et al., 2016; Clark et al., 2016; Michel, Ruhrmann, Schimmelmann, Klosterkotter, & Schultze-Lutter, 2014; Woodberry, Shapiro, Bryant, & Seidman, 2016). These research groups have begun to examine prediction models combining clinical indicators with other variables. 
For example, models have been tested that utilize neurocognitive as well as clinical measures, as these may reasonably be applied in non-academic clinical settings.
The risk calculator (RC) is a tool that uses an individual's scores on a set of measures to yield an estimate of her/his overall level of risk. While used in other areas of medicine (Kattan, Yu, Stephenson, Sartor, & Tombal, 2013; Lee et al., 1999), it is a novel concept in the field of early psychosis. To the best of our knowledge, one of the most important RCs for predicting conversion to psychosis in CHR individuals became available on the Internet in 2016 (http://riskcalc.org:3838/napls/), based on clinical, cognitive, and demographic data from the NAPLS-2 sample (Cannon et al., 2016). While it is currently a research tool, it is possible that with further refinement and cross-validation, it could be implemented in clinical settings where personnel are well trained on the Structured Interview for Prodromal Syndromes (SIPS). Given this potential, the NAPLS-2-RC stands out as a notable landmark in the early identification of psychosis.
An important question at this stage of development is how the NAPLS-2-RC will work in samples with different cultural and social backgrounds and in other parts of the world, such as China. The issue is particularly significant in a populous country like China, which has about 12 million affected patients. The ‘ShangHai At Risk for Psychosis (SHARP)’ team has implemented methods very similar to those used in NAPLS-2 for the identification of CHR individuals in Mainland China since 2010 (Li et al., 2018; Zhang et al., 2014). In our previous study, 300 CHRs from SHARP-1 (i.e. the first phase of SHARP), ascertained via identical clinical procedures, were used in an attempt to replicate the predictive accuracy of the NAPLS-2 psychosis RC. Similar predictors were entered into the NAPLS-2 model to generate a psychosis risk estimate for each SHARP-1 case. However, the NAPLS-2-RC did not fit our SHARP data as well as it fit the original North American sample. Probability risk estimates yielded an accuracy of 0.631, reflecting only moderate predictive power (indicating that at least one-third of the risk estimates generated for Chinese CHR subjects using the NAPLS-2-RC may be false positives), which was much lower than the NAPLS-2 AUC (0.71) (Zhang et al., 2018). Development of an RC that is accurate for predicting psychosis in a Chinese population sample is critical to improving early psychosis interventions and research in China.
However, it is becoming clear that we also need to move beyond the calculation of overall psychosis risk estimates to a better understanding of the relative risk of specific contributing factors. To the degree that these risks are malleable, their relative contribution to an individual's overall risk for transition to a psychotic disorder could inform a personalized medicine approach to prevention and early intervention. To our knowledge, no previous study has concomitantly assessed the relative contribution of individual risk components to the overall risk estimates of a CHR sample. To improve the efficacy of early intervention, clinicians need guidance for determining treatment targets and prioritizing intervention strategies. A personalized risk estimate for CHR subjects can be generated using an available RC (such as the NAPLS-2-RC mentioned above), thereby identifying those with higher estimated risk as priority recipients of early intervention services, including more aggressive interventions. The individualization of this intervention, however, is driven by clinical judgment rather than empirical data. Individualized risk data are also needed to inform patient decisions about treatment. Tools to support more individualized assessment and management of risk are therefore needed. Ideally, clinicians can calculate the individual's risk components once the necessary information is available. Mobile phones are a particularly accessible and efficient means of performing such calculations. We endeavored, therefore, to develop a mobile app-based RC for this purpose.
Our goals in this study were twofold: (1) to develop and validate a SHARP-RC for generating a personal estimate of risk for imminent psychotic disorder, and simultaneously (2) to calculate the relative contribution of individual risk components.
Method
Project
The SHARP study represents a collaboration between the Beth Israel Deaconess Medical Center (BIDMC) in the USA (Boston, Massachusetts) and the Shanghai Mental Health Center (SMHC) in China. The Research Ethics Committees at the SMHC and the BIDMC approved these studies. A key element of the SHARP study is that in contrast to many other samples, the CHR participants have had no treatment of any kind for a psychiatric disorder, nor have they taken psychotropic medications. They also did not have any history of substance abuse or dependence according to specific exclusion criteria. Participants are not treated in the study, but receive treatment as usual by their community psychiatrist after their baseline assessment, as needed. As noted above, there were 300 CHRs who were recruited and assessed during 2012–2016 (online Supplementary data-1: eFig. S1).
Sample
All participants agreed to participate in the study. Subjects younger than 18 years of age had their consent forms signed by their parents and the youths gave informed assent. In the SHARP-1 sample, a total of 300 CHRs were identified in the course of face-to-face interviews using the SIPS (Miller et al., 2002, 2003). Among them, 228 (76.0%) completed neurocognitive assessments using the Chinese version of the MATRICS Consensus Cognitive Battery (MCCB) (Kern et al., 2008, 2011; Nuechterlein & Green, 2009) at baseline. Baseline demographic, clinical, and cognitive variables in the SHARP sample are presented in online Supplementary data-1: eTable S1. Additional details of the study have been reported elsewhere (Zhang et al., 2014, 2015, 2017; Zheng et al., 2012).
In the second phase of the SHARP study (SHARP-2), 100 CHRs were recruited between 2016 and 2017. Among them, 93 completed cognitive tests and at least 1 year follow-up. Data from these 93 CHRs were used as the validation sample for the SHARP psychosis RC. Baseline demographic, clinical, and cognitive variables in the SHARP-2 sample are presented in online Supplementary data-3.
Clinical outcome variables
Of the total 196 CHRs, 51 (26.0%) converted to full psychosis by 2 years of follow-up. Conversion to psychosis was defined using the POPS (Presence of Psychotic Symptoms in SIPS) criteria (McGlashan, Walsh, & Woods, 2010), that is, as developing at least one psychotic-level symptom (rated ‘6’ on the SIPS positive symptoms scale) with either sufficient frequency or duration.
In our previous investigation (Zhang et al., 2017), we observed that compared to the NAPLS-2 sample, a substantially higher percentage of participants in SHARP were prescribed antipsychotics after they entered the study (but after their clinical and cognitive assessments were completed). In the current study, among the final sample of 196 CHRs, 157 (80.1%) had taken antipsychotics, 41 (20.9%) had taken antidepressants, and one individual received four sessions of cognitive behavioral therapy by the final follow-up. As mentioned above, the treatments were administered by non-study psychiatrists working in the community after the baseline clinical and neurocognitive assessment.
Follow-up procedures
All participants from the first visit were followed up for at least 2 years once we had obtained their consent and intake evaluation information. All the CHRs who completed the baseline assessment were followed up every 6 months. Except for those who did not desire any further contact (17 of a total of 400 CHRs), the CHR participants were re-assessed by telephone at the sixth and 18th months and by face-to-face interview with the SIPS at the 12th and 24th months. The determination of a clinical outcome was based mainly on the face-to-face interviews (of 196 CHRs, 119 had at least one face-to-face interview) and partly on telephone interviews.
Data analysis
The exploratory factor analysis procedure was performed using the principal components analysis and varimax rotation with Kaiser normalization. The number of factors retained in the analysis was based on retaining factors that accounted for >10% of the common variance as well as interpretability. Then, using the factor loading coefficients, we calculated the estimated factor scores for each factor for all CHR subjects.
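The scoring step described above can be illustrated with a minimal sketch. It assumes that each factor score is a weighted sum of z-standardized variables; the variable names, toy values, and weights below are hypothetical and are not taken from the SHARP data or from Table 2.

```python
from statistics import mean, stdev


def standardize(column):
    """Convert raw values to z-scores using the sample mean and SD."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]


def factor_scores(data, weights):
    """Return one weighted sum of z-scored variables per subject.

    data:    dict mapping variable name -> list of raw values (one per subject)
    weights: dict mapping variable name -> weight for ONE factor (assumed)
    """
    z = {name: standardize(values) for name, values in data.items()}
    n_subjects = len(next(iter(data.values())))
    return [
        sum(weights[name] * z[name][i] for name in weights)
        for i in range(n_subjects)
    ]


# Hypothetical toy data: three subjects, two negative-symptom items.
toy_data = {"N1": [1, 3, 5], "N2": [2, 2, 5]}
# Hypothetical weights, NOT the coefficients reported in the paper.
toy_weights = {"N1": 0.6, "N2": 0.4}

scores = factor_scores(toy_data, toy_weights)
```

Because every variable is standardized first, the resulting scores are centered on zero, so a positive score marks a subject who is above the sample average on that factor.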
All estimated factor scores were entered into a multivariate logistic regression model. A Hosmer–Lemeshow goodness-of-fit test was performed to assess the calibration of the predictive logistic regression model. The Wald χ2 statistic was used to test the significance of individual factors in the model. Bootstrap resampling (B = 5000 bootstrap samples) was used to test the robustness of the final predictive model. The bootstrap estimate of bias is an estimate of the bias between a function of the sample and the same function evaluated in the population. These bootstrap-adjusted measures represent the values that can be expected when the model is applied to future similar populations.
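The idea behind the bootstrap estimate of bias can be sketched on a much simpler statistic than a regression model. The example below is an illustration only, not the authors' procedure: it resamples a hypothetical toy sample with replacement and compares the average resampled statistic with the original one. The function names and data are assumptions.

```python
import random
from statistics import mean


def plug_in_variance(sample):
    """Variance with divisor n, a statistic known to be biased downward."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def bootstrap_bias(sample, statistic, n_boot=5000, seed=0):
    """Bootstrap bias: mean of the resampled statistics minus the original."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    original = statistic(sample)
    replicates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return mean(replicates) - original


# Hypothetical toy sample of ten values.
toy_sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
bias = bootstrap_bias(toy_sample, plug_in_variance)
```

For a prediction model the same logic is applied to a performance measure: the model is refitted in each resample, and the average gap between resampled and original performance is used to adjust the apparent estimate.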
In order to apply the SHARP-RC in the most efficient and convenient way possible, a mobile app was designed to accept the input variables, perform the calculations, and output the risk estimate together with each factor's proportional contribution to the individual's risk. An introduction to the app and a basic example of how to use it are provided in the online Supplementary data-4 and the Supplementary Instructions video.
Estimated factor scores for each SHARP-2 CHR case were entered into the prediction model, and a new variable of individual model-predicted risk was constructed. The discriminative power of the model-predicted risk probabilities for the conversion outcome was assessed using the area under the receiver operating characteristic curve (ROC-AUC). A plot of the model-predicted risk probabilities v. the actual outcomes was used to assess the calibration performance of the prediction model.
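As an illustration of what the ROC-AUC measures, the sketch below computes it directly as the probability that a randomly chosen converter receives a higher predicted risk than a randomly chosen non-converter, with ties counted as one half. The function name and the four toy cases are hypothetical.

```python
def roc_auc(scores, outcomes):
    """AUC as the share of converter/non-converter pairs ranked correctly."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (1 = converted, 0 = did not).
toy_risk = [0.10, 0.40, 0.35, 0.80]
toy_outcome = [0, 0, 1, 1]
auc = roc_auc(toy_risk, toy_outcome)  # 3 of 4 pairs ranked correctly: 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of converters from non-converters.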
Results
Sample characteristics
Baseline characteristics of CHR subjects are summarized in online Supplementary data-1 eTable S1(a, b). There were significant differences between those who did and did not convert on 11 SIPS variables and three MCCB sub-tests (Table 1).
a Drop GAF: decline in the GAF (Global Assessment of Functioning) score at baseline from the highest score in the past year.
b t/Z/χ2: t for independent t test, Z for Mann–Whitney U test (non-parametric test), χ2 for κ test.
Exploratory factor analysis
The exploratory factor analysis of the 14 selected clinical and cognitive variables resulted in four factors (Table 2). The four factors had eigenvalues >1.0, whereas the 10 factors not retained had eigenvalues <1.0. The first factor, with an eigenvalue of 4.99 and high loading coefficients (>0.35) for N1-Social-Anhedonia, N2-Avolition, N3-Expression-of-Emotion, N4-Experience-of-Emotions-and-self, N5-Ideational-Richness, and D4-Impairment-in-Personal-Hygiene, was labeled ‘negative symptoms’. The second factor, with an eigenvalue of 1.68 and high loading coefficients for a Drop-in-GAF-score, Current-GAF, and N6-Occupational-Functioning, was labeled ‘general function’. The third factor, with an eigenvalue of 1.47 and high loading coefficients for Trail-Making-Test (TMT), Brief-Assessment-of-Cognition-in-Schizophrenia (BACS), and Brief-Visuospatial-Memory-Test (BVMT), was labeled ‘cognitive performance’. Finally, the fourth factor, with an eigenvalue of 1.18 and high loading coefficients for Total-Positive-Symptoms and D2-Bizarre-Thinking, was labeled ‘positive symptoms’.
GAF, Global Assessment of Functioning; TMT, Part A of Trail Making Test; BACS, Brief Assessment of Cognition in Schizophrenia Symbol Coding Test; BVMT, Revised Brief Visuospatial Memory Test.
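The reported eigenvalues can be put in context with a short calculation. In a principal components analysis of standardized variables, each eigenvalue divided by the number of variables gives that component's share of the total variance. The sketch below applies this to the four reported eigenvalues; the variable names are illustrative. Note that this is the share of total variance, which may differ from the share of common variance referred to in the Data analysis section.

```python
# Eigenvalues reported in the text for the four retained factors.
eigenvalues = [4.99, 1.68, 1.47, 1.18]
n_variables = 14  # number of variables entered into the analysis

# Keep only factors with eigenvalue above 1.0, as described in the text.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Share of total variance carried by each retained factor.
proportions = [ev / n_variables for ev in retained]
cumulative = sum(proportions)

for ev, share in zip(retained, proportions):
    print(f"eigenvalue {ev:.2f} -> {share:.1%} of total variance")
print(f"cumulative: {cumulative:.1%}")
```

On these figures the four factors together carry roughly two-thirds of the total variance, with the first factor alone contributing a little over one-third.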
Predictive model development
A multiple logistic regression analysis was conducted to predict conversion using estimated factor scores as predictors. Standardized regression coefficients and raw regression coefficients are provided in Table 3. The Hosmer–Lemeshow test showed good calibration for the model (χ2 = 5.520, p = 0.269). The overall model achieved a classification accuracy rate of 78.1%. Bootstrapping confirmed that the multivariate logistic regression equation based on four factors was not overfit to the data, suggesting that our model might be generalized to other CHR samples.
Notes: Beta is the regression coefficient. s.e. is the standard error. 95% CI is the estimated 95% confidence interval for the corresponding parameter. β is the standardized regression coefficient.
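To make the link between the regression coefficients and an individual risk estimate concrete, the sketch below applies the standard logistic transformation to a linear combination of four factor scores. The intercept and coefficients are hypothetical placeholders, not the values in Table 3, and the function name is an assumption.

```python
import math


def predicted_risk(factor_scores, coefficients, intercept):
    """Logistic model: risk = 1 / (1 + exp(-(b0 + sum(b_i * F_i))))."""
    linear = intercept + sum(b * f for b, f in zip(coefficients, factor_scores))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical intercept and coefficients (NOT the fitted SHARP-RC values).
toy_intercept = -1.2
toy_coefficients = [0.5, 0.3, 0.6, 0.4]

# A subject at the sample average on every factor (all scores zero).
average_case = predicted_risk([0.0, 0.0, 0.0, 0.0], toy_coefficients, toy_intercept)
# A subject with elevated scores on all four factors.
elevated_case = predicted_risk([1.0, 0.5, 1.0, 0.5], toy_coefficients, toy_intercept)
```

With all factor scores at zero, the estimate depends only on the intercept; each positive coefficient then raises the risk as the corresponding factor score increases.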
Model performance
The values of risk probabilities were generated in the regression model for each case and then used for ROC analysis. Figure 1 shows that discrimination for the conversion outcome was better for risk probabilities [area under the ROC curve 0.78 (95% CI 0.71–0.86), p < 0.001] as compared with each estimated factor [Factor-1 0.65 (0.56–0.74) (p = 0.001), Factor-2 0.58 (0.50–0.67) (p = 0.075), Factor-3 0.66 (0.56–0.76) (p = 0.001), Factor-4 0.64 (0.54–0.73) (p = 0.004)].
Risk components
Individual risk components generated by SHARP model for both the development and validation samples are detailed in online Supplementary data-2 eTable S2 and eFig. S2. No differences were observed between the development and validation samples on risk components or their relative distribution.
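The exact formula used to derive the risk components is given in the supplementary material rather than in this text, so the following is only one plausible sketch of the idea. It assumes that each factor's component is its share of the risk-raising part of the logistic model's linear predictor; the coefficients, scores, and function name are hypothetical.

```python
def risk_components(factor_scores, coefficients, labels):
    """Share of each factor in the risk-raising part of the linear predictor.

    Assumption for illustration: contributions that lower the risk
    (negative products) are set to zero before normalizing.
    """
    contributions = [max(b * f, 0.0) for b, f in zip(coefficients, factor_scores)]
    total = sum(contributions)
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: c / total for label, c in zip(labels, contributions)}


labels = [
    "negative symptoms",
    "general function",
    "cognitive performance",
    "positive symptoms",
]
# Hypothetical coefficients and factor scores for one individual.
toy_coefficients = [0.5, 0.3, 0.6, 0.4]
toy_scores = [1.0, -0.5, 1.0, 0.5]

shares = risk_components(toy_scores, toy_coefficients, labels)
```

In this toy case the largest share falls on cognitive performance, which is the kind of output that could point a clinician toward a specific treatment target.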
Model application
To further describe risk probabilities generated from the current model, Fig. 2 provides frequency distributions of risk probabilities for converters and non-converters. The trend lines (power method) for converters and non-converters on the bar chart crossed at a model-predicted risk of 0.20. Table 4 provides statistics for the prediction of actual conversion to psychosis across several thresholds of model-predicted risk.
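Threshold statistics of the kind summarized in Table 4 can be computed as in the sketch below. It assumes that a case is flagged when its predicted risk is at or above the cutoff; the eight toy cases are hypothetical and are not SHARP data.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def threshold_stats(risks, outcomes, cutoff):
    """Sensitivity, specificity, PPV and NPV at one risk cutoff."""
    tp = fp = tn = fn = 0
    for risk, converted in zip(risks, outcomes):
        flagged = risk >= cutoff  # assumption: at or above the cutoff is flagged
        if flagged and converted:
            tp += 1
        elif flagged and not converted:
            fp += 1
        elif not flagged and converted:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
    }


# Hypothetical predicted risks and outcomes (1 = converted, 0 = did not).
toy_risks = [0.05, 0.15, 0.22, 0.30, 0.10, 0.45, 0.60, 0.18]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
stats = threshold_stats(toy_risks, toy_outcomes, cutoff=0.20)
```

Repeating the call over a list of cutoffs yields one row of statistics per threshold, which is how a table such as Table 4 can be assembled.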
Preliminary calculator validation
The validation sample included 93 CHR subjects who completed 1-year follow-up and baseline cognitive tests, and were comparable on demographic, clinical, and cognitive variables to the development sample (online Supplementary data-3). The SHARP-RC was then used to provide probability estimates of conversion to psychosis for each individual in this sample. The ROC analysis resulted in an AUC of 0.803 (p = 0.003, 95% CI 0.671–0.935) for the probability risk estimates. In the external validation sample, a predicted risk of 35% provided a better balance between sensitivity and specificity (77.8% and 67.9%, respectively), whereas a predicted risk of 20% (the original cutoff) yielded a sensitivity of 100% and a specificity of 47.6%.
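The notion of balance between sensitivity and specificity can be quantified in more than one way. The sketch below takes the validation figures reported above and computes two common summaries: the absolute gap between sensitivity and specificity, and Youden's J (sensitivity + specificity − 1). The choice of these two summaries is an assumption made for illustration; the text does not state which criterion was used.

```python
# Sensitivity and specificity reported for the validation sample.
reported = {
    0.20: {"sensitivity": 1.000, "specificity": 0.476},
    0.35: {"sensitivity": 0.778, "specificity": 0.679},
}


def summarize(sensitivity, specificity):
    """Two simple summaries of a sensitivity/specificity pair."""
    return {
        "youden_j": sensitivity + specificity - 1.0,
        "gap": abs(sensitivity - specificity),
    }


summary = {cutoff: summarize(**values) for cutoff, values in reported.items()}

for cutoff, values in summary.items():
    print(f"cutoff {cutoff:.2f}: J = {values['youden_j']:.3f}, gap = {values['gap']:.3f}")
```

On these figures the 35% cutoff gives the smaller gap between sensitivity and specificity, whereas Youden's J is marginally higher at 20%, so the preferred cutoff depends on whether missed converters or false alarms are considered more costly.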
Discussion
The SHARP-RC was designed to help better understand and stratify psychosis risk and to improve decision-making about prevention measures. Furthermore, we employed an app-based approach to facilitate data gathering and analyses. To the best of our knowledge, this is the first attempt to develop an app-based RC and also the first RC using a dataset from an Asian sample. Our data show that this app-based SHARP-RC obtained an AUC-ROC of 0.78, which indicates acceptable discrimination ability and an accuracy of psychosis prediction comparable to findings reported in NAPLS-2. This result suggests that the SHARP-RC may be useful in clinical applications in China. More specifically, for CHR youth with SHARP-RC estimates higher than 20%, these estimates had excellent sensitivity (84%) and good specificity (63%) for the prediction of psychosis. Besides the risk estimates, another advantage over existing RCs is that the loadings of four factors (negative symptoms, general function, cognitive performance, and positive symptoms) can be calculated for each individual. This information provides a critical first step toward being able to recommend individualized interventions once there is sufficient evidence in the field of early psychosis [e.g. 
N-methyl-D-aspartate-receptor modulators (negative symptoms) (Devoe, Peterson, & Addington, 2018), cognitive remediation (cognitive impairments) (Liu, Keshavan, Tronick, & Seidman, 2015; Loewy et al., 2016), low-dose antipsychotic medications (positive symptoms) (Fusar-Poli, Valmaggia, & McGuire, 2007; McGorry et al., 2002), rehabilitation-focused psychological interventions (general function) (Stafford, Jackson, Mayo-Wilson, Morrison, & Kendall, 2013)].
Consistent with the NAPLS-2-RC (Cannon et al., 2016; Carrion et al., 2016) and our previous findings (Li et al., 2018; Zhang et al., 2017), the baseline severity level of positive symptoms, global function decline, and the BACS neurocognitive tests were significant predictors of psychosis in the SHARP sample. Of note here, evidence has been accumulating that baseline thought disorder symptoms and functional deterioration are key risk factors for the onset of psychosis in CHR syndromes (Fusar-Poli et al., 2013). Our data highlight the importance of a declining GAF score for the prediction of psychosis. In addition to the NAPLS-2-RC components (positive symptoms, poor cognition, and functional decline), the SHARP-RC included negative symptoms as a predictor variable. Increasing evidence (Healey et al., 2018; Piskulic et al., 2012) supports baseline negative symptoms as key risk factors for predicting the onset of psychosis.
In contrast to the NAPLS-2 model, scores on the HVLT-R neurocognitive tests, age, and family history of a psychotic disorder were not included in the SHARP-RC. One possible cause of this discrepancy is the difference in the method of predictor variable selection between SHARP (data-driven) and NAPLS-2 (experience-driven, in which indicator selection was based on empirical links to psychosis prediction in two or more prior studies of CHR cases). In addition, mean HVLT-R scores in the SHARP sample differed from those in the NAPLS-2 sample; here, the effect of differences in the reliability or validity of MATRICS tests across populations cannot be ruled out.
It has been reported that the performance of the NAPLS-2-RC for predicting conversion is good in the development sample (AUC 0.71) and in the EDIPPP validation sample (0.79). The current study found similar performance in the SHARP-1 development sample (0.78) and the SHARP-2 validation sample (0.80). However, when the NAPLS-2-RC was used to calculate risk for the SHARP-1 sample, it reached only modest performance (0.63) (Zhang et al., 2018), which implies possible limits to generalization across different regions. One potential source of this discrepancy may be that the conversion rate in the SHARP sample is higher than in both the NAPLS-2 and EDIPPP samples. This implies that the SHARP sample may be a higher-risk sample or may have received less treatment than the samples in the other two studies. Recruitment in SHARP is confined to a clinical setting, which differs substantially from NAPLS-2 and EDIPPP, which rely heavily on intensive community outreach and American mental health service delivery systems.
Strengths and limitations: Data-driven calculators are generally limited in their ability to explain the logic underlying the resulting algorithm. A strength of our study is that the four factors generated in the SHARP-RC are consistent with robust findings in the CHR literature. More importantly, these four factors are not only used for the calculation of psychosis risk but also provide critical information on the composition of that risk, which is valuable to clinicians in making an early intervention plan. Another strength is that, in contrast to the NAPLS-2 data, which were collected from eight sites (Addington et al., 2012), the current SHARP sample was recruited by one team from one catchment area, which may be advantageous for its homogeneity. However, the use of a single center for sample recruitment could also limit our ability to generalize the findings. A limitation of the study is that it is based only on the sample that received follow-up assessments for 2 years. Although the majority of conversions happen in the first 2 years (Fusar-Poli et al., 2012), we cannot exclude the chance that some non-converters were incorrectly classified. We note that the ongoing SHARP program will further validate this model with a complete 4–7-year follow-up sample. It is also important to note that this CHR cohort was followed naturalistically, and the various medications the participants took with varying compliance during the follow-up period may have confounded the results of clinical outcome assessments, thereby limiting the generalizability to CHR subjects who do not take any medication. 
Moreover, although only 32 CHRs were lost to follow-up, they demonstrated more severe positive symptoms and poorer functioning at baseline compared with those who completed follow-up, which could bias our results by underestimating the clinical severity of our sample. Finally, as emphasized by Cannon et al. (2016) and Carrion et al. (2016), the RC remains experimental. At this point, it should be used only in research settings and by clinicians who have had rigorous SIPS training (SIPS scores being at the core of the model); it should not yet be used in general clinical settings with individuals until there are appropriately trained clinicians in those settings and the app's clinical utility and properties are validated more firmly.
In summary, the present study pioneers the development and validation of the first individualized psychosis RC from an Asian population. The mobile app-based SHARP-RC performs well, is widely compatible, and is easily applied in clinical services and research. More importantly, this app-based RC can be used as a stand-alone version with no need to connect to the Internet, which helps to protect the security of CHRs' personal information. A risk estimate higher than 20% could serve as a cut-off point for clinical application of the SHARP-RC. More severe negative and positive symptoms and poorer general and cognitive functioning were important predictors in the SHARP sample. Although validation of the SHARP-RC in other external datasets is needed, at present the SHARP-RC appears to have considerable potential for predicting possible clinical outcomes and for providing a foundation for Chinese and other clinicians to make treatment and stepped-care decisions for those who meet CHR criteria.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S003329171900360X.
Acknowledgements
This study was supported by the Ministry of Science and Technology of China, National Key R&D Program of China (2016YFC1306800), National Natural Science Foundation of China (81671329, 81671332, 81361120403), Shanghai Jiaotong University Foundation (ZH2018ZDB03, ZH2018QNB19, YG2016QN42), Shanghai Key Laboratory of Psychotic Disorders (13dz2260500), Science and Technology Commission of Shanghai Municipality (19441907800, 17411953100, 16JC1420200), The Clinical Research Center at Shanghai Mental Health Center (CRC2018ZD01, CRC2018ZD04, and CRC2018YB01), and Shanghai Mental Health Center Foundation (2017-TSXK-03). This study was also supported by an R21 Fogarty/NIMH (1R21 MH093294-01A1), ‘Broadening the Investigation of Psychosis Prodrome to Different Cultural Groups’ by an R21 NIMH ‘Enhancing Intervention of Attenuated Psychosis Syndrome with M-Health Technology’ (1R21MH113674) and by a US-China Program for Biomedical Collaborative Research (R01) (1R01 MH 101052-01), ‘Validating Biomarkers for the Prodrome and Transition to Psychosis in Shanghai’ and by the United States NIMH (K23 MH102358).
Conflict of interest
None of the authors had a conflict of interest.