Depression is a burdensome disease with a high prevalence, affecting one in 20 adults at any one time (Thornicroft et al., 2017). Not reaching full remission after initial treatment is a strong predictor of poor long-term prognosis, including relapse and recurrence of depression (Buckman et al., 2018b; Judd et al., 2000). Knowledge of factors associated with prognosis can be useful for patients and clinicians, informing the content of routine clinical assessments and decisions regarding the future clinical management of the patient's condition, and providing them with information they want to know (Trusheim, Berndt, & Douglas, 2007).
There have been a number of different approaches towards studying prognosis. For adults with depression, it has most commonly been studied in systematic reviews or randomised controlled trials (RCTs) that have focused on prognosis for those receiving a single treatment – typically, an antidepressant or cognitive behavioural therapy (Bower et al., 2013a, b; Chekroud et al., 2016; Driessen, Cuijpers, Hollon, & Dekker, 2010; Karyotaki et al., 2017). Such studies might identify a mixture of general prognostic factors applicable regardless of treatment type and prognostic factors unique to that treatment modality, but due to their design, they cannot distinguish between the two. For example, the predictive models from STAR*D that examined outcomes on the antidepressant citalopram were found to generalise to escitalopram–bupropion but not to venlafaxine–mirtazapine (Chekroud et al., 2016). At the outset of treatment, it is impossible to know what future treatments a patient will receive, so general information about prognosis that would apply to all treatments is of clinical value (Hippisley-Cox et al., 2007; Trusheim et al., 2007); this can be called ‘prognosis independent of treatment’. Another approach to studying prognostic factors is to identify people with depression from cohort studies.
Most cohorts have small numbers of people with depression and many have not sought treatment (Buckman et al., 2018b; Hardeveld, Spijker, De Graaf, Nolen, & Beekman, 2009). Therefore, inferences about prognosis from these samples can be imprecise and might not be generalisable to the population of help-seeking patients who are seen by clinicians. The approach taken in the current study is to examine data from the individual participants of a wide range of RCTs that have investigated a breadth of pharmacological, psychological and other interventions, amongst individuals seeking treatment for depression, and to partial out the effects of the randomisation in each study, to investigate the associations between patient characteristics and prognosis. In theory, depending on the breadth of the treatments used in the contributing studies, this approach allows for the investigation of prognostic factors that apply to any course of treatment and should therefore be more generalisable to a wider range of clinical circumstances.
Meta-analyses of individual patient data (IPD) collected from RCTs can provide an improved understanding of factors associated with prognosis independent of treatment (Bower et al., 2013a, b; Driessen et al., 2010; Gibbons, Hur, Brown, Davis, & Mann, 2012) as they are able to deliver greater power and therefore more precise estimates than individual studies or study-level meta-analyses (Driessen et al., 2010; Fisher, Carpenter, Morris, Freeman, & Tierney, 2017; Stewart et al., 2015). A meta-review of systematic reviews and meta-analyses, including IPD meta-analyses, was conducted to inform the methods and focus of the current study (online Supplementary Tables 1 and 2). That meta-review established that there is strong evidence of an association between the severity of depressive symptoms pre-treatment and prognosis with particular treatments (Bower et al., 2013a, b; Driessen et al., 2010; Weitz et al., 2015).
However, there is uncertainty over the strength and the clinical importance of the association due to a lack of reporting of effect sizes (Chekroud et al., 2016; Noma et al., 2019), and wide confidence intervals (CIs) in the studied effects (Fournier et al., 2010; Johnsen & Friborg, 2015; Weitz et al., 2015). As noted above, there is also the possibility that these associations are limited to patients receiving a particular type of treatment only, given the focus of past studies (Bower et al., 2013a, b; Driessen et al., 2010; Karyotaki et al., 2017). As such, the current evidence may not be useful for clinicians wanting to inform patients of their prognosis before a decision has been made regarding the type of treatment to start, or in settings where the particular treatments studied are not available.
The meta-review identified a number of other potential prognostic factors, including life events, social support and socio-demographics, which are beyond the scope of the current study (Buckman et al., 2021), and several others which are related to the severity of the mental health problem a patient with depression might present with in a clinic. These severity-related factors can be referred to as depressive ‘disorder characteristics’, in contrast to depressive ‘symptom severity’. Some of these ‘disorder characteristics’ such as duration and comorbidity with anxiety have been reported to be associated with response to a particular treatment [e.g. citalopram in STAR*D (Chekroud et al., 2016)], but there have been inconsistent findings and most studies did not adjust for depressive symptom severity (Johnsen & Friborg, 2015; Nakabayashi, Hara, & Minami, 2018; Noma et al., 2019). One study has found an interaction between duration and symptom severity suggesting that considerations of prognosis should not be limited to symptom severity alone (Lorenzo-Luaces, Rodriguez-Quintana, & Bailey, 2020). However, that was a study of adolescents, in a single sample, with two treatment types, their combination, or placebo, and was not able to consider a broader spectrum of depressive ‘disorder characteristics’. Previous studies have also rarely included data from primary care settings, or provided insufficient information about how participants were recruited to know if the results are generalisable to other health care settings.
Large proportions of adults seeking treatment for depression present in primary care (Olfson, Blanco, & Marcus, 2016; Thornicroft et al., 2017), so identifying prognostic factors in a primary care setting has important utility.
This study aimed to provide clinically useful estimates for prognostic factors that would apply whatever treatment a patient would receive. The specific aims were to investigate: (1) the degree to which depressive symptom severity is associated with prognosis for adults with depression in primary care, independent of treatment type; and (2) which depressive ‘disorder characteristics’ are associated with prognosis independent of treatment type, and independent of depressive symptom severity.
This study involved compiling an IPD dataset from RCTs of adults with depression who sought treatment in primary care. In order to thoroughly investigate the association between depressive ‘disorder characteristics’ and prognosis, a measure that captures a comprehensive set of such clinical features is required. Scoping searches were conducted to identify the most commonly used measure of this type in RCTs that recruited adults with depression in primary care; we established that this was the Revised Clinical Interview Schedule (CIS-R) (Lewis, Pelosi, Araya, & Dunn, 1992). The CIS-R is a measure commonly used in RCTs and epidemiological studies that has been translated into many languages (McManus, Bebbington, Jenkins, & Brugha, 2016; Subramaniam, Krishnaswamy, Jemain, Hamid, & Patel, 2006). It is used to measure symptoms and make diagnostic determinations of depressive and anxiety disorders in line with criteria from the International Classification of Diseases 10th edition (World Health Organization, 1992). The CIS-R is most commonly administered via a computerised program such that lay personnel can conduct the interviews, reducing clinical time and cost (Subramaniam et al., 2006). The use of this measure at baseline was made an inclusion criterion for the searches in order to minimise bias in harmonising data across RCTs. The methods for this systematic review and IPD meta-analysis were pre-registered (Buckman et al., 2020) [PROSPERO: CRD42019129512 (01/04/2019)]; for details of protocol amendments and derivations, see online Supplementary materials.
Identification and selection of studies
Studies were identified via searches on Medline, Embase, PsycINFO and Cochrane Central (inception to 1st December 2020), hand-searching of reference lists, and contacting experts for unpublished or missed studies. Full details of the searches are provided in online Supplementary Table 3.
Inclusion and exclusion criteria
Studies were included if they: were RCTs of adults (aged ⩾16 years) with unipolar depression, or with depressive symptoms significant enough for them to seek treatment, or a CIS-R (Lewis et al., 1992) score of ⩾12 (the usual case definition for a common mental disorder); recruited from primary care; had at least one active treatment arm; and used the CIS-R at baseline.
Studies were excluded if they were studies of: patients with depression secondary to a diagnosis of personality disorder, psychotic conditions or neurological conditions; bipolar or psychotic depressions; or children or adolescents. Feasibility studies were also excluded, as were studies of adults with either depression or an anxiety disorder, rather than a primary depression with or without comorbid anxiety.
See Table 1 for details of the included studies.
ADM, antidepressant medication; BDI-II, Beck Depression Inventory; EPDS, Edinburgh Postnatal Depression Scale; GHQ-12, General Health Questionnaire 12 item version; HADS-D, Hospital Anxiety and Depression Scale-depression subscale; iCBT, internet based therapist delivered cognitive behavioural therapy; MDD, major depressive disorder; T0, baseline; TAU, treatment as usual; TCA, tricyclic antidepressant.
The measures of depressive symptoms used to determine depressive ‘symptom severity’ and outcomes are noted in Table 1; details of all measures are given in online Supplementary Table 4.
Ethical considerations and trial registrations
All included studies were granted ethical approvals and all participants gave informed consent (online Supplementary Table 5). No additional NHS ethical approval was required for this study: HRA reference 712/86/32/81.
Data analysis plan
Details on determining study inclusion, data extraction, data handling and data management, risk of bias and study quality, secondary outcomes and sensitivity analyses, and results from these, are provided in online Supplementary materials.
The primary outcome was depressive symptoms at 3–4 months. This was captured in two ways: (1) z-score (standardised mean) of the scores on the four depressive symptom measures used at 3–4 months post-baseline in each study (Table 1). The score at 3–4 months was divided by the standard deviation for that measure calculated at 3–4 months. (2) The logarithm of depression scale scores irrespective of the measure used. Exponentiation of the regression coefficient provides an estimate of the percentage difference in symptoms.
It was expected that these methods would give broadly similar results but that the log outcome might have greater clinical utility as percentage differences might be more easily understood and do not require division by standard deviation estimates.
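The two outcome definitions above can be illustrated with a minimal sketch. It assumes only what is stated in the text: follow-up scores are divided by the standard deviation of that measure at follow-up, and a coefficient from a model of the log outcome is exponentiated to give a percentage difference. The function names are hypothetical, and the handling of zero scores before taking logarithms (the `offset` argument) is an assumption, as it is not described here.

```python
import math
import statistics


def scale_by_sd(scores):
    """Divide each follow-up score by the sample SD of that measure at follow-up."""
    sd = statistics.stdev(scores)
    return [s / sd for s in scores]


def log_outcome(scores, offset=0.0):
    """Natural log of scale scores.

    `offset` is a hypothetical adjustment for zero scores (log(0) is undefined);
    with the default of 0.0, math.log raises ValueError on a zero score.
    """
    return [math.log(s + offset) for s in scores]


def percent_difference(beta):
    """Convert a regression coefficient on the log outcome to a % difference."""
    return (math.exp(beta) - 1.0) * 100.0
```

For instance, under these assumptions a coefficient of `math.log(1.31)` on the log outcome corresponds to scores that are about 31% higher.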
Prognostic indicators under consideration
(1) Depressive symptom severity at baseline, taken as scores on the depressive symptom measures detailed in Table 1.
(2) Depressive ‘disorder characteristics’:
• the sum of the scores on the CIS-R anxiety subscales, and each individual subscale
• the number of comorbid common mental health disorders (CMDs), and each individual disorder
• the duration of depression
• the duration of anxiety individually and averaged across CIS-R anxiety subscales
• a history of depression
• a history of any previous treatment for depression
• a history of antidepressant treatment
• the degree of functional impairment
• alcohol misuse
Two-stage random effects meta-analyses were conducted for each prognostic factor. This approach removes variance due to the different depressive symptom measures used across the studies, removes potential biases by separating within-study from between-study effects, and allows for simpler construction of forest plots, and hence simpler assessment of heterogeneity, than one-stage approaches (Fisher, 2015; Fisher et al., 2017). It does so by analysing effects within each study first, before aggregating across studies. One-stage approaches have been favoured in other IPD meta-analyses as they allow for more complex modelling (Cuijpers et al., 2014; Weitz, Kleiboer, Van Straten, Hollon, & Cuijpers, 2017). However, as no complex modelling was necessary here, the two-stage approach was most suitable for the aims of the current study (Fisher, 2015).
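As an illustration of the first stage only, the sketch below estimates a within-study association and its standard error by ordinary least squares. It is a simplified, hypothetical example: the function name `stage_one_estimate` and the use of a single unadjusted predictor are assumptions made for brevity, whereas the models in this study also adjusted for covariates and treatment allocation.

```python
import math


def stage_one_estimate(x, y):
    """Within-study slope of outcome y on prognostic factor x, with its SE.

    Simple (one-predictor) least squares; covariate adjustment is omitted
    here purely to keep the illustration short.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least three participants to estimate an SE")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, se
```

Applying such a function separately to each study yields one estimate and one standard error per study, which are then the inputs to the second (pooling) stage.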
There were three sets of variables adjusted for in models of each outcome built for each prognostic factor:
(1) The ‘disorder characteristic’ adjusted for age, gender and the specific randomised treatment(s) in each study.
(2) As in (1) with the addition of depressive symptom severity.
(3) As in (2) with the addition of covariates specific to each prognostic indicator.
Covariates were added to the models above if they were: independently associated with the outcome and prognostic indicator; not multi-collinear with prognostic indicators in the model; not systematically missing and if they impacted the effect estimate for the association between prognostic indicator and outcome when included compared to when excluded from the model. Two factors considered a priori to be important covariates (age and gender) were controlled for in all models.
Final models were built with the primary outcomes, adding each prognostic indicator to the model in order of magnitude of effect from model 3 (one-by-one), and removing those no longer significantly associated with prognosis (at the 5% significance level) after adding subsequent factors. If two items were highly collinear, the one contributing least to the model was removed. In the final models, ordinal variables were re-categorised to assess the associations with prognosis in clinically meaningful groups (e.g. duration items were re-categorised into durations at baseline of less than or equal to 1 year, and greater than 1 year). The explanatory utility of the final models was assessed by considering the amount of variance in depressive symptom scale scores at 3–4 months explained by the models when adding each variable one-by-one, using the adjusted R² statistic; for details of how this was calculated see online Supplementary materials.
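For orientation, the conventional definition of the adjusted R² statistic is sketched below. This is the standard textbook formula and is an assumption here: the exact calculation used in this study is given in the online Supplementary materials and may differ, for example in how estimates were combined across studies. The function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared: penalises R-squared for the number of predictors."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
```

Because the penalty grows with the number of predictors, an increase in this statistic after adding a factor indicates that the factor explains variance beyond what would be expected from adding an uninformative variable.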
Meta-analyses were conducted using DerSimonian and Laird random effects models with the ‘ipdmetan’ package in Stata (Fisher, 2015). For the z-score and log outcomes at 3–4 months and 6–8 months (secondary outcome) linear regression models were fitted. Logistic models were fitted for remission (secondary outcome). The degree of heterogeneity was assessed using prediction intervals and its impact was assessed using the I² statistic (Higgins, Thompson, Deeks, & Altman, 2003).
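The pooling step can be sketched as follows. This is a minimal, generic illustration of the DerSimonian and Laird method-of-moments estimator and the I² statistic, written in Python for readability; it is not the ‘ipdmetan’ implementation, the function name is hypothetical, and prediction intervals are omitted.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Pool per-study estimates with DerSimonian-Laird random effects."""
    k = len(estimates)
    if k < 2:
        raise ValueError("need at least two studies")
    # Inverse-variance (fixed effect) weights and pooled estimate
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q and the method-of-moments between-study variance
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random effects weights add tau2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    pooled_se = math.sqrt(1.0 / sum_w_star)
    # I-squared: percentage of variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": pooled_se, "tau2": tau2, "q": q, "i2": i2}
```

When the per-study estimates are identical, Q is zero, so the between-study variance and I² are both zero and the random effects result coincides with the fixed effect result.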
Characteristics of the included studies
In total, 13 RCTs met inclusion criteria (Fig. 1). Data were not available for one study (Mynors-Wallis, Gath, Day, & Baker, 2000). Descriptions of the included studies are given in Table 1. Risk of bias was low in all studies and quality was rated as high (online Supplementary Tables 6 and 7).
A key question in this study was whether or not adjusting for depressive symptom severity attenuates the associations between depressive ‘disorder characteristics’ and prognosis independent of treatment; therefore, descriptive statistics are presented stratified by a median split of depressive symptom severity (Table 2). Means and standard deviations in each stratum are presented across all studies. Those with higher depressive symptom severity were more likely to have: identified as female; more comorbid mental health problems; longer durations of their mental health problems; lower social support; lower health-related quality of life; more adverse life events and greater social disadvantages, than those with lower baseline scores (Table 2).
Note: Numbers do not add up to total N due to missing data.
The association between depressive symptom severity and prognosis independent of treatment
Overall, depressive symptom severity was strongly associated with prognosis at 3–4 months post-baseline. On average, scores at 3–4 months were approximately 31% higher per standard deviation increase in depressive symptoms at baseline (Table 3).
a Adjusted for treatment allocation, age and gender only.
b Adjusted for baseline depression scale z-score, age, gender and treatment allocation.
c Additionally adjusted for.
d Employment status.
e Marital status.
f Employment status and marital status.
Associations between each potential depressive ‘disorder characteristic’ and prognosis
All depressive ‘disorder characteristics’ studied here were associated with prognosis at 3–4 months post-baseline independent of treatment, apart from a comorbid diagnosis of specific phobias, and hazardous alcohol misuse (Table 3). However, after adjustment for baseline depressive symptom severity, there was only evidence of a few ‘disorder characteristics’ being associated with prognosis. Patients with longer durations of depression or of anxiety had poorer prognoses than those with shorter durations. Similarly, patients with a history of depression or treatment for depression had poorer prognoses than those without such histories. However, there was no evidence that functional impairment or most comorbid diagnoses were associated with prognosis after adjusting for depressive symptom severity and covariates (model 3), with the exception of Chronic Fatigue Syndrome, and Panic Disorder.
Findings were consistent when using the z-score and log outcomes with one exception in model 3: using the z-score there was some evidence that each of the three variables capturing history of depression were associated with prognosis, but no such evidence when using the log outcome.
Independent associations between depressive ‘disorder characteristics’ and prognosis
Many ‘disorder characteristics’ were missing in two studies (Kendrick et al., 2006; Salisbury et al., 2016). The differences, when including or excluding those studies, in the effects of variables that were not systematically missing in any study were negligible (see online Supplementary materials). These studies were therefore removed from further primary analyses.
There was only evidence of an association with prognosis for six ‘disorder characteristics’ after adjusting for treatment, depressive symptom severity, covariates and other ‘disorder characteristics’ (online Supplementary Fig. 1). The associations for these six factors were similar across studies with potentially different populations, e.g. in those with ‘treatment resistant depression’ (COBALT), those with apparently less severe depression at baseline (PANDA) and those with postnatal depression (RESPOND); see online Supplementary Fig. 1.
Four ‘disorder characteristics’ were included in the final models in addition to depressive symptom severity (Table 4): duration of depression, average duration of anxiety symptoms, comorbid panic disorder and a history of antidepressant treatment. Although the latter was only significantly associated with prognosis when using the z-score outcome, when removing two studies with little variability in this factor due to their inclusion criteria (COBALT and MIR – see Table 1) there was greater evidence for an effect with the log outcome: 6.3% (95% CI: 0.3–12.7). It is noteworthy too that there was 0% heterogeneity in this effect, so there were no substantive differences in the association for studies that randomised to antidepressant treatments and those that did not. The sum of the anxiety subscale scores on the CIS-R, and a history of any previous treatment for depression, could be included in the final model in place of the average duration of anxiety and a history of antidepressant treatment, respectively, although they had weaker associations with outcomes than those retained in the final models (online Supplementary Table 12).
a Adjusted for depressive symptom severity, treatment allocation, age, gender, employment status and marital status.
b Adjusted for depressive symptom severity, depression duration, average anxiety duration, panic disorder, history of antidepressants, treatment allocation, age, gender, employment status and marital status. All models excluded data from AHEAD and HEALTHLINES.
c Using z-score at 3–4 months as the outcome.
d Using the natural log of the depressive symptom scale scores at 3–4 months.
e Dichotomised to less than or equal to 1-year, and greater than 1-year duration.
Patients who had durations of depression and anxiety greater than 1 year, comorbid panic disorder and a history of antidepressant treatment, i.e. those in the ‘high severity’ category on the above variables (n = 220), had on average 36.3% (95% CI: 12.4–65.2) higher scores at 3–4 months than patients with none of the above (n = 707). Adding all four ‘disorder characteristics’ to models in addition to depressive symptom severity led to substantial gains in the variance explained in the primary outcomes, which increased with each factor added (online Supplementary Table 10).
In this systematic review with IPD meta-analyses it was found that depressive symptom severity was strongly associated with prognosis independent of treatment. For every standard deviation increase in baseline depressive symptoms, depressive symptom scale scores were on average 31% higher at 3–4 months and 33% higher at 6–8 months, and the odds of remission at 3–4 months were approximately halved. Absolute differences were also assessed: for every 11-point increase in BDI-II scores at baseline, scores were about 7 points higher at 3–4 months on average, and for the studies that used the PHQ-9, for each 5-point increase at baseline, scores were approximately 5 points higher at 3–4 months.
Nearly all ‘disorder characteristics’ were associated with prognosis independent of treatment but only a handful were associated with prognosis independent of depressive symptom severity. This illustrates the importance of adjusting for baseline depression symptom severity when investigating prognosis of depression. The factors independently associated with prognosis were: duration of depression; average duration of anxiety (or severity of anxiety symptoms); comorbid panic disorder and a history of antidepressant treatment (or history of any treatment for depression). The history of treatment variables were not as consistently associated with outcomes as the other factors we identified and the association was relatively small.
There was a lack of evidence for an independent association between functional impairment and prognosis. Functional impairment has been found to be indicative of treatment response for people with either depression or anxiety disorders (Delgadillo, Moreea, & Lutz, 2016; Saunders, Buckman, & Pilling, 2020; Saunders, Cape, Fearon, & Pilling, 2016), so the single item used to capture it here might be insufficient. There was also a lack of evidence to support an association between hazardous alcohol misuse and prognosis; this is in line with previous studies that have found it to be related to dropping out of treatment but not to treatment outcomes, apart from when patients are alcohol dependent (Boschloo et al., 2012; Buckman et al., 2018a, b).
Findings in context
This study provides confirmation that depressive symptom severity is the strongest indicator of prognosis independent of treatment. A number of other studies have found symptom severity to be associated with outcomes but none have considered the association independent of a broad range of commonly available treatments in primary care settings. In addition, given the sample size in this IPD meta-analysis, this study was able to address the question of the strength and the clinical importance of the association between symptom severity and prognosis with greater precision than has been possible in other studies (Fournier et al., 2010; Johnsen & Friborg, 2015; Weitz et al., 2015). This study was also the first to comprehensively investigate associations between ‘disorder characteristics’ and prognosis independent of depressive symptom severity. There had been some suggestion from past studies that the duration of depression might be associated with prognosis (Carter et al., 2012; DeRubeis et al., 2014; Fournier et al., 2009; Lorenzo-Luaces et al., 2020; Noma et al., 2019) although there were inconsistencies and contradictory findings in past reviews (see online Supplementary Table 2) (Dodd et al., 2014).
In addition, there was limited evidence that comorbid anxiety (Carter et al., 2012; Chekroud et al., 2016) and a history of antidepressant use (Chekroud et al., 2016; Nakabayashi et al., 2018; Saunders et al., 2016) may be associated with outcomes from antidepressant treatments but perhaps not other types of treatment. Here, these were found to be associated with prognosis independent of treatment type, and two novel prognostic factors were also found: the average duration of anxiety problems and comorbid panic disorder.
Strengths and limitations
There are a number of strengths of the current study. A large dataset was assembled with approximately 98% of the participants in all eligible studies. Over 6000 participants were assessed with the most commonly used comprehensive measure of depressive and anxiety disorders in depression RCTs set in primary care: the CIS-R. This provided a broad range of prognostic factors to investigate and removed potential biases in harmonising data (Siddique et al., 2015; Weitz et al., 2017). All studies recruited those who had sought treatment in naturalistic, primary care settings. Follow-up rates were generally good and missing data at follow-up had little influence on the findings. A wide range of treatments were used within the randomised studies, including antidepressants, cognitive behavioural therapy of high and low intensities, physical activity and supportive counselling. Causal relationships were not the focus of the current study so confounding was not particularly relevant, but it was possible to adjust for a number of baseline covariates, adding robustness to the findings. A variety of methods were adopted to assess outcomes and these led to very similar findings, further supporting the conclusions of the current study.
The samples included in this review had been recruited to participate in RCTs, and only studies recruiting in the UK met the inclusion criteria, perhaps due to the use of the CIS-R, which may be less familiar to investigators outside the UK. Although it was more commonly used than other clinical interviews, it might have been possible to include studies using those less commonly used interviews too, and then to have conducted subgroup analyses per measure to address issues of harmonising biases. That notwithstanding, a number of studies that used the CIS-R and were conducted outside the UK, and a number of studies that used other clinical interviews, were returned in the scoping searches or full searches for this review; however, they often did not meet other inclusion criteria for this review (Husain et al., 2014; Patel et al., 2003). The inclusion of only RCTs and only those that used the CIS-R may have led to a biased sample of all patients with depression and could limit the generalisability of the findings. However, 11 of the 12 studies were pragmatic trials and recruited adults with new episodes of depression, so the participants should be representative of other depressed patients presenting to general practitioners/physicians and psychiatrists across the world. Furthermore, 11 of the 12 studies recruited participants who had actively sought treatment for depression; the other used a variety of methods including recruiting participants as they sought treatment but also calling those who had sought treatment for depression over the previous 2 years pre-baseline and asking if they were willing to be randomised to receive treatment for depression, or a placebo (Lewis et al., 2019).
The uniformity of the setting offers an improvement on the extant literature, in which there has been limited information about where participants were recruited from (Dal-Ré, Janiaud, & Ioannidis, 2018). Furthermore, there is nothing to suggest that the samples drawn from UK primary care are substantially different to samples of adults with depression seeking treatment elsewhere, and the treatments used in the included studies are common in many countries.
The data on duration were self-reported and relied upon a retrospective judgement, which is likely to have increased measurement error. It is possible that those with more depressive symptoms reported longer durations of illness because of negative cognitive biases, but adjustments were made for baseline depressive symptoms, minimising such bias. In any case, knowing that reported duration is a prognostic factor might be of clinical value even if this could be partly influenced by symptom severity (Lorenzo-Luaces et al., 2020).
Heterogeneity in some of the associations was high when considering the I² statistic. The study protocol specified that sensitivity analyses would be run where I² was above 75% for any factor, or above 50% for factors that were included in the final models, or if there were clear differences between the effects across the studies included in the IPD. More conservative limits for heterogeneity could have been set, but given that none of the sensitivity analyses substantively changed the findings related to any of the prognostic indicators, and given that all models were run with random effects for study, it seems unlikely that this would have had a meaningful impact on the results presented here.
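The protocol's thresholds can be expressed as a simple decision rule. The sketch below is illustrative only and is not the authors' analysis code: it assumes the standard definition of I² from Cochran's Q, I² = max(0, (Q − df)/Q) × 100 with df = k − 1 for k studies, and the function names and the example Q value are hypothetical. The third trigger described above (clear differences between study effects) is a judgement and is not encoded.

```python
def i_squared(q: float, k: int) -> float:
    """I^2 (as a percentage) from Cochran's Q across k studies.

    Assumes the standard formula: max(0, (Q - df) / Q) * 100, df = k - 1.
    """
    df = k - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def needs_sensitivity_analysis(i2: float, in_final_model: bool) -> bool:
    """Apply the thresholds stated in the text: above 75% for any factor,
    or above 50% for factors included in the final models."""
    threshold = 50.0 if in_final_model else 75.0
    return i2 > threshold


if __name__ == "__main__":
    # Hypothetical Q value for a factor pooled across 12 studies.
    i2 = i_squared(q=30.0, k=12)
    print(f"I^2 = {i2:.1f}%")
    print("Sensitivity analysis (final-model factor):",
          needs_sensitivity_analysis(i2, in_final_model=True))
    print("Sensitivity analysis (other factor):",
          needs_sensitivity_analysis(i2, in_final_model=False))
```

Under this rule the same level of heterogeneity can trigger a sensitivity analysis for a factor retained in the final models but not for one that was dropped, which reflects the stricter scrutiny the protocol applied to the reported models.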
Implications and conclusions
The differences in prognosis observed here were compared with a published estimate for the minimally important clinical difference. A previous study suggested this is approximately 17.5% in terms of BDI-II scores (Button et al., 2015). The finding that a one standard deviation increase in baseline depressive symptoms led to an approximate 31% difference at endpoint therefore suggests that such a change is clinically important. Four additional factors (the duration of anxiety, the duration of depression, comorbid panic disorder and a history of antidepressant treatment) were also independently associated with poorer prognosis. These depressive ‘disorder characteristics’ are not likely to be associated with clinically important differences when considered alone, but they might be when considered concurrently. For example, those in the ‘high severity’ category of all four factors had outcome symptom scores 36% higher than those in the lowest category. Although this only applied to a small proportion of the patients, more than 86% in this sample were in the ‘high severity’ category on at least two of these factors. It may therefore be important for clinicians to assess for all of these factors routinely, pre-treatment. All could be easily captured in clinic or with brief online questionnaires. Assessment of these factors would improve clinicians' ability to predict prognosis.
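The comparison against the minimally important clinical difference amounts to checking each percentage difference against the 17.5% threshold. A minimal sketch, using only the figures quoted above, is given below; the function name and dictionary labels are hypothetical, and treating a difference equal to the threshold as clinically important is an assumption of this sketch.

```python
# Threshold quoted in the text (Button et al., 2015), in percent.
MCID_PERCENT = 17.5


def exceeds_mcid(percent_difference: float, mcid: float = MCID_PERCENT) -> bool:
    """Return True if the absolute percentage difference in endpoint
    symptom scores is at least the minimally important clinical difference."""
    return abs(percent_difference) >= mcid


# Percentage differences reported in the text.
reported = {
    "baseline depressive symptoms, per 1 SD increase": 31.0,
    "all four disorder characteristics 'high severity' vs lowest": 36.0,
}

if __name__ == "__main__":
    for label, pct in reported.items():
        verdict = "clinically important" if exceeds_mcid(pct) else "below threshold"
        print(f"{label}: {pct:.0f}% -> {verdict}")
```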
Future research should ascertain what other factors are informative for prognosis after accounting for depressive ‘disorder characteristics’; whether effects are informative for treatment selection; whether earlier or more intensive treatments and more frequent reviews for those likely to have poor outcomes help mitigate these problems; and, conversely, whether more conservative management is sufficient for those with better prognoses.
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291721001367.
This study was supported by the Wellcome Trust through a Clinical Research Fellowship to JB (201292/Z/16/Z), Medical Research Council (Programme for IW: MC_UU_12023/21), MQ Foundation (for ZC: MQDS16/72), the Higher Education Funding Council for England, the National Institute for Health Research (NIHR), NIHR University College London Hospitals Biomedical Research Centre (RS, KC, PB and SP), NIHR Biomedical Research Centre at the University Hospitals Bristol and Weston NHS Foundation Trust and the University of Bristol (NW and DK), University College London (GA and GL), University of Pennsylvania (RDR), Vanderbilt University (SDH), University of Southampton (TK), University of Exeter (EW) and University of York (SG). The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.
The studies that make up the Dep-GP IPD database were funded by:
1. AHEAD: Health Technology Assessment programme of the National Health Service Research and Development Directorate.
2. CADET: UK Medical Research Council (MRC; reference G0701013), managed by the National Institute for Health Research (NIHR) on behalf of the MRC-NIHR partnership.
3. COBALT: The National Institute for Health Research Health Technology Assessment (NIHR HTA) programme (project number 06/404/02).
4. GENPOD: Medical Research Council and supported by the Mental Health Research Network.
5. HEALTHLINES: NIHR under its Programme Grant for Applied Research (Grant Reference Number RP-PG-0108-10011).
6. IPCRESS: BUPA Foundation.
7. ITAS: Primary Secondary Care Interface initiative of the United Kingdom National Health Service Research and Development Programme (PSI 2-58).
8. MIR: National Institute for Health Research (NIHR) Health Technology Assessment (HTA) programme (project 11/129/76) and supported by the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol.
9. PANDA: NIHR Programme Grant for Applied Research (RP-PG-0610-10048).
10. REEACT: UK National Institute for Health Research (NIHR) Health Technology Assessment (HTA) programme (project 06/43/05).
11. RESPOND: HTA programme as project number 02/07/04.
12. TREAD: National Institute for Health Research (NIHR) Health Technology Assessment (HTA) programme.
The funders of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report. All authors were fully independent of their respective funders and had responsibility for the decision to submit this manuscript for publication.
Conflict of interest
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional guides on the care and use of laboratory animals.