The burden of treatment-resistant depression (TRD) is challenging to quantify. It has eluded a universal definitionReference Fekadu, Donocik and Cleare1 but is prevalent and encompasses considerably greater severity, chronicity, recurrence, admission to hospital and comorbidity with both psychiatric and non-psychiatric disorders than non-resistant major depressive disorder (MDD).Reference Fekadu, Wooderson, Markopoulo, Donaldson, Papadopoulos and Cleare2 Despite this, TRD has been a neglected area of research with numerous reviews calling for more comprehensive evidence. Indeed, many of these reviews have considered people as treatment resistant if they have failed one previous treatment trial (in contrast with the most popular guidelinesReference Fekadu, Donocik and Cleare1), in part because this represents the inclusion criteria frequently used in clinical trials. One such example examined the pharmacological augmentation treatments that the majority of TRD patients are treated with in practice.Reference Zhou, Ravindran, Qin, Del Giovane, Li and Bauer3 Only when using the less-stringent criteria of TRD was there sufficient evidence for a network meta-analysis in 2015,Reference Zhou, Ravindran, Qin, Del Giovane, Li and Bauer3 and the authors reported significant efficacy of quetiapine, aripiprazole, lithium and thyroid hormone compared with placebo. However, this evidence may not apply to people with more severe TRD. 
Pre-post analyses have the benefit of not requiring a placebo arm and the ability to compare effectiveness estimates between heterogeneous treatment approaches.Reference Bandelow, Reitt, Röver, Michaelis, Görlich and Wedekind4 Additionally, pre-post effect sizes (ESs) provide good clinical face validity as an estimate of the magnitude of effects seen with treatment in practice, incorporating both those specific to the individual modality as well as non-specific effects and the passage of time.Reference Bandelow, Reitt, Röver, Michaelis, Görlich and Wedekind4
This review aimed to qualify and quantify the evidence of augmentation treatments for TRD by using the most common clinical definition (i.e. two or more failed treatments in current episode) and to compare ESs across psychological and pharmacological interventions. To our knowledge, this is the first meta-analysis comparing pre-post treatment effects for all augmentation therapies across the two most popular treatment classes for depression in clinical practice. Specifically, our objectives were to:
(a) determine the efficacy of adjunctive interventions for TRD through comparisons between treatment category (i.e. pharmacological or psychological), class (e.g. antipsychotics, mood stabilisers) and individual treatments;
(b) provide an indication of the acceptability and tolerability of these treatments.
Criteria for considering studies for the review
The protocol for this systematic review was published via PROSPERO (registration code CRD42018088009),Reference Strawbridge, Carter, Marwood, Taylor, Mantingh and Nikolova5 where full details of the search are available. The review is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Types of included studies
Only randomised controlled trials (RCTs) of at least 10 participants and at least one suitable augmentation treatment were included.
Types of participants
Participants must have been adults with TRD, defined as unremitted depression despite at least two courses of treatment of adequate dose and duration undertaken in the current episode (current best-practice guidelinesReference Conway, George and Sackeim6). Both within-class (in addition to between-class) switching of antidepressants and psychological treatments were permitted as they are considered valid contributors to a TRD definition.Reference Fekadu, Donocik and Cleare1 Studies including patients with psychotic or bipolar depression were excluded because of clear treatment distinctions.
Types of interventions
Participants must have been taking at least one continuation treatment prior to randomisation to a new (augmentation) intervention. The same eligibility criteria were employed for both continuation and augmentation treatments: permitted pharmacological treatments were any included in the Maudsley Treatment InventoryReference Fekadu, Donocik and Cleare1 and psychological treatments from the National Institute for Health and Clinical Excellence depression guidelines7 or those with multiple meta-analyses supporting use in depression. Eligible comparator treatments included pill placebo, another pharmacological agent, another psychological intervention, waiting list, active control or treatment as usual (TAU).
Types of outcome measures
The primary outcome was clinical improvement (ES) between pre- and post-treatment time points for each eligible treatment/comparator arm. One efficacy measurement was selected, prioritising validated, clinician-rated measures of depression severity (and if not available, a patient-rated depression scale or assessment of global improvement if no depression symptom scale was reported).
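As an illustration of the primary outcome, a pre-post effect size for a single trial arm could be computed along the following lines. This is a minimal sketch rather than the review's actual procedure: it assumes that the mean change is standardised by the baseline standard deviation and that the usual small-sample approximation for Hedges' correction is applied with n − 1 degrees of freedom. The function name and the example values are hypothetical.

```python
def prepost_hedges_g(mean_pre, mean_post, sd_pre, n):
    """Pre-post Hedges' g for one trial arm (illustrative sketch).

    Assumptions (not stated in the review): the change score is
    standardised by the baseline SD, and the small-sample correction
    uses the approximation J = 1 - 3 / (4 * df - 1) with df = n - 1.
    """
    if n < 2 or sd_pre <= 0:
        raise ValueError("need n >= 2 and a positive baseline SD")
    # Positive values mean symptom scores fell between baseline and endpoint.
    d = (mean_pre - mean_post) / sd_pre
    df = n - 1
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * d


# Hypothetical arm: depression score falls from 30 to 20 (baseline SD 8, n = 50).
example_g = prepost_hedges_g(mean_pre=30.0, mean_post=20.0, sd_pre=8.0, n=50)
print(round(example_g, 2))
```

Standardising by the baseline SD is only one of several conventions for pre-post designs; a different standardiser would change the magnitude of the ES but not its direction.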
A measure of adherence (e.g. trial dropout due to any cause or treatment adherence data) and a measure of tolerability (e.g. adverse event or side-effects data) were recorded where available.
Search methods for identification of studies
MEDLINE and Institute for Scientific Information (ISI) Web of Science were searched in addition to citation lists from notable papers, available reviews and included articles. The following medical subject headings or text word terms were used for the electronic database search (all fields): (depress* OR MDD OR major depress*) AND (resistan* OR refractor* OR non-respon* OR nonrespon* OR un-respon* OR unrespon* OR TRD OR fail* OR inadequate OR difficult OR intractable) AND (augment* OR adjunct* OR add-on OR combin* OR co-administ*) AND (randomi* OR RCT) AND (treatment OR intervention OR trial). No language restriction was made.
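The Boolean structure of the search string above (terms joined with OR within each concept, concepts joined with AND) can be sketched as follows. The term lists are copied from the search strategy; the helper name `build_query` is hypothetical, and the snippet only assembles the string rather than querying any database.

```python
# Term groups copied from the search strategy described above.
TERM_GROUPS = [
    ["depress*", "MDD", "major depress*"],
    ["resistan*", "refractor*", "non-respon*", "nonrespon*", "un-respon*",
     "unrespon*", "TRD", "fail*", "inadequate", "difficult", "intractable"],
    ["augment*", "adjunct*", "add-on", "combin*", "co-administ*"],
    ["randomi*", "RCT"],
    ["treatment", "intervention", "trial"],
]


def build_query(groups):
    """Join terms with OR inside each group and AND between groups."""
    return " AND ".join("(" + " OR ".join(terms) + ")" for terms in groups)


query = build_query(TERM_GROUPS)
print(query)
```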
Data collection and analyses
Article review and data extraction
All search results were evaluated against inclusion criteria independently by pairs of review authors (R.S., L.M., R.T., T.M., V.d.A., D.T., V.L.N. and F.P.), with disparities addressed by consensus with additional review authors (A.H.Y., A.J.C., B.C.). Following inclusion, data extraction was conducted by authors as above.
Methodological quality was assessed using the Scottish Intercollegiate Guidelines Network8 and the Cochrane Risk of Bias (RoB)Reference Higgins and Green9 tools. Studies were assessed by two reviewers (rated as RoB high, low or unclear) for nine domains: sequence generation, allocation concealment, blinding of outcome assessors, use of intention-to-treat analysis, comparability of randomised groups at baseline, inter-site differences in findings, the potential for selective outcome reporting and presence of for-profit bias (allegiance). Using individual criterion ratings, each study was given an overall RoB rating of low, moderate or high (see Supplementary Table 1 available at https://doi.org/10.1192/bjp.2018.233).
Measures of treatment effect
Continuous data describing treatment effectiveness were extracted (e.g. pre- and post-severity scores or longitudinal change in severity scores) and presented as a standardised mean difference (Hedges’ g ES). Using a random-effects model, meta-analyses computed a pooled ES with 95% confidence intervals (CIs), P-values and the I² statistic. Statistical heterogeneity was considered important if I² exceeded 60%Reference Higgins and Green9 and explored using subgroups. The following comparisons were planned to assess the primary outcome:
(a) pooled effects of augmentation intervention/comparator categories (i.e. psychological treatment, psychological comparator, pharmacological treatment and pharmacological comparator);
(b) pooled effects of augmenters by class (e.g. selective serotonin reuptake inhibitor, serotonin–noradrenaline reuptake inhibitor, antipsychotic, mood stabiliser);
(c) pooled effects of individual treatment interventions within above categories.
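The pooling step described above can be sketched as follows. The review does not state which between-study variance estimator was used, so this example assumes the widely used DerSimonian–Laird method; the function name and the input values are hypothetical and are not data from the included trials.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Returns (pooled ES, 95% CI as a tuple, I-squared in %, tau-squared).
    The estimator is an assumption; the review does not name one.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and mean, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau-squared to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)

    # I-squared: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, ci, i2, tau2


# Hypothetical effect sizes (Hedges' g) and sampling variances for four arms.
pooled, ci, i2, tau2 = random_effects_pool(
    [1.10, 1.35, 0.90, 1.50], [0.02, 0.03, 0.015, 0.04]
)
print(f"ES = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, I2 = {i2:.1f}%")
```

Under the threshold given above, an I² value above 60% from such a calculation would prompt exploration by subgroup.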
The following secondary outcomes were quantitatively or qualitatively explored: acceptability, tolerability and an exploration of pairwise active-control comparisons to provide an indicated effect of treatment versus comparator trial arm, validating findings against the current gold standard.Reference Cuijpers, Weitz, Cristea and Twisk10
Subgroups used to explore heterogeneity
Planned subgroups used to explore statistical heterogeneity included study quality (RoB) and trial duration, as well as participant treatment-resistance definition, continuation treatments, comorbidities, depression severity, duration of episode and treatment setting.
Changes made since protocol registration
The permitted range of treatment duration was amended from a range of 6–26 weeks to include any duration where expectations of clinical efficacy were reported. This was to account for the variable windows of clinical efficacy between different treatment mechanisms (e.g. ketamine, which has well-documented rapid antidepressant effects). Excluding ketamine, the Maudsley Treatment Inventory recommends durations of 6 weeks for full clinical effect;Reference Fekadu, Donocik and Cleare1 we therefore classified included trials of less than 6 weeks as ‘short-term’ (excluding rapid-onset treatments such as ketamine)Reference Fekadu, Donocik and Cleare1 and those of more than 26 weeks as ‘long-term’ treatment durations.
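The duration subgrouping rule can be expressed as a small classifier. This is a sketch under stated assumptions: the label 'therapeutic' for trials of 6–26 weeks, the `rapid_onset` flag and the function name are illustrative choices that do not come from the protocol, and rapid-onset treatments shorter than 6 weeks are assumed to fall into the 'therapeutic' group rather than the 'short-term' one.

```python
def classify_duration(weeks, rapid_onset=False):
    """Assign a trial arm to a duration subgroup (illustrative sketch).

    weeks:       treatment duration in weeks
    rapid_onset: True for treatments such as ketamine that are exempt
                 from the short-term label (hypothetical flag)
    """
    if weeks <= 0:
        raise ValueError("duration must be positive")
    if weeks > 26:
        return "long-term"
    if weeks < 6 and not rapid_onset:
        return "short-term"
    # Assumed label for everything in between, including rapid-onset arms.
    return "therapeutic"


print(classify_duration(1))                         # e.g. a 1-week trial
print(classify_duration(5 / 7, rapid_onset=True))   # e.g. 5 days of ketamine
print(classify_duration(78))                        # e.g. roughly 18 months
```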
Systematic search results
After duplicates were removed, 2246 manuscripts from the MEDLINE and ISI Web of Science databases (all years to 6 February 2018) and hand searches were screened. Of 297 full texts reviewed, 39 articles describing 28 studies were eligible for inclusion. A PRISMA flow chart presents a breakdown of the search process (Fig. 1).
Characteristics of included studies
Within the 28 included RCTs, 5461 TRD participants were randomised. All analysed interventions were of parallel-group studies, with ten trials (36%) conducted in North America, seven (25%) in Europe, six (21%) in Asia, four (14%) across multiple continents and one (4%) in South America. The mean study size was 199 (s.d. = 270, range 20–1293). The duration of interventions ranged from 5 days (ketamineReference Su, Chen, Li, Lin, Hong and Gueorguieva11) to 18 months (long-term psychoanalytic psychotherapyReference Fonagy, Rost, Carlyle, Mcpherson, Thomas and Fearon12), with a median duration of 6 weeks (interquartile range = 2).
Characteristics of participants
Participants had a median age of 45 years (interquartile range = 4) and 66% were female. All analysed individuals had unremitted depression despite at least two adequate treatment trials in the current episode. A total of 15 studies defined TRD fully retrospectively (using a minimum duration of previous treatments of 4 or 6 weeks), whereas 12 required at least one unsuccessful treatment retrospectively and one prospectively. One study undertook two treatment trials to determine treatment resistance fully prospectively.Reference Yoshimura, Kishi, Hori, Ikenouchi-Sugita, Katsuki and Umene-Nakano13 Most studies did not consider psychological treatments to contribute to TRD definition; only Fonagy et al Reference Fonagy, Rost, Carlyle, Mcpherson, Thomas and Fearon12 required one pharmacological and one psychological treatment failure as a minimum TRD criterion for study entry. Table 1 contains further details.
TRD, treatment-resistant depression; ADM, antidepressant medication; OP, out-patient; NOR, nortriptyline; Mono, monotherapy; TAU, treatment as usual; Poly, polytherapy continuation treatment; IP, in-patient; NR, not reported; PAR, paroxetine; ESC, escitalopram; FLU, fluoxetine; SER, sertraline; VEN, venlafaxine; FLUV, fluvoxamine; MIL, milnacipran; DUL, duloxetine; MBCT, mindfulness-based cognitive therapy; HEP, health education programme; LTPP, long-term psychoanalytic psychotherapy; CBT, cognitive–behavioural therapy; CIT, citalopram; STAR*D, Sequenced Treatment Alternatives to Relieve Depression; BUP, bupropion.
a. Describes eligible TRD subgroup where full sample did not meet review inclusion criteria.
Supplementary Table 1 contains the RoB ratings across criteria and studies. A total of 12 studies were rated as having a low RoB,Reference Fekadu, Wooderson, Markopoulo, Donaldson, Papadopoulos and Cleare2, Reference Su, Chen, Li, Lin, Hong and Gueorguieva11, Reference Bandelow, Reitt, Röver, Michaelis, Görlich and Wedekind4–Reference Thase, Youakim, Skuban, Hobart, Zhang and McQuade23 12 had a moderate RoBReference Baumann, Nil, Souche, Montaldi, Baettig and Lambert24–Reference Shelton, Tollefson, Tohen, Stahl, Gannon and Jacobs35 and 4 had a high RoB.Reference Yoshimura, Kishi, Hori, Ikenouchi-Sugita, Katsuki and Umene-Nakano13, Reference Nierenberg, Fava, Trivedi, Wisniewski, Thase and McGrath36–Reference Schindler and Anghelescu38 The most common individual criteria rated as high RoB were being funded and/or conducted by an industrial sponsor (12 trials) and not applying or reporting an intention-to-treat analysis (seven trials). Blinding was not always maintained but was often maximised where possible, i.e. in the ketamine trial (reportedly double-blindReference Su, Chen, Li, Lin, Hong and Gueorguieva11), psychological trials (two out of three trials report blinding of outcome assessorsReference Fonagy, Rost, Carlyle, Mcpherson, Thomas and Fearon12, Reference Eisendrath, Gillung, Delucchi, Segal, Nelson and McInnes18) and open-label studies (all but oneReference Schindler and Anghelescu38 reporting blinded outcome raters).
Effectiveness of augmentation treatment
There was clinical diversity across studies in design (see Table 1) and in the interventions and outcomes reported (see Supplementary Table 2).
Pre-post meta-analyses indicated improvements in depression with all interventions examined (P < 0.001). From 23 studies including 3246 participants, pharmacological treatments yielded an overall ES of 1.15 (95% CI 1.01–1.29, I² = 82.7). Psychological therapies as a category comprised 3 studies totalling 276 participants, showing similar effects (ES = 1.43, 95% CI 0.50–2.36, I² = 95.3). For the majority of initial analyses conducted, severe heterogeneity limited the interpretability of comparisons (see Supplementary Table 3). The three studies with a high RoB contributed substantially to this heterogeneity, demonstrating either lowReference Nierenberg, Fava, Trivedi, Wisniewski, Thase and McGrath36 or highReference Yoshimura, Kishi, Hori, Ikenouchi-Sugita, Katsuki and Umene-Nakano13, Reference Schindler and Anghelescu38 outlier ESs, and the subgroup of active treatments trialled for a short-term duration (lithium,Reference Baumann, Nil, Souche, Montaldi, Baettig and Lambert24 metyraponeReference McAllister-Williams, Anderson, Finkelmeyer, Gallagher, Grunze and Haddad14) showed an ES of 0.61 (95% CI 0.37–0.85, I² = 0); removal of both groups of studies from meta-analyses notably reduced heterogeneity. In contrast, long-term treatment trials of lithiumReference Girlanda, Cipriani, Agrimi, Appino, Barichello and Beneduce15 and psychoanalytic psychotherapyReference Fonagy, Rost, Carlyle, Mcpherson, Thomas and Fearon12 were homogeneous (ES = 0.67, 95% CI 0.44–0.90, I² = 4.6) and did not affect heterogeneity of main analyses so were not excluded. Effects of all placebo trials (pill ES = 0.78, psychological ES = 0.94) exhibited findings similar to the sub-therapeutic-duration pharmacological studies (ES = 0.61) and were consistently lower than active treatments; see Fig. 2 and Table 2. All active treatment effects are displayed in Supplementary Fig. 1 and control arms in Supplementary Fig. 2.
Results of meta-analyses assessing treatment effectiveness at a category, class and individual intervention level for studies with data available for meta-analyses without a high risk of bias accounting for therapeutic duration of interventions trialled (for results including high risk of bias and short-term durations, see Supplementary Table 3). Bold text indicates pooled effects of each treatment category. k, number of studies; n, number of participants; ES, effect size (Hedges’ g); n/a, not applicable; NMDA, N-methyl-d-aspartate; SARI, serotonin antagonist and reuptake inhibitor; MBCT, mindfulness-based cognitive therapy; LTPP, long-term psychoanalytic psychotherapy; CBT, cognitive–behavioural therapy; TAU, treatment as usual.
a. Pill placebo was as effective (ES = 0.82, 95% CI 0.69–0.95) and as heterogeneous (I² = 68.3) without the two short-term studies whose active treatments had been excluded from their respective analyses.
Pharmacological treatment classes
Pharmacological interventions without high RoB and trialled for a therapeutic duration had an ES of 1.19 (95% CI 1.08–1.30; I² = 64.6).
N-methyl-d-aspartate (NMDA)-targeting drugs showed the most consistent and large ES of the pharmacological classes (ES = 1.48, 95% CI 1.25–1.71, I² = 0), despite the individual agents having different mechanisms of action.
Mood stabilisers demonstrated an overall ES of 1.12 (95% CI 0.92–1.31, I² = 23.6), exhibiting low heterogeneity only. Lithium was the most frequently investigated mood stabiliser and had a slightly smaller ES than the overall class without heterogeneity (three studies; ES = 1.00, 95% CI 0.81–1.20, I² = 0).
Antipsychotics also had an ES of 1.12 (95% CI 0.98–1.26, I² = 75.0) and exhibited heterogeneity, likely due to differences between treatments within this class. Aripiprazole was the most frequently assessed antipsychotic and provided a consistent effect across four studies (ES = 1.33, 95% CI 1.23–1.44, I² = 0).
Medications not falling into the above mechanisms were grouped together (trazodone, buspirone, thyroid hormone and dexmecamylamine) and showed an ES of 1.36 (95% CI 1.09–1.63, I² = 46.4), comparable in terms of heterogeneity and ES to the other pharmacological treatments.
Psychological treatment classes
The overall ES of psychological therapies (three studies; ES = 1.43, 95% CI 0.50–2.36) contained substantial heterogeneity (I² = 95.3), likely due to different therapeutic modalities that we were not able to subgroup further due to lack of studies. Within this analysis, cognitive–behavioural therapy (CBT) had the highest ES of all individual treatments (one study;Reference Hauksson, Ingibergsdottir, Gunnarsdottir and Jonsdottir25 ES = 1.74), whereas psychoanalytic psychotherapy had the smallest (one study;Reference Fonagy, Rost, Carlyle, Mcpherson, Thomas and Fearon12 ES = 0.59).
Publication bias was not apparent (detail available from the author on request).
Active versus control (pairwise) meta-analyses
Only three treatment classes were examinable due to data availability. Despite heterogeneity of therapies and studies, the results suggested that psychological treatments were more beneficial than usual care or an active control (three studies; ES = 0.45, 95% CI 0.09–0.81, I² = 63.8). Antipsychotics showed effectiveness when compared to placebo (seven studies; ES = 0.38, 95% CI 0.18–0.58, I² = 59.4). The number of mood-stabiliser studies was lower in the pairwise comparison than in pre-post analyses due to a paucity of placebo-controlled trials, and they were not significantly more effective than placebo (four studies; ES = 0.13, 95% CI −0.14 to 0.39, P = 0.34, I² = 0).
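The relationship between a pooled ES, its 95% CI and the P-value can be checked with a short calculation. The sketch below assumes a symmetric, normal-approximation interval (which the review does not explicitly state) and uses a hypothetical helper name. Applied to the mood-stabiliser result above, it gives a two-sided P-value close to the reported P = 0.34.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z and a two-sided P-value from a 95% CI (sketch).

    Assumes the interval is estimate +/- z_crit * SE under a normal
    approximation; rounding in the published figures limits precision.
    """
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Mood stabilisers versus placebo, values as reported above.
se, z, p = p_from_ci(0.13, -0.14, 0.39)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.2f}")
```

The same check applied to the antipsychotic comparison (ES = 0.38, 95% CI 0.18–0.58) gives a P-value well below 0.05, in line with an interval that excludes zero.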
Tolerability and acceptability
Tolerability and acceptability were defined differently between studies and were not sufficiently homogeneous to consider quantitatively in meta-analyses.
Eight studies reported the total number of adverse events occurring in each arm; they were higher in active versus placebo arms for most interventions but equal between active and placebo arms in the d-cycloserineReference Heresco-Levy, Gelfin, Bloch, Levin, Edelman and Javitt16 and minocyclineReference Husain M, Chaudhry, Husain, Khoso, Rahman and Hamirani17 trials. This rate might be heavily influenced by a large number of adverse events occurring in a minority of patients, and of seven studies reporting the percentage of participants experiencing at least one adverse event, most were similar between treatment arms.
The highest dropout rate was in the ziprasidone intervention (41% in the lower-dose arm).Reference Dunner, Amsterdam, Shelton, Loebel and Romano26 There was a discrepancy of more than 10% in participant dropout between arms in this study, as well as in the studies by Heresco-Levy et al Reference Heresco-Levy, Gelfin, Bloch, Levin, Edelman and Javitt16 (d-cycloserine 23% versus placebo 11%) and Husain et al Reference Husain M, Chaudhry, Husain, Khoso, Rahman and Hamirani17 (minocycline 24% versus 10% placebo). No dropouts were reported in the CBT trial arms of TAU and individual CBT (two participants withdrew from group CBT)Reference Hauksson, Ingibergsdottir, Gunnarsdottir and Jonsdottir25 or from the 1-week lithium placebo-controlled study (either trial arm).Reference Baumann, Nil, Souche, Montaldi, Baettig and Lambert24
Meta-analytic estimates of treatment effects for resistant and non-resistant depression
In contrast to TRD, progress is being made regarding the comparative effectiveness of common treatments for MDD, as exemplified by a recent, extensive network meta-analysis: Cipriani et al Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson and Ogawa39 identified over 500 double-blind randomised trials of antidepressant monotherapy for MDD (in contrast to the 28 that we found for TRD augmentation) and found all to be significantly more effective than placebo. Another meta-analysis of pharmacological augmentation treatments for depression non-responsive to one or more antidepressants reported comparable ESs.Reference Zhou, Ravindran, Qin, Del Giovane, Li and Bauer3 We anticipate smaller ESs within TRD populations. The greatest pre-post effect of augmentation that we report is for medications targeting the NMDA receptor, i.e. ketamine (antagonist), d-cycloserine (partial agonist) and minocycline (antagonist). This finding supports increasing attention towards drugs acting on this pathway, as illustrated by a network meta-analysis of pharmacological and somatic treatments for non-responsive depression reporting ketamine to have the strongest short-term efficacy of treatments studied.Reference Papadimitropoulou, Vossen, Karabis, Donatti and Kubitz40 It is notable, however, that this finding was based on three studies only; population or design differences between studies may have yielded stronger effects in these trials than if directly compared with other interventions. Ketamine produced the highest ES of the NMDA medications but it is particularly challenging to maintain interviewer blinding with this treatment, although Su et al Reference Su, Chen, Li, Lin, Hong and Gueorguieva11 reported the trial as double blind.
Based on a larger number of studies, our findings also indicate that for individuals with a history of two unsuccessful treatments in the current episode, aripiprazole is effective, but it is important to note that all trials investigating aripiprazole had a potential allegiance effect. The evidence is less certain (often assessed in open-label designsReference Girlanda, Cipriani, Agrimi, Appino, Barichello and Beneduce15, Reference Bauer, Dell'Osso, Kasper, Pitchot, Vansvik and Koehler28, Reference Schindler and Anghelescu38) but promising for lithium. The World Federation of Societies of Biological Psychiatry Task Force recommends lithium as the first-line augmentation option for TRD, and quetiapine or aripiprazole as alternatives;Reference Bauer, Pfennig, Severus, Whybrow, Angst and Möller41 however, we identified only one randomised quetiapine trial in the current review (found to be non-inferior to lithium). As such, it is clear that much more work in this field is required.
Effects of interventions versus placebo in randomised studies for TRD
Even a pill placebo response is variable under some methodological conditions, suggesting that there is some small scope for improvement for individuals with TRD without augmenting with a new active treatment. The ES and confidence intervals for placebo were heterogeneous across studies (as displayed in Supplementary Fig. 2), demonstrating that indeed there are limitations to inferring the relative effects of interventions across diverse investigations. Placebo and active-treatment outcomes will have been influenced by a multitude of factors which differed across trials (including but not limited to the maintenance of blinding, analyses undertaken, inclusion criteria relating to comorbidities, severity, etc.). Notwithstanding, it does appear that both psychological and pharmacological treatments are more effective than either pill or psychological controls alone, even for already resistant people. Specifically, the treatment classes whose pre-post confidence interval did not overlap with the pill-placebo estimates were mood stabilisers, antipsychotics, NMDA drugs and medications with ‘other’ mechanisms. This was not the case for psychological treatments which contained a wide confidence interval or for short-term treatment durations.
Effectiveness of psychological versus pharmacological intervention
For MDD, psychological therapies demonstrate overall comparable ESs to pharmacological interventions, according to a meta-analysis of direct comparisons.Reference Amick, Gartlehner, Gaynes, Forneris, Asher and Morgan42 The most recent review investigating psychological treatments for TRD identified only two randomised studies, both underpowered and defining TRD loosely; one had found comparable benefits of CBT and antidepressants, whereas the other reported clinical benefits of CBT but not antidepressants.Reference McPherson, Cairns, Carlyle, Shapiro, Richardson and Taylor43 The importance of building the psychological evidence base is clear and we predict that over the next decade growing efforts in this field will reduce the current uncertainty of their effectiveness for this population.Reference Holmes, Ghaderi, Harmer, Ramchandani, Cuijpers and Morrison44
Many psychological trials were excluded from the current review as they focused on chronicity or recurrence of depression rather than on the number of failed treatments. This limitation reflects the lack of integration between psychological and pharmacological fields and the difficulty in operationalising a measure of treatment response, particularly for past psychological therapies (including treatment adequacy, adherence, dose, duration, intensity and other factors likely to influence outcome). The CoBalT RCT has been seminal in the field, finding CBT as an adjunct to usual care to be clinically effective (odds ratio of 3.26),Reference Wiles, Thomas, Abel, Ridgway, Turner and Campbell45 but was not eligible for inclusion in the current review because it required only non-response to 6 weeks of one ongoing antidepressant. It is important to note that for most people with TRD, a combination of pharmacological and psychological approaches may be the most effective treatment both in terms of acute response and relapse prevention,Reference Cleare, Pariante, Young, Anderson, Christmas and Cowen46 although only pharmacological continuation treatments were focused on in the original studies included in this review.
Limitations and strengths
This work highlights the weakness of the evidence base for augmentation treatments for TRD. Inconsistency of TRD definition excluded a large number of studies, and mediating and moderating factors (such as TRD or baseline severity, continuation treatments and case mix of included patients) limited the ability to control confounders. However, 23 studies comprising a total of 5034 participants exhibited consistency of findings. Limited comparable data were available on the tolerability and acceptability alongside effectiveness and we were not able to consider the influence of patient/investigator blinding, intention-to-treat analyses or allegiance effects in meta-analyses. These factors may have influenced ESs, although they have not notably affected similar results in other reviews.Reference Bandelow, Reitt, Röver, Michaelis, Görlich and Wedekind4, Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson and Ogawa39 Due to the limited number of psychological studies included, uncertainty remains over the benefit of CBT, psychoanalytic therapy and mindfulness-based cognitive therapy in this population.
Meta-analytic comparisons between treatment types have been deemed unsuitable (unless compared directly in original studies), but pre-post meta-analysis provides indications of effectiveness that can be compared between modalities. The pre-post analysis approach may show larger ESs due to spontaneous or natural remission or patient expectations of effectiveness,Reference Cuijpers, Weitz, Cristea and Twisk10 but the likelihood of this is attenuated in TRD populations who have experienced non-effective treatments and have a lower natural recovery rate than MDD as a whole. Pre-post ESs therefore also reflect effects as seen in real-world clinical practice. Pre-post analysis has the advantage of permitting comparisons between different treatment types and controls, which is not suitable for traditional meta-analysis (e.g. drug placebo pills have a larger effect than a waiting-list control,Reference Bandelow, Reitt, Röver, Michaelis, Görlich and Wedekind4 although no waiting-list controls were examined in the present studies). In spite of these advantages it must be highlighted that indirectly comparing ESs between treatments in this way does not account for between-study variability (including but not limited to sociodemographic and clinical differences between recruited participants, the adequacy and delivery of treatment, and other procedural and analytic distinctions).
There has been continued controversy surrounding the comparison of psychological and medication-based treatment for depression. We have not found strong evidence that either method is more effective in TRD specifically, although we highlight an urgent need for more intensive investigation of psychological therapy programmes. This study also illustrates that a short duration of treatment affects outcomes more than differences between treatment modalities. However, our results indicate that both psychological and pharmacological treatments are more effective than either pill or psychological control, even for already resistant people. Far from these individuals being ‘lost causes’, our findings demonstrate that more therapeutic work is needed to achieve an optimal response for this subpopulation. Specifically, clinicians should not rule out CBT if it is being delivered with sufficient intensity and by skilled therapists.Reference Hauksson, Ingibergsdottir, Gunnarsdottir and Jonsdottir25 Our findings also confirm previous work indicating that aripiprazole and – to a lesser extent – lithium are effective treatments, supporting their current recommendation as first-line therapies.Reference Maes, Vandoolaeghe and Desnyder31 Although the measured ESs with these two pharmacotherapies are similar to other options, the fact that they have been more thoroughly investigated in a larger number of studies underlines their status as first-choice options. Even if some medication-based treatments are eventually shown to have greater efficacy overall in TRD, which remains unconfirmed, treatment decisions should remain a matter of clinical judgement. Clinicians need to balance difficulties with tolerability of medications in addition to the durability of effects and, vitally, patient preference when deciding on the most appropriate treatments to use.
Supplementary material is available online at https://doi.org/10.1192/bjp.2018.233.
This study represents independent research part funded by the National Institute for Health Research (NIHR) Biomedical Research Centre (BRC) at South London and Maudsley National Health Service Foundation Trust (SLaM) and King's College London. The NIHR BRC had no involvement in study design, data collection, analysis or the decision to submit for publication. The views expressed are those of the authors and not necessarily those of the National Health Service, the NIHR or the Department of Health.