Evidence-based policy and practice is firmly established in mental healthcare, bringing with it a need for evidence to be critically appraised. The conventional biomedical view sees randomised controlled trials (RCTs) as the gold standard against which all other forms of evidence should be judged. The suitability of RCTs for evaluating complex mental health interventions has long been questioned. An alternative approach to evidence, realist evaluation, has been developed and advocated across other policy areas. Realism challenges the foundations on which scientific knowledge is created and represents a paradigm shift. We outline the advantages of realist evaluation for mental health research and explore what this might mean for the current hierarchy of evidence on which policy and commissioning decisions are based.
Randomised controlled trials
RCTs have played a vital role in the development of evidence-based mental healthcare. When applied in their ideal, or close to ideal, form, their ability to control for confounding, reduce bias and elucidate the direction of causation has enabled robust evaluation (and quantification) of the effectiveness of interventions. Estimates of ‘overall’ net effect sizes derived from trial results have allowed policy makers and commissioners to estimate the health economic consequences of delivering interventions to target populations.
However, work across a range of disciplines has suggested that the strengths of RCTs may not be sufficient to warrant their inherent place as the gold standard [1]. Many of the criteria for a well-conducted RCT are often not achieved in practice because of attrition, lack of blinding and other biases post-randomisation. Moreover, RCTs are intrinsically better suited to some areas and research questions than others. Mental healthcare, other than that limited to pharmaceutical interventions, is one area where RCTs fit less well [2]. Psychological treatments rely on human agency (and specifically interpersonal interactions) and are therefore harder to manualise than drug treatments [3]. Unlike pharmaceutical interventions, psychological and behavioural interventions typically change during treatment. The skill of therapy, after all, lies in the ability to understand and meet patients' individual needs and to respond to therapeutic progress or stasis. Complex mental health interventions, based on multiple interactive components and centred on interpersonal and informational activities, are problematic within an RCT paradigm.
Recent moves towards personalised medicine represent a further challenge to the predominant position of RCTs. Average treatment effects, even if estimated on the basis of well-conducted trials, apply to groups rather than individuals, and hence do not apply equally to everyone. This is especially so when treatments are multifaceted and target individual human agency and interactions. This also applies to social and other contexts: what works in one setting may not work in another. The social, economic and cultural contexts in which patients and complex interventions are embedded will, therefore, have an effect at many levels, and although data arising from RCTs may be used to test for subgroup effects (i.e. interactions between allocation group and factors such as patient characteristics or setting), these tests are limited in number and often lack statistical power.
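The point about average effects can be made concrete with a small simulation. The sketch below is purely illustrative and is not drawn from any trial: the two settings ("A" and "B"), the true effect sizes (1.0 and 0.0), the sample size per cell and the function names are all assumptions chosen for clarity. It shows how an overall mean difference can sit midway between two quite different context-specific effects.

```python
import random
from statistics import mean


def simulate_trial(n_per_cell=2000, effect_by_context=None, seed=0):
    """Simulate a balanced two-arm comparison whose true effect depends on context.

    All numbers are hypothetical: outcomes are standard-normal noise, and the
    intervention adds a context-specific shift.
    """
    if effect_by_context is None:
        # Assumed true effects: the intervention only 'fires' in setting A.
        effect_by_context = {"A": 1.0, "B": 0.0}
    rng = random.Random(seed)
    rows = []
    for context, effect in effect_by_context.items():
        for arm in ("control", "intervention"):
            for _ in range(n_per_cell):
                outcome = rng.gauss(0.0, 1.0)
                if arm == "intervention":
                    outcome += effect
                rows.append((context, arm, outcome))
    return rows


def mean_difference(rows, context=None):
    """Intervention mean minus control mean, optionally within one context."""
    selected = [r for r in rows if context is None or r[0] == context]
    treated = [y for _, arm, y in selected if arm == "intervention"]
    control = [y for _, arm, y in selected if arm == "control"]
    return mean(treated) - mean(control)


rows = simulate_trial()
overall = mean_difference(rows)
by_context = {c: mean_difference(rows, c) for c in ("A", "B")}

print(f"overall mean difference: {overall:.2f}")
for context, diff in by_context.items():
    print(f"context {context}: {diff:.2f}")
```

With these assumed values the overall estimate falls roughly halfway between the two context-specific estimates, so it describes neither setting well. Reducing `n_per_cell` makes the per-context estimates much noisier, which is one way to see why subgroup tests within a single trial often lack statistical power.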
There is a need for alternative approaches to the evaluation of complex mental health interventions. Crucially, approaches are needed that recognise, examine and evaluate variation and sources of variation in patient outcomes, with particular reference to the role of context.
Realist evaluation is one such approach. Arising from the work of scientific realists, most particularly Pawson and Tilley, the central formulation in realist evaluation concerns context, mechanism and outcome [4]. Causation that generates outcomes is established through identifying mechanisms that may be activated or lie dormant, depending on context. Pawson and Tilley emphasise that interventions, in and of themselves, do not ‘work’ but rest instead on the choices, intentions and behaviours of social actors as they engage with the mechanisms that interventions contain. Thus, outcomes of interventions derive from the operation of specific mechanisms in particular contexts; using an often-quoted analogy from the physical world, sparks lead to gunpowder exploding but only under specific conditions (the presence of oxygen, a dry atmosphere, etc.). The mechanism is based on the chemical composition of gunpowder whereas the conditions derive from the context. The context-dependent operation of mechanisms leads, in turn, to the development of so-called ‘CMO configurations’ – specific instances of contexts and mechanisms that are hypothesised to produce particular patterns of outcomes.
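One way to picture a CMO configuration is as a simple record with three fields. The sketch below is only an illustration of the idea using the gunpowder analogy; the class name `CMOConfiguration`, its fields and the `as_hypothesis` helper are hypothetical and do not come from Pawson and Tilley or from any realist evaluation software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMOConfiguration:
    """A hypothesised link between a context, a mechanism and an outcome."""

    context: str
    mechanism: str
    mechanism_active: bool  # whether the context activates the mechanism
    outcome: str

    def as_hypothesis(self) -> str:
        """Render the configuration as a plain-language, testable statement."""
        state = "is activated" if self.mechanism_active else "lies dormant"
        return (
            f"In {self.context}, {self.mechanism} {state}, "
            f"and the expected outcome is {self.outcome}."
        )


# The same mechanism paired with two contexts, following the gunpowder analogy.
configurations = [
    CMOConfiguration(
        context="a dry atmosphere with oxygen present",
        mechanism="the chemical reactivity of gunpowder to a spark",
        mechanism_active=True,
        outcome="an explosion",
    ),
    CMOConfiguration(
        context="a damp atmosphere",
        mechanism="the chemical reactivity of gunpowder to a spark",
        mechanism_active=False,
        outcome="no explosion",
    ),
]

for configuration in configurations:
    print(configuration.as_hypothesis())
```

Holding the mechanism constant while varying the context is a deliberate choice here: it mirrors the claim that the same intervention can produce different outcomes depending on where, and with whom, it is delivered.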
Realist evaluation and networks of evidence
According to realist theory, evaluation cannot simply be concerned with estimating net effect sizes, i.e. in finding out whether a complex intervention works overall or ‘on average’. Instead, researchers should seek to establish what works, for whom, under what circumstances and why. Crucially, it is the careful thinking about and exploration of mechanisms – which pragmatic RCTs eschew – that allows researchers to uncover why interventions work and not simply what interventions work overall for populations of trial participants.
Given this formulation, realist evaluation is concerned with building ‘programme theories’ that aim to capture the mechanisms underlying complex interventions in the contexts in which they work. These theories are developed and refined through an iterative process centred on the findings of empirical research (networks of evidence), which may include but does not prioritise the results of RCTs. In this way, realist evaluation does not favour any particular research design or methods of analysis. Certainly, intensive (qualitative) methods have a role to play, especially those that move beyond simply recording the perceptions and views of those involved in interventions. Equally, quantitative observational research can capture variation in outcomes that may be overlooked in trials, which are concerned with evaluating overall effectiveness and are typically based on far smaller and more homogeneous (and hence less representative) samples than studies using routine clinical data-sets.
A more open and diverse approach to evidence does not, however, constitute an ‘anything goes’ attitude or a relinquishing of scientific values. Rather, as Pawson puts it, evaluating interventions in complex social worlds becomes the scientific task of ‘scavenging’ amongst a range of evidence developed through a plurality of methods and subjecting each part of it to ‘organised scepticism’ [5]. Methodological guidelines, training resources and quality and reporting standards have been produced for this task by the Realist And Meta-narrative Evidence Syntheses: Evolving Standards (RAMESES) project, funded by the National Institute for Health Research [6].
Can realist evaluation and RCTs ever coexist?
It has been suggested that adopting a realist perspective could lead to significant improvements in the way RCTs are conducted. Triallists have also made their own advances, with the basic parallel-group design evolving into more advanced designs, such as stepped-wedge and patient-preference designs, which are better suited to evaluating complex interventions.
Realism may bring additional gains to the conduct of RCTs. The emphasis on underlying causal mechanisms highlights the importance of elucidating theories around ‘mechanisms of action’ before the actual running of trials. Paradoxically, although this type of work-up is often conducted in relation to trials of specific clinical interventions (e.g. pharmaceutical and psychotherapeutic interventions), it is often missing from the development of more complex psychosocial interventions centred on mental health service delivery and patient behaviour.
Realist evaluation might also be seen as a means of improving parallel process evaluations of RCTs of complex interventions. For example, the work of Byng and colleagues on the Mental Health Link in London provided information that enabled the results of the accompanying RCT to be interpreted, as well as providing key insights into the core functions of the intervention [7].
Realist evaluation principles have also been used to shape the design and implementation of RCTs themselves, leading to the creation of ‘realist RCTs’ [8]. Although similar to realist-based qualitative process evaluation, realist RCTs also consist of quantitative mediation and moderation analyses that aim to capture variations in intervention outcomes between contexts and subgroups. Such analyses have shown how a Welsh intervention aimed at increasing physical activity worked for patients at risk of coronary heart disease but not for patients referred for mental health reasons [9].
This approach has, however, been roundly dismissed as antithetical to realism [10]. According to realists, the fundamental characteristics and ambitions of RCTs mean that they are wholly unsuitable to (and in principle, designed to obviate rather than embrace) variability in the interactions between interventions, agents and contexts, seeking where possible to minimise the effects of all but the first of these, and thereafter to reduce the other two to variables to be measured. On the other hand, realists, despite framing programme theories as testable hypotheses, might be accused of avoiding refutation by insisting on the primacy of (further) theorising.
Although scientific psychiatry has always been proudly pragmatic and eclectic, the question we now face is whether these two epistemologies (positivism and realism) can ever be reconciled and harnessed in a common evaluative approach. With some justification, proponents of realism object to this approach being subverted by the positivism and emphasis on outcomes simplified to average effects inherent in RCTs. Realist evaluation, by contrast, is about understanding mechanisms of change based in infinitely complex (and changing) interactions between agents and contexts, and for many, never the twain shall meet.
RCTs have significant limitations when it comes to evaluating complex and context-dependent interventions of the type that are common in mental healthcare. By focusing on causal mechanisms and adopting a more pragmatic attitude to evidence, approaches based on realism offer potential for creating forms of evidence that are beyond the scope of positivist clinical trials. Realist RCTs are not, it seems, for everyone, and whether this approach is worth pursuing (and worth the investment of public research funds) will ultimately come down to a question of utility (and value of information) in the generation of evidence to inform policy and healthcare commissioning decisions.
1. Deaton, A, Cartwright, N. Understanding and misunderstanding randomized controlled trials. Soc Sci Med 2017, in press.
2. Priebe, S, Slade, M. Evidence in Mental Health Care. Brunner-Routledge, 2002.
3. Carey, TA, Stiles, WB. Some problems with randomized controlled trials and some viable alternatives. Clin Psychol Psychother 2016; 23(1): 87–95.
4. Pawson, R, Tilley, N. Realistic Evaluation. Sage, 1997.
5. Pawson, R. The Science of Evaluation. Sage, 2013.
6. Greenhalgh, T, Wong, G, Westhorp, G, Pawson, R. Protocol-realist and meta-narrative evidence synthesis: evolving standards (RAMESES). BMC Med Res Methodol 2011; 11(1): 115.
7. Byng, R, Norman, I, Redfern, S, Jones, R. Exposing the key functions of a complex intervention for shared care in mental health: case study of a process evaluation. BMC Health Serv Res 2008; 8(1): 274.
8. Bonell, C, Fletcher, A, Morton, M, Lorenc, T, Moore, L. Realist randomised controlled trials: a new approach to evaluating complex public health interventions. Soc Sci Med 2012; 75(12): 2299–306.
9. Murphy, SM, Edwards, RT, Williams, N, Raisanen, L, Moore, G, Linck, P, et al. An evaluation of the effectiveness and cost effectiveness of the National Exercise Referral Scheme in Wales, UK: a randomised controlled trial of a public health policy initiative. J Epidemiol Community Health 2012; 66(8): 745–53.
10. Van Belle, S, Wong, G, Westhorp, G, Pearson, M, Emmel, N, Manzano, A, et al. Can ‘realist’ randomised controlled trials be genuinely realist? Trials 2016; 17(1): 313.