Systematic literature reviews are useful tools for researchers, policy makers and health care providers. They help to integrate information and provide insight concerning the consistency of scientific findings Reference Mulrow. This synthesis of information is helpful for clinical decision-making, economic evaluations and future research directions Reference Moher, Cook, Eastwood, Olkin, Rennie and Stroup. It is therefore important to use a very strict methodology in systematic reviews. If, however, the studies included in such reviews are flawed, the value of the reviews is open to question Reference Jüni, Altman and Egger. To avoid this pitfall, systematic reviews of randomized controlled trials (RCTs) should assess the quality of the studies included in terms of four main types of bias (selection, performance, detection and attrition) and should also evaluate possible publication bias, taking all the included studies into account Reference Clarke and Oxman.
Publication bias is determined by the nature and direction of the study results; in other words, studies lacking positive results are less likely to be published Reference Clarke and Oxman. Scientists tend to publish their positive results more often than their negative findings Reference Dickersin and Rennie. Furthermore, since studies with non-significant results are difficult to publish in English-language journals, such studies are more likely to be published in journals in other languages, a phenomenon known as English-language bias [6,8].
Consequently, if this type of bias is not addressed when undertaking a systematic review, the final results may overestimate the true effect of an intervention. For some authors, this is particularly the case if the review includes only published studies Reference Clarke and Oxman: the fact that negative studies have not been identified among the published studies does not necessarily mean that such studies do not exist.
To control or minimize this type of bias, the authors of systematic reviews should undertake comprehensive searches of medical databases such as Medline, Embase or CENTRAL (Cochrane Library). The search strategies aim to balance sensitivity and precision. To expand on the findings from medical databases (electronic searches), these authors often undertake searches of what is called the ‘grey literature’. Grey literature is literature that has not been formally published; it may include any material ranging from abstract reports to unpublished data Reference Hopewell, McDonald, Clarke and Egger, such as personal communications or reports and advertisements from pharmaceutical or technological companies. The registers of ongoing trials available on the Internet are also searched and, finally, authors of identified trials may be contacted when possible for additional information about other relevant studies.
In spite of the importance of grey literature as seen above, this paper attempts to show, in the context of psychiatry, the consequences (and risk) of generalizing the role of this literature in the control of publication bias, as has been assumed by a recent systematic review Reference Hopewell, McDonald, Clarke and Egger. We also attempt to show the need to differentiate between negative and positive data in the grey literature and how this literature may, because of its quality, introduce bias rather than prevent it.
To demonstrate how confusion between grey literature and publication bias may affect the results of meta-analyses in schizophrenia, we repeated the analyses for the same outcome from three Cochrane Library systematic reviews that included both published and grey literature. We applied the same statistical methods as were used in these reviews, but included only formally-published data (according to the review references). These three systematic reviews assessed atypical neuroleptic drugs for schizophrenia [5,10,18]. Our purpose was to present a specific, homogeneous comparison rather than to evaluate quality or analyze conclusions.
2. The risk of generalizing the message that grey literature minimizes publication bias
2.1. Outcome measure
In studies with schizophrenic patients, an important issue is the high number of patient dropouts, either as a result of the pathology itself or because of complications from side effects. This pathology modifies the perception of reality in such a way that even the management of these patients during the study may be perceived as a threat (e.g. in paranoid schizophrenia). It seems that information from the outcome “leaving the study early” may indicate whether schizophrenic patients improve (in their perception of the disease and of the need to take medication). This outcome may also be considered a good indicator of external validity. We interpreted the outcome “leaving the study early for any reason” as meaning that the higher the number of dropouts from the study, the worse the results.
In one systematic review that addressed the effectiveness of olanzapine versus typical antipsychotic drugs Reference Duggan, Fenton, Dardennes, El-Dosoky and Indran, there was a statistically significant effect in favor of olanzapine over typical antipsychotics for the outcome “leaving the study early” within 3–12 months: 0.69 (0.51–0.94), P = 0.02; N = 239 subjects. In this comparison, the meta-analysis included five studies from the grey literature and only one formally-published study Reference Jakovljevic and Dossenbach. When data from the grey literature (including data for one study provided by the pharmaceutical company) were excluded, the result was no longer statistically significant: 0.69 (0.35–1.37), P = 0.3, for a sample of 60 subjects from a single study (Fig. 1a). In this example, the inclusion of grey literature did not decrease the effect size found; moreover, the analysis including this literature provides even less conservative results. Rather than resolving the question of publication bias, therefore, the inclusion of grey literature may have led to an overestimate in this comparison, owing to the probable lower quality of this type of literature.
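As a rough check on the figures above, a P-value can be approximated from a point estimate and its confidence interval. The sketch below is illustrative only: it assumes that the reported effect is a ratio measure (such as a relative risk) and that the interval is a 95% confidence interval symmetric on the log scale, neither of which is stated explicitly in the text. The function name `p_from_ratio_ci` is hypothetical.

```python
import math

def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided P-value for a ratio estimate from its CI.

    Assumption: the CI is a 95% interval, symmetric on the log scale
    (normal approximation). This is not confirmed by the source reviews.
    """
    # Standard error of log(estimate), recovered from the CI width
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided P under the standard normal: 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2))

# Figures quoted in the olanzapine example
all_data = p_from_ratio_ci(0.69, 0.51, 0.94)        # published + grey
published_only = p_from_ratio_ci(0.69, 0.35, 1.37)  # one published study

print(f"all data:       P ~ {all_data:.2f}")
print(f"published only: P ~ {published_only:.2f}")
```

Under these assumptions the approximation lands close to the reported values (P = 0.02 with all data, P = 0.3 with published data only). The point estimate is identical in both analyses, so the change in significance comes entirely from the width of the interval.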
Our second example is a systematic review that addressed the efficacy of quetiapine vs. classical antipsychotics Reference Srisurapanont, Disayavanish and Taimkaew. For the outcome “leaving the study early” for any reason in the short term, the meta-analysis was statistically significant in favor of quetiapine when data from all of the studies were included: 0.87 (0.76–0.99), P = 0.04. However, when the data from the grey literature were excluded, the analysis yielded a non-significant result, 0.87 (0.73–1.02), P = 0.09, even though the sample size was as large as 959 subjects (Fig. 1b). As in the previous review, when data from the grey literature were included, ostensibly to avoid publication bias, the effect size of the atypical antipsychotic drug did not decrease or balance out. Instead, their inclusion increased the statistical power of the analysis, accentuating the tendency toward positive results shown in formally-published studies. The difference between the previous example and this one is that the latter included a sensitivity analysis, which illustrated the difference between all of the data and only the published data. Cook et al. Reference Cook, Guyatt and Ryan recommended this approach to ameliorate the problems arising from including studies from both published and grey literature.
Our third example is a review that compared the effectiveness of risperidone versus several typical antipsychotic drugs Reference Hunter, Joy, Kennedy, Gilbody and Song. Analysis of data from 23 studies found that the effect size for the outcome “total leaving the study early (any reason)” was 0.69 (0.59–0.80), P < 0.00001, in favor of risperidone. When the grey literature was excluded, the effect size was 0.67 (0.56–0.82), P < 0.0001, for 15 studies (Fig. 1c). Although both comparisons were statistically significant in this meta-analysis, the P-value (by confidence interval) was less conservative when data from the grey literature were included.
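The pattern across the three examples can be summarized by comparing, for each review, the width of the confidence interval on the log scale and whether the interval excludes 1, with and without the grey literature. This is a minimal sketch using only the figures quoted above. It assumes that the effects are ratio measures for which 1 means no difference; the dictionary layout and the helper names `log_ci_width` and `excludes_one` are hypothetical.

```python
import math

# (estimate, lower, upper) as quoted in the text; ratio measure assumed
examples = {
    "olanzapine":  {"all": (0.69, 0.51, 0.94), "published": (0.69, 0.35, 1.37)},
    "quetiapine":  {"all": (0.87, 0.76, 0.99), "published": (0.87, 0.73, 1.02)},
    "risperidone": {"all": (0.69, 0.59, 0.80), "published": (0.67, 0.56, 0.82)},
}

def log_ci_width(ci):
    """Width of the confidence interval on the log scale."""
    _, lower, upper = ci
    return math.log(upper) - math.log(lower)

def excludes_one(ci):
    """True if the interval lies entirely on one side of 1 (no difference)."""
    _, lower, upper = ci
    return upper < 1.0 or lower > 1.0

summary = {}
for drug, results in examples.items():
    summary[drug] = {
        "width_all": log_ci_width(results["all"]),
        "width_published": log_ci_width(results["published"]),
        "significant_all": excludes_one(results["all"]),
        "significant_published": excludes_one(results["published"]),
    }

for drug, row in summary.items():
    print(
        f"{drug:12s} width all={row['width_all']:.2f} "
        f"published={row['width_published']:.2f} "
        f"significant all={row['significant_all']} "
        f"published={row['significant_published']}"
    )
```

In all three reviews the interval is narrower when the grey literature is included, and in the first two the interval excludes 1 only when the grey data are added.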
These examples illustrate that the use of grey literature does not always lead to less publication bias in meta-analyses of schizophrenia. Moreover, in some cases, inclusion of the grey literature may actually increase the bias. It therefore seems necessary to differentiate between negative and positive data in the grey literature. A comprehensive search for systematic reviews in schizophrenia should pay attention primarily to the former, since these data have less chance of being published as formal literature. Positive data from the grey literature should be treated differently from positive published data because there are two possible reasons why they have not been published: publication delay, or the low quality of the paper. In the latter case, it is difficult to publish formally through the peer-review process, so many such papers appear only in the grey literature. Negative results may also be of low quality, but their status as grey literature is at least questionable. Before grey literature is included in a systematic review of schizophrenia, it is advisable to request a draft of the complete paper so that its quality can be adequately assessed. Moreover, in some cases the supplier of the data is the industry itself, and when such data are only positive, we should ask ourselves whether their inclusion decreases the risk of publication bias in a systematic review or, on the contrary, increases it.
A recent systematic review on this topic Reference Hopewell, McDonald, Clarke and Egger has concluded that including grey literature in meta-analyses in order to minimize the risk of bias may have important implications. The review has also found that, in general, published studies are of higher quality than grey studies, mainly with respect to allocation concealment and blinding. In our opinion this appears to be a contradiction, since the method of allocation concealment used and the implementation of a double-blind methodology in RCTs are generally related to the effect size found [7,17]; that is, inadequate methodological reporting correlates with larger estimates of treatment effects Reference Schulz. The review concludes that even in the absence of additional information, “data from grey trials are important, particularly when very few trials of a health care intervention have been published”. We believe this is debatable if our intention is to reduce the risk of bias in quantitative data by means of greater control over experimental design (that is, study quality).
In systematic reviews or meta-analyses of schizophrenia, we should pay attention to study quality and to the difference between formally-published literature and grey literature. In the former, the authors commit to a series of steps that must be completed, and there are formal pathways between authors and readers to clarify any doubts or to confirm the quality criteria applied in the publication of the study. The latter, on the other hand, may be simply an abstract that has not undergone peer review, is very cryptic, and has little detail.
The rules of the CONSORT Statement Reference Moher, Schulz and Altman were established to improve the quality of RCT reports. They are followed by the main biomedical journals as one of the principal requirements for accepting trials. These recommendations insist on important aspects of RCT quality, such as the generation of the randomization sequence, the method of allocation concealment used and how the sample size was determined. Such aspects are difficult to find in the grey literature, which complicates the interpretation of study quality. This in turn increases the risk of bias. In addition, in many cases only preliminary results are published in the grey literature, so it seems inappropriate to include such results in a systematic review on an equal footing with data from the formally-published literature, especially when both results are positive. It would also seem extremely dangerous to make an evidence-based decision concerning clinical practice based on a hypothetical sum of grey literature showing positive results.
In summary, studies from grey literature should be included in systematic reviews of mental health only when it is evident that they have not been published due to the nature and direction of the study results. Clearly, the best check and balance when evaluating grey literature is to use the same criteria applied to formally-published RCTs. This should assure adequate safeguards against the application of faulty results in the clinic.
We thank Carolyn V. Newey and William Stone for their useful comments and editing assistance. This paper was partially supported by a grant from the Ministerio de Sanidad y Consumo, Instituto de Salud Carlos III (Project no. 01/10004).