Authors have an ethical responsibility to report the study design and results in a manner that enables reproduction of results and assessment of bias. In this paper we discuss approaches for comprehensive reporting in animal welfare studies. Checklists such as the Reporting guidElines For randomized controLled trials for livEstoCk and food safety (REFLECT) statement provide guidance for reporting studies. Such checklists represent the current minimum standard for reporting.
Complete reporting of study conduct and results has always been an important part of the scientific process; however, in recent years there has been a renewed focus on the importance of complete and accurate reporting. Driving forces behind this focus include (1) an increased scrutiny of scientific findings, (2) the manner in which scientific information is applied to the decision-making process and (3) concerns over wastage of animals and resources used in research endeavors (O’Connor et al., 2010; Sargeant and O’Connor, 2013; Ioannidis et al., 2014). The increased use of formal research synthesis techniques, such as risk assessment, systematic reviews and meta-analysis, in the decision-making process of public policy makers and for regulatory purposes also places greater importance on the incorporation of primary research into these methods. These explicit uses of research data have led to efforts to ensure that estimates of the magnitude of the effect are accurate and that the potential for bias is incorporated into research synthesis techniques. If studies are incompletely reported, then the results may not be usable for secondary purposes, the financial resources are wasted and the ethical value of the animals is unappreciated. To avoid waste of resources and to appropriately recognize the ethical value of animal research subjects, authors have an ethical obligation to provide as complete and as accurate a report as possible, and editors and peer-reviewers have an obligation to ensure that the authors do so.
A common research question used for policy development is the assessment of interventions designed to mitigate an adverse outcome. Recently developed guidelines exist for identifying what a complete account of an intervention assessment study in animal studies represents (Kilkenny et al., 2010; O’Connor et al., 2010).
We are unaware of other studies that have assessed the completeness of reporting in studies focused on interventions for animal welfare outcomes. The primary objective was to assess completeness of reporting interventions designed to mitigate pain in neonatal piglets undergoing routine management procedures. Our second objective was to illustrate how authors can report the items recommended by a single/uniform reporting guideline framework using examples from existing animal welfare science literature (Dawkins, 2006). We sought to identify aspects of study design, analysis and results that were inadequately reported and provide examples so that education of animal welfare science researchers could be targeted to improve reporting in the future.
Material and Methods
This project used literature identified for a systematic review to identify research gaps and develop recommendations related to pain mitigation in the neonatal piglet undergoing castration, tail docking or ear notching (National Research Council (US) Committee on Recognition and Alleviation of Pain in Laboratory Animals, 2009; Dzikamunhenga et al., 2014; O’Connor et al., 2014). Details about the protocol, search, screening process to identify relevant studies and resulting review are available elsewhere (Dzikamunhenga et al., 2014; O’Connor et al., 2014). For the assessment of comprehensive reporting, we used the studies relevant to the original review. The unit of concern for reporting was a study/trial. Two or more studies/trials were occasionally reported in a single article. An intervention study/trial must have at least two arms (treatment groups).
Reporting consistent with REFLECT (Reporting guidElines For randomized controLled trials for livEstoCk and food safety) guidelines
The REFLECT statement is a reporting guideline for randomized controlled trials that assess interventions for food-producing animals such as swine and is therefore suitable for this topic area (http://www.REFLECT-statement.org) (O’Connor et al., 2010; Sargeant et al., 2010). The REFLECT statement comprises 22 checklist items (Table 1), of which we assessed the reporting of 17. The rationale for including these items in a publication is provided by Sargeant et al. (2010). The reporting of five REFLECT checklist items was not assessed (Table 1). We did not assess study flow (REFLECT checklist item 13) because we expected that studies relevant to the interventions were of such short duration that it was unlikely any loss-to-follow-up would occur, that is, few piglets would leave the study before the outcome could be assessed. We did not assess REFLECT checklist items 2 (Introduction and Background), 20 (Discussion and Interpretation), 21 (Generalizability) and 22 (Overall Evidence) because they are more prone to subjective assessment.
REFLECT=Reporting guidElines For randomized controLled trials for livEstoCk and food safety.
The following were not assessed: Introduction (REFLECT item 2), Study flow (REFLECT item 13), Discussion (REFLECT item 20), Generalizability (REFLECT item 21) and Overall evidence (REFLECT item 22).
For REFLECT checklist item 3 (Methods and Participants), we extracted the country in which a study was conducted if it was explicitly reported in the article. Otherwise, the reviewer scored location as ‘not reported’ and the item was ‘partially reported.’ For REFLECT checklist item 5 (Objectives) to be considered ‘completely reported,’ the objectives had to be associated with a hypothesis that related to the outcomes. For REFLECT checklist item 6 (Outcomes), when a study assessed only one outcome we considered this to be the primary outcome. Otherwise, we expected the authors to designate a primary outcome or this checklist item was considered ‘incompletely reported.’ We also added one item to assess if the studies reported random allocation to group. This was necessary because the REFLECT statement makes the a priori assumption that studies are randomized. Based on that assumption, the REFLECT statement asks for information about the steps in the randomization approach (sequence generation, allocation concealment and implementation) so that its validity can be assessed. If a study does not randomize to group, then the steps of randomization will not be reported and will be listed as missing from the report.
We assessed the reporting of statistical analyses (REFLECT checklist item 12) using the guidelines by Lang and Altman (2014). We considered statistical analyses fully reported if all of the following were provided:
1. a full description of the main methods for analyzing the primary and/or secondary objectives of the study;
2. clear methodology used for each analysis, rather than just listing in one place all the statistical methods used;
3. confirmation that data conformed to assumptions of the test used to analyze them; in particular, whether the analyses specified that (1) skewed data were analyzed with non-parametric tests, (2) paired data were analyzed with paired tests and (3) the underlying relationship analyzed with linear regression models was linear;
4. whether and how any allowance or adjustments were made for multiple comparisons (performing multiple hypothesis tests on the same data) when the reported results suggested such adjustment was necessary. For example, when studies reported comparison of multiple time points or trials with three or more trial arms in the results, we expected a report of the approach to adjusting for such pairwise comparisons, for example, Tukey’s or Bonferroni’s. If authors did not report the approach, but did report that adjustment was conducted, this was considered ‘complete reporting’;
5. for t-tests only, whether tests were one- or two-tailed and justification for the use of one-tailed tests;
6. description of the α level (e.g., 0.05) that defined statistical significance;
7. the name of the statistical package or program used in the analyses. In this situation we considered reporting complete even if only the program, rather than the package, was reported, that is, both SAS® and SAS® PROC MIXED were considered ‘complete reporting.’
If at least one but not all of the above were reported, then we considered statistical analyses ‘partially reported.’
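The scoring rule for the seven criteria above can be sketched in a few lines of Python. This is an illustration only, not the procedure used by the reviewers: the class name, field names and the `classify` function are hypothetical, and we assume that a criterion that does not apply to a study (for example, criterion 5 when no t-test was used) is recorded as satisfied.

```python
from dataclasses import dataclass, fields


@dataclass
class StatsReporting:
    """One flag per criterion (1-7) listed above; field names are illustrative."""
    main_methods_described: bool = False      # criterion 1
    method_given_per_analysis: bool = False   # criterion 2
    assumptions_confirmed: bool = False       # criterion 3
    multiplicity_addressed: bool = False      # criterion 4 (True if not applicable)
    test_tails_stated: bool = False           # criterion 5 (True if not applicable)
    alpha_level_stated: bool = False          # criterion 6
    software_named: bool = False              # criterion 7


def classify(report: StatsReporting) -> str:
    """Map the seven flags to a reporting status."""
    flags = [getattr(report, f.name) for f in fields(report)]
    if all(flags):
        return "completely reported"
    if any(flags):
        return "partially reported"
    return "not reported"


# Hypothetical study that stated only the alpha level and the software used.
example = StatsReporting(alpha_level_stated=True, software_named=True)
status = classify(example)  # "partially reported"
```

Recording each criterion separately, rather than only the overall status, would also allow a reader to see which elements of the statistical description were missing.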
The presence or absence of each REFLECT checklist item was independently evaluated by two reviewers. Where the reviewers disagreed about the presence of a checklist item, one reviewer re-evaluated the article. If this approach did not resolve the conflict, then the item was discussed with a third reviewer. As with any assessment of comprehensive reporting, quality assessments were not made. For example, we did not assess if the method used to allocate piglets to treatment groups reduced bias; rather, we assessed if the approach to allocation was reported.
Reporting of procedures, trial characteristics, study design features and summary measures
REFLECT checklist items are very general, and as some sources of heterogeneity are domain specific, we also determined if specific aspects of some checklist items were reported. We specifically assessed if the following were reported: type of production system (i.e., all in/all out or continuous flow or not reported), and facility types where the research was conducted (i.e., university-owned farm or laboratory/research facility or privately owned/commercial operation or not reported). We extracted specifics about the reporting of the interventions. We also evaluated reporting of descriptors of the study design: number of animals enrolled in the trial, and number of animals enrolled in trial arms. The inclusion in the report of statistical descriptions of the outcomes, including effect sizes and measures of precision, was also evaluated.
A total of 622 articles were identified by the original search and, of those, 52 studies from 40 articles met the eligibility criteria for the review and were eligible for assessment of the approach to reporting (Dzikamunhenga et al., 2014; O’Connor et al., 2014). All the studies were experimental and therefore should have been randomized trials; no relevant cohort studies were identified. The characteristics of the studies assessed are provided in Supplementary Table S1. A summary of the completeness of reporting of items from the REFLECT checklist is shown in Table 1. No single study reported all of the REFLECT checklist items evaluated in this analysis. None of the studies assessed reported the selection criteria for farms or animals, the approach to allocation to group, the sample size rationale, complete description of statistical methods, baseline data by group for animals enrolled, complete description of the results, information about ancillary analyses or the occurrence of adverse events by group. Other checklist items were only reported by some of the studies (Table 1).
The reporting of the information that would enable end-users to understand the relevance of the study population to a target population was poor. Often, eligibility criteria for the farms and animals used were missing. The frequency of reporting country of conduct and study setting is shown in Supplementary Table S1.
Specific intervention information (REFLECT checklist item 4) was reasonably well reported; all studies provided at least some information about the interventions assessed. Supplementary Table S2 provides reporting examples for the studies that assessed non-steroidal anti-inflammatory drug interventions. In the interest of space the other interventions are not included. In Table 2 we provide a simple summary of basic outcome measures: means (or proportions) and measures of precision and trial arm sample size; frequently this information was not reported. In Table 3, Table 4 and Supplementary Table S2 we provide examples where the REFLECT items were well reported from the studies included in the review. In a few situations, no examples could be found in the 52 studies and examples were drawn from other animal studies. Table 3 focuses on the description of the methods and materials, while Table 4 focuses on presentation of the results. The material in the Supplementary Table S3 relates to the introduction and discussion in a manuscript. The three tables should be used together when preparing a manuscript.
NSAID=non-steroidal anti-inflammatory drugs; REFLECT=Reporting guidElines For randomized controLled trials for livEstoCk and food safety.
Examples for the following not included: Introduction (REFLECT item 2), Study flow (REFLECT item 13), Discussion (REFLECT item 20), Generalizability (REFLECT item 21) and Overall evidence (REFLECT item 22).
* For more details of exact outcomes measured refer to Dzikamunhenga et al. (2014).
* Example not selected from the study set.
Reporting of REFLECT items that relate to objectives and hypotheses
In the remaining part of the manuscript, we discuss the rationale for a select few REFLECT checklist items so authors are aware of how the information is used by readers; however, a full explanation of the rationale for each REFLECT item is available in Sargeant et al. (2010).
Although the objective of the study and sometimes a secondary objective were often provided, very few studies translated the objective into a testable hypothesis that included the metric to be measured (REFLECT checklist item 5). Translating the objective to a hypothesis with a specific metric is important because some metrics may be more valid for specific objectives than others. Therefore, knowing the exact metric that will be tested is important. For example, an objective of a study may be to assess the impact of the intervention on pain mitigation, and this would be assessed by comparing the mean frequency (Hertz) of vocalizations in piglets receiving the anesthetic intervention with the mean frequency (Hertz) of vocalizations in piglets without the anesthetic, that is, H0: mean1 − mean2 = 0. Clarification of the hypothesis ensures that the end-user knows which metric is being used to assess the objective, and should facilitate identification of the primary outcome.
Reporting of REFLECT items that relate to outcomes and sample size
A clear description of which outcomes were primary or secondary was never explicitly reported by authors who assessed multiple outcomes (REFLECT checklist item 6). The only studies that received a ‘yes’ for this item reported only one outcome. The primary outcome itself was therefore also poorly reported. Knowledge of the primary outcome is necessary to assess the power of the study. By definition the primary outcome is that used to determine the sample size, so authors need to state the sample size rationale so it is clear what the primary outcome is. Many hypothesis-testing studies, including welfare-oriented studies, have multiple outcomes of interest, and in these circumstances the authors should power the study for the outcome that requires the largest sample size among those upon which they propose to conduct hypothesis testing and make inference. If the authors instead intend to test the hypotheses about multiple outcomes jointly in a multivariate analysis and make the inference about the joint outcomes, then methods of sample size calculation are available and this can be acknowledged in the report (Huang et al., 2008; Luo, 2014). However, such an approach to analysis is very rare (Kerr, 1998). Unless explicitly declaring that a study is a pilot or making use of animals used for another purpose, assessments of interventions should be hypothesis driven. The hypothesis should be specific enough to enable determination of whether the number of animals enrolled is sufficient to detect a clinically meaningful difference in the outcome. Researchers therefore should prospectively design and justify the sample size, which requires knowledge of the primary outcome. Further, if authors do not have an a priori hypothesis about a primary outcome, the potential to ‘data mine’ for a statistically significant outcome and selective reporting bias is high.
No studies reported the rationale for the sample size (REFLECT checklist item 7). This was surprising, as all studies seemed to purposefully assess the effect of an intervention on an outcome and, therefore, determining the number of animals needed to detect the magnitude of effect of interest is a prerequisite step in study design. Although reduction of animals included in studies is an important principle of animal research, this concept does not negate the need for sufficient power to detect clinically meaningful changes in the outcome. There are numerous papers devoted to the need for adequately powered animal studies (Cohen, 1997; Chapman and Seidel, 2008). Some would argue that reporting the sample size rationale based on an a priori determined primary outcome is not necessary if the P-values are reported as these indicate the probability of a type 1 and type 2 error. However, knowledge of a priori power is not the only rationale for reporting the sample size rationale and the primary outcome; such reporting also guards against authors reporting a different primary outcome based on the results of analyses, a practice colloquially known as HARKing (Hypothesizing After the Results are Known) (Kerr, 1998). Such a practice is quite possible in an area such as animal welfare, where multiple outcomes are of interest.
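As an illustration of the kind of sample size rationale that could be reported, the following minimal sketch computes the number of animals per arm for a two-arm trial comparing means. It rests on assumptions that are not taken from any of the reviewed studies: a two-sided test, equal arm sizes, a common standard deviation, independent animals (no adjustment for clustering within litter) and the normal approximation, which slightly underestimates the number obtained from a t-based calculation. The function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Animals per arm needed to detect a mean difference `delta`.

    Normal-approximation formula for two equal arms and a two-sided test:
        n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta) ** 2
    """
    if delta == 0 or sd <= 0:
        raise ValueError("delta must be non-zero and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for type 1 error
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)                            # round up to whole animals


# Hypothetical inputs: difference of interest equal to one or one-half SD.
n_large_effect = n_per_arm(delta=1.0, sd=1.0)   # 16 per arm
n_medium_effect = n_per_arm(delta=0.5, sd=1.0)  # 63 per arm
```

Reporting the inputs (the primary outcome, the difference considered clinically meaningful, the assumed standard deviation and its source, α and power) alongside the resulting number lets end-users verify the calculation and identifies the primary outcome unambiguously.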
Reporting of REFLECT items that relate to confounding: allocation to group/randomization
REFLECT checklist items 8 through 10 (Sequence Generation, Allocation Concealment, and Implementation, respectively) are based on the assumption that the study is randomized. A description of the method of developing the randomization for the sequence generation, allocation concealment and implementation was not provided in any study. A total of 33 of 52 studies used the term ‘randomly’ or ‘randomized’ or ‘random’ in their description of piglet allocation to treatment group. Occasionally it was unclear if the approach used was truly random, despite a description as such. For example, one study described randomly assigning 245 clinically healthy piglets to one of the 12 experimental groups. However, the sample sizes in each of the seven relevant arms were very different, suggesting a method other than random allocation. Several studies reported using restrictions of randomization. Blocking by continuous covariates or stratification by categorical covariates was reported in 39 studies. Covariates used were weight, litter, weight and litter, sow or weight, or litter and adoption. No study that controlled for weight using blocking explicitly reported the size of the block. Details about the approach to allocation are part of reporting that enables assessment of internal validity as they relate to the exchangeability of groups. If it cannot be determined that groups are exchangeable then it is unclear if the observed differences can be attributed to the intervention. Furthermore, without details of the randomization approach, approaches that are haphazard (lacking any obvious principle of organization) or convenient may be incorrectly reported as random. The importance of random allocation is highlighted by the authors of the CONSORT statement, whom we quote here: ‘Random assignment is the preferred method; it has been successfully used regularly in trials for more than 50 years. (reference in original text) Randomisation has three major advantages (reference in original text). First, when properly implemented, it eliminates selection bias, balancing both known and unknown prognostic factors, in the assignment of treatments. Without randomisation, treatment comparisons may be prejudiced, whether consciously or not, by selection of participants of a particular kind to receive a particular treatment. Second, random assignment permits the use of probability theory to express the likelihood that any difference in outcome between intervention groups merely reflects chance. (reference in original text) Third, random allocation, in some situations, facilitates blinding the identity of treatments to the investigators, participants, and evaluators, possibly by use of a placebo, which reduces bias after assignment of treatments. (reference in original text) Of these three advantages, reducing selection bias at trial entry is usually the most important. (reference in original text)’ (Moher et al., 2010). As many welfare studies are small, it is reasonable that authors would employ restricted randomization tools such as stratification and blocking to increase the power of studies. Regardless of the approach to randomization, it should be described fully so that end-users can assess the potential for bias.
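To make concrete what a reportable sequence generation method might look like, the sketch below allocates piglets to arms using permuted blocks within litter. It is a hypothetical example, not the method of any reviewed study: the function name, litter and piglet identifiers, arm labels and seed are all invented, and we assume the block size equals the number of arms. It addresses sequence generation only; allocation concealment and implementation are procedural and still need to be described in words.

```python
import random


def blocked_allocation(litters, arms, seed):
    """Allocate piglets to arms using permuted blocks within each litter.

    litters: dict mapping litter id -> list of piglet ids
    arms:    list of arm labels; the block size equals len(arms)
    seed:    integer seed, so the sequence can be reproduced and reported
    """
    rng = random.Random(seed)
    allocation = {}
    for litter_id in sorted(litters):            # fixed litter order for reproducibility
        piglets = list(litters[litter_id])
        for start in range(0, len(piglets), len(arms)):
            block = piglets[start:start + len(arms)]
            order = list(arms)
            rng.shuffle(order)                   # random permutation of arms per block
            for piglet, arm in zip(block, order):
                allocation[piglet] = arm
    return allocation


# Hypothetical litters and arms.
litters = {"A": ["A1", "A2", "A3", "A4"], "B": ["B1", "B2", "B3", "B4"]}
arms = ["control", "NSAID"]
allocation = blocked_allocation(litters, arms, seed=2024)
```

A report based on such a procedure could state the software used, the seed, the blocking factor (litter) and the block size (here two), which is the level of detail that was absent from the studies that reported blocking.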
Reporting of REFLECT items that relate to performance and measurement/information bias-blinding
Of the 52 studies, 18 reported blinding as part of their protocol; however, none provided a full description of the approach used to blind the study (REFLECT checklist item 11). Blinding, whether for allocation of treatments or interventions or assessment, was infrequently reported by authors. As blinding is designed to reduce measurement/information bias, end-users need to know whether it was used in order to judge if outcome assessment may be biased. There is some evidence in animal welfare that absence of blinding is associated with more positive outcomes (Tuyttens et al., 2014).
Reporting of REFLECT items related to statistical methods
Statistical methods (REFLECT checklist item 12) were not reported in eight studies. In the remaining 44 studies, statistical methods were considered partially reported because they failed to meet all the criteria described above. Assessment of comprehensive reporting of statistical methods is very difficult; the measure of comprehensiveness is that a reasonably informed individual would be able to assess the validity, although what is ‘reasonable’ might itself appear subjective. We would encourage authors to consult previously published documents that describe what should be included in a description of statistical methods (Lang and Altman, 2014).
Reporting of REFLECT items related to setting, study population characteristics
Dates relevant to the study recruitment and performance were described in only six studies (REFLECT checklist item 14). Although it is difficult to envision how year or season could affect the response of piglets to pain mitigation, such information is very relevant for other topics, especially those that seek to understand the influence of season or year on an outcome. The same principle can be inferred for study location (i.e., country or region or production system).
Baseline demographics and clinical characteristics of each group were generally poorly reported (REFLECT checklist item 15). When weight and age information were presented as summary measures for all enrolled pigs together, we considered this to be partially reported. It is recommended that authors provide demographic information about the groups separately, so that end-users can assess if the groups are comparable, especially given the absence of reporting of allocation methods. Demographic information was frequently reported in the Methods section and not explicitly in the Results section. REFLECT and other statements make the distinction that the methods and materials could, and potentially should, be written before the study is started. Demographic information about the study groups, such as the mean age and mean weight (including standard deviations), is therefore a result and should be presented in the Results section.
Reporting of REFLECT items related to results of analyses
The actual number of piglets that contributed to data analyses (REFLECT item 16) was frequently not reported. Presumably authors felt that reporting the number of enrolled animals would suffice because the potential for loss-to-follow-up in the subject matter studied was low. For this topic area, this assumption may be valid and failure to report that no loss-to-follow-up occurred may not be a source of bias. Sometimes the unit of analysis was not the same as the number of animals in the study. This was particularly important for the behavior data, which could be reported as number of pigs that demonstrated an activity or the number of time periods when an event was observed. These clearly have different denominators. Similarly, some outcomes appeared to be measured only on a subset of enrolled animals, perhaps because testing all animals was time consuming or expensive. Supplementary Figure S1 is an excerpt of a table (Sutherland et al., 2011) that provides the number of animals included in the analysis.
Effect measures regarding outcomes were often poorly reported. Supplementary Table S3 provides examples of the information missing from some studies. Such information would be needed to assess the magnitude of effect so that the balance of benefits and harms could be evaluated (which cannot be evaluated by P-values). If only the P-value is reported, it is not possible to know the magnitude or direction of the effect (i.e., whether the intervention increased or decreased the outcome). Furthermore, measures of variation were often not reported or not reported clearly, especially in figures where it was not always possible to discern if the error bar represented an SEM, an SD or a confidence interval. In studies that used random effects variables to control for clustering, the variance components were never reported, despite their importance for future study design and interpretation.
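The following sketch shows how an effect size and its precision can be derived when arm-level means, standard deviations and sample sizes are reported, which is why those summaries matter to end-users. It is an illustration under stated assumptions: the function name and the numbers are hypothetical, observations are treated as independent (no clustering by litter), and the interval uses the normal approximation, whereas a t-based interval would be wider and more appropriate for small trials.

```python
import math
from statistics import NormalDist


def mean_difference_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Difference in means (arm 1 minus arm 2) with a normal-approximation CI.

    Returns (difference, lower bound, upper bound).
    """
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # standard error of the difference
    z = NormalDist().inv_cdf(0.5 + level / 2)       # e.g. about 1.96 for a 95% interval
    return diff, diff - z * se, diff + z * se


# Hypothetical arm summaries: mean, SD and number of piglets analyzed per arm.
diff, lower, upper = mean_difference_ci(10.0, 2.0, 16, 8.0, 2.0, 16)
```

With these invented inputs the difference is 2.0 with a 95% interval of roughly 0.61 to 3.39. None of this can be recovered from a P-value alone, and it cannot be computed if the error bars in a figure are not identified as SEM, SD or confidence interval.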
Ancillary analyses (REFLECT checklist item 17, Outcomes and Estimation) were not reported in any study, as no a priori primary and secondary outcomes were reported and no sample size justifications were provided. The rationale for reporting ancillary analyses is to give end-users knowledge of potentially interesting results that arise from data exploration and are therefore hypothesis generating rather than hypothesis testing.
Proactive reporting of adverse events was expected in order for end-users to balance the benefits and harms of using pain-mitigating interventions in neonatal piglets. Harms are often rare and therefore often only detected using secondary analyses. Such information would have allowed us to understand whether the reported mortality rate was excessive compared with baseline trends in production. Sometimes adverse events were reported in a way that did not allow us to determine the group to which the animals that experienced the event had been allocated. Knowledge of the group to which the animal was assigned is vital to interpreting harms. For example, reporting that 10 animals died in the study is not informative, compared with reporting five animals died in each group or one animal died in the control group and nine in the treated group.
In the area of animal welfare research we found that, as in other disciplines related to veterinary and animal sciences, reporting of intervention studies was frequently incomplete (Burns and O’Connor, 2008; Sargeant and O’Connor, 2013). Overall, many studies failed to report information that would be needed to assess internal and external validity. There are both ethical and, in some countries, legal reasons for ensuring that scientists using animals not only adhere to adequately justified methodology but are also able to articulate it to their peers and the public according to high reporting standards. The privilege given to scientists to use research animals entails adherence to rigorous reporting standards that help to ensure compliance with national and international policies that protect the welfare of all research animals.
Some of the journals we have assessed might not be considered as truly ‘scientific’ as these (mostly national/local) journals are periodical magazines intended to inform practitioners on new developments. As an example, the journal ‘Der Praktische Tierarzt’ has a different audience than Journal of Animal Science. However, if such journals do choose to publish primary research then it seems that the standards of reporting would still apply. Another reason for omissions may be lack of awareness of the need for comprehensive reporting due to the multidisciplinary nature of many projects.
Publication of the results of a scientific study is not the end of the scientific process (Sargeant and O’Connor, 2013). Presumably, researchers publish with the intent that the results of a study will enable generation of new hypotheses, validate current hypotheses, or influence decision making. These secondary uses of primary data rely on the validity of the original study design and analyses, and validity can only be assessed if the report is transparent, accurate and comprehensive (O’Connor et al., 2010; Sargeant and O’Connor, 2013). Further, if incomplete reporting casts doubt over the results of studies then the monetary and ethical value of the animals and the financial resources used to generate the data are not fully realized. In addition, animals may have suffered unnecessarily. If the study is considered important enough that the information is needed for decision making, it may even be necessary to repeat the study (Ioannidis et al., 2014; Macleod et al., 2014). In situations where the reporting is so incomplete that useful data cannot be extracted from the original experiment and the study must be repeated, this would be incongruous with the 3Rs (replacement, reduction and refinement) and would not be a good use of already scarce research funding. Professionally, the credibility of the authors of the original study could be called into question.
Some might argue that authors, reviewers and editors should be able to ‘cherry pick’ the checklist items from REFLECT or other applicable guidelines that they think are relevant. However, two considerations argue against this idea. The first is that the current body of work was developed under such a system and lacks important details of studies. The second is that authors, reviewers and editors are not necessarily able to anticipate all the end uses of research. The alternative is to use guidelines that have been developed by a diverse group of developers and end-users, which reduces the need to second-guess what end-users may need. For example, in this data set it might be argued that country is not relevant to pain. However, country may be relevant to a study of the use of placebos in animal welfare trials, where knowing the country would help the researcher understand differences in placebo use. Authors would clearly not have anticipated such a question; consensus groups, however, often have. In human clinical trial reporting, cherry picking items to report from guidelines is not encouraged. For example, the editorial policy for Nature indicates ‘Manuscripts reporting results of a clinical trial must conform to CONSORT 2010 guidelines.’ Because numerous guidelines exist and guidelines can change, authors should cite the guideline they used in preparing their report. It is true that precise writing is needed to add the detail requested by guidelines. However, this is unlikely to be a meaningful barrier, as indicated by the widespread adoption of reporting guidelines in human health, including by high-impact journals such as Nature and The Lancet, where space is at a premium.
It is unclear why reporting is incomplete. Some might suggest that it is because of a lack of awareness of reporting guidelines. However, the concepts of reproducible research and of reporting in a manner that reflects the experiment are not new, so lack of awareness cannot explain all of the incomplete reporting (Grindlay et al., Reference Grindlay, Dean, Christopher and Brennan2014).
It is imperative that research reporting be complete to enable reproducibility, assessment of the internal and external validity of the study and knowledge translation. Given that animal welfare science is a discipline that often involves interventions that may be perceived as unpleasant to the animal, attention to the quality of reporting is especially critical to advance the field. Comprehensive reporting is an ethical responsibility for researchers undertaking this type of research. For intervention studies, the reader should be able to understand the magnitude and precision of the estimated effect of the intervention, and the probability that the effect is consistent with the null hypothesis. The reader should also be able to assess the potential for bias.
We encourage authors to consider using reporting guidelines to improve reporting. Consistent with the 3Rs, in particular the reduction principle, using reporting guidelines can maximize the information obtained from the animals used in a study and minimize the risk of unnecessary studies, thereby reducing further animal use. We are aware that the omission of this information, as well as of important design characteristics, analyses or results, is often unintentional. In addition, we are well aware that what constitutes a complete report is not a static list. As knowledge and technology change, the standards for how science is conducted and reported should be expected to change. Given these changing standards, however, the most recent checklists represent minimum current standards. This would not preclude authors from including, or editors and peer-reviewers from requesting, additional information. Checklists provide guidance for reporting, but researchers should adhere to the underlying reporting principles to provide a report that facilitates reuse of the data and enables assessment of bias. With the growing frequency of multiple collaborators involved in manuscript preparation, the final editor may not be aware of all the aspects required for reporting. One reason for an incomplete report might be a lack of knowledge of what should be reported and how. However, resources are becoming increasingly available to mitigate this problem. These include documents specific to animal studies, such as the REFLECT statement for livestock trials and the ARRIVE guidelines, which are specific to uses of animals undergoing an experimental procedure in a research laboratory or formal test setting (Kilkenny et al., Reference Kilkenny, Browne, Cuthill, Emerson and Altman2010; O’Connor et al., Reference O’Connor, Sargeant, Gardner, Dickson, Torrence, Consensus Meeting, Dewey, Dohoo, Evans, Gray, Greiner, Keefe, Lefebvre, Morley, Ramirez, Sischo, Smith, Snedeker, Sofos, Ward and Wills2010).
The ARRIVE guidelines are designed for animals used in experimental settings with a focus on animal populations where the independence assumption is valid. The REFLECT statement is more specific for livestock and provides more focus on non-independent populations such as occur in housed animals. As reporting guidelines are relatively new, the impact on reporting has not been assessed yet. For example, REFLECT was unavailable when many of the papers in this review were published. The standards of reporting observed here are therefore not reflective of the impact of reporting guidelines. The examples provided in Table 3, Table 4 and Supplementary Table S3 can be used as a guideline for how some of the studies we reviewed effectively reported the information requested by the checklist. All three tables should be used together, and are broken into sections here for presentation purposes. Use of a reporting checklist might help reduce the number of items not reported.
The overall conclusion after assessing these studies using REFLECT is that there is (1) an opportunity to improve the reporting and (2) a need to raise awareness of the importance of providing a complete report of how animal welfare studies are conducted. The continued ethical and legal acceptability of using animals is contingent upon accurate and complete reporting. Accurate and complete reporting, in most cases, relates to both high quality research and responsible conduct in animal research.
This project was funded by the National Pork Board (NPB) Grant no. 12-186. The authors have no conflicts of interest to declare.
For supplementary material/s referred to in this article, please visit http://dx.doi.org/10.1017/S1751731115002323