Author response

FEATURE ARTICLE POINT OF VIEW Four erroneous beliefs thwarting more trustworthy research Abstract A range of problems currently undermines public trust in biomedical research. We discuss four erroneous beliefs that may prevent the biomedical research community from recognizing the need to focus on deserving this trust, and thus which act as powerful barriers to necessary improvements in the research process. MARK YARBOROUGH*, ROBERT NADON AND DAVID G KARLIN Introduction Competing interests: The authors declare that no competing interests exist. Funding: See page 8 Reviewing editor: Emma Pewsey, eLife, United Kingdom Copyright Yarborough et al. This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited. In 2014, in an essay titled ‘Why scientists should be held to a higher standard of honesty than the average person,’ a former editor of the British Medical Journal argued that science depends wholly on trust (Smith, 2014). While many in the biomedical research community may quibble over the word ‘wholly’ here, few would dispute his overall point: the public’s confidence is essential to the future of research. According to a noted scholar on the subject, the best way to enjoy trust is to deserve it (Hardin, 2002). One would hope that the research community is a deserving case, given the existence of safeguards such as professional norms, regulatory compliance and peer review. Unfortunately, there is an ever-growing body of evidence that calls into question the effectiveness of these measures. This evidence includes, but is by no means limited to, findings about underpowered studies (Ioannidis, 2005), routine overestimations of efficacy (Sena et al., 2010; Tsilidis et al., 2013), the failure to take prior research into account (Robinson and Goodman, 2011; Lund et al., 2016), a propensity to confuse hypothesis-generating studies with hypothesis-confirming ones (Kimmelman et al., 2014), a worrisome waste of resources (Chalmers and Glasziou, 2009), and the low uptake of critical reforms meant to improve research (Enserink, 2017; Peers et al., Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 2014). A recent popular book, Rigor Mortis, synthesizes such evidence into a compelling narrative that casts the reputation of research in a negative light (Harris, 2017). While all of this evidence is cause for concern, we are most concerned by the reluctance of the research community to implement the reforms that could improve research quality. One can imagine a continuum of research practices that impact how scientific understanding advances. At one end one encounters the unforgivable, such as data fabrication or falsification. At the other end one finds the perfect, such as published research reports so thorough that findings can be easily reproduced from them. The concerns of interest to us in what follows have little to do with the misconduct found on the unforgivable end of the continuum. Instead, they fall all along it and pertain to unsound research practices (such as non-robust reporting of methods, flawed study designs, incomplete reporting of data handling, and deficient statistical analyses) that nevertheless impede the advance of science. These are the practices that reform measures could counter if researchers were less reluctant to adopt them. In an effort to account for this reluctance, we review four erroneous beliefs that we think contribute to it. We acknowledge that we lack extensive data confirming the prevalence and distribution of these beliefs. Thus, readers can form their own 1 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research opinions about whether the beliefs are as widespread as we fear they are. We have come upon our concerns as a result of our careers related to biomedical research, which will be the focus of our remarks below, though we think the issues are relevant to life sciences research more broadly. One of us (MY) has extensively studied how to promote trustworthiness in biomedical research, and another (RN) has a long and successful career devoted to understanding the role of sound methodologies in producing it. The final author (DGK) is a preclinical researcher who was among those who pioneered early efforts to learn how researchers and research institutions can meaningfully connect the research community with the publics it seeks to serve. We think this collective pedigree lends credence to our analysis and to the strategy for moving forward that we recommend in the conclusion. Recognizing the barriers to a greater focus on deserving trust It’s about the science, not the scientists Erroneous belief one is that questioning the trustworthiness of research simultaneously questions the integrity of researchers. As a result, many individuals react counterproductively to calls to improve trustworthiness. They are akin to pilots who confuse discussions about improving the flightworthiness of airplanes with criticism of their aviation skills. Though understandable, such concerns miss the point (Yarborough, 2014a). The multitude of methods, materials, highly sophisticated procedures and complex analyses intrinsic to biomedical research all create ways for it to err, making it exceptionally difficult to detect problems (Hines et al., 2014). These are the critical matters that all researchers must learn to direct their attention to. Yet they cannot do so if constructive criticism about how to improve science is taken personally. We need to focus on the health of the orchard, not just the bad apples in it Erroneous belief two is that the bulk of problems in research is due to bad actors. There is no doubt that misconduct is a substantial problem (Fang et al., 2012). This should not blind us, however, to how common study design and data analysis errors are in biomedical research (Altman, 1994). Indeed, these errors are likely Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 to increase due to trends in current scientific practice, particularly the growing size and interdisciplinarity of investigative teams (Wuchty et al., 2007; He and Zhang, 2009; Gazni et al., 2012). Because they require divisions of labor and expertise, such collaborations create fertile ground for producing unreliable research. Affected publications draw much less scrutiny than those of authors who engage in misconduct (Steen et al., 2013), and thus problems in them are likely to be discovered much later, if at all. For example, consider that the number of retracted publications is much less than 1% of published articles (Grieneisen and Zhang, 2012), yet publication bias has been found to affect entire classes of research (Tsilidis et al., 2013; Macleod et al., 2015). The prevalence of erroneous research results and the enduring problems they cause require proactive efforts to detect and prevent them. What we find instead is a disproportionate emphasis on detecting and punishing ‘bad apples.’ The more we concentrate on this, the more difficult it becomes to identify strategies that allow us to focus on what should be seen as more pressing issues. Our beliefs about self-correcting science need self-correcting Erroneous belief three is that science self-corrects. Assumptions that published studies are systematically replicated/replicable, or are later identified if they are not, build resistance against reforms. In theory, reproducibility injects quality assurance into the very heart of research. When one adds other traditional safeguards such as professional research norms and peer review, the reliability of research seems well guarded. However, a growing body of research to check whether scientific results can be reproduced confirms the shortcomings of these safeguards (Hudson, 2003; Allchin, 2015; Banobi et al., 2011; Zimmer, 2011; Twaij et al., 2014; Drew, 2019). We mention just two examples of this research here. The Reproducibility Project: Cancer Biology has been underway for almost five years and originally sought to reproduce 50 critical cancer biology studies (Couzin-Frankel, 2013). The project was scaled back to 18 studies, due largely to costs, but also because important details about research methods were unreported in some of the studies the effort sought to reproduce. As 2 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research right’ or when consensus could emerge, that is no longer the case (Yarborough, 2014b). When errors get corrected, it is more often due to happenstance than any kind of methodical effort for results, of the first 13 completed replication studies, only five produced results similar to the original studies while the other eight produced either mixed or negative results (Kaiser, 2018). An effort to replicate the findings of 100 experimental studies in psychology journals produced a similarly low rate of replication. Only 36% of the original findings were replicated according to the conventional statistical significance standard of p<0.05 for an effect in the same direction (Open Science Collaboration, 2015). Such findings serve as a vivid wake-up call that alerts us to how easily and how often erroneous research results make their way into print, often in leading journals. Once there, they may linger for years or even decades prior to being discovered (if they are ever discovered) (Judson, 2004; Bar-Ilan and Halevi, 2017), and may continue to be cited post-discovery (Steen, 2011). And when errors get corrected, it is more often due to happenstance than any kind of methodical effort (Allchin, 2015). All this is sobering when we consider that erroneous findings can result in potentially dangerous clinical trials (Steen, 2011). Further shaking our confidence in the ability of science to self-correct is how few opportunities there actually are to confirm results. Efforts such as the Reproducibility Project: Cancer Biology notwithstanding, most research sponsors and publishers value, and thus fund and publish, innovative studies rather than research that tries to confirm past findings. And even if sponsors did place higher value on confirmatory studies, the growing complexity of science can make confirmation difficult, or even impossible (Jasny et al., 2011). Besides information about study methods and materials possibly not being available, studies may also use novel and/or highly sensitive/volatile study materials (Hines et al., 2014), impinge on intellectual property rights (Williams, 2010; Godfrey and German, 2008), or deal with proprietary data sets (Peng, 2011). Thus, even if there was a time in science when there were chances ‘to get it Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 Following the rules does not guarantee we are getting it right Erroneous belief four is that compliance with regulations is capable of solving the problems that gave rise to the regulations themselves. Governments, research sponsors and publishers have gone to great lengths to implement reforms that one hopes contribute to deserved trust. But this is true only to a point; one can follow all the rules, extensive though they may be, and still not get it right (Yarborough et al., 2009). We offer efforts to combat research misconduct in the United States as evidence. The United States Congress, following a series of research scandals, issued a mandate for corrective action to combat falsification, fabrication and plagiarism. This eventually led to a program that endures to this day (Office of Research Integrity, 2015), requiring federally funded institutions to investigate allegations of research misconduct. The much larger body of poor-quality science is left completely unaddressed by these government rules. Research shows that about 2% of researchers report engaging in misconduct while fifteen times as many (30%) report having engaged in practices that contribute to irreproducible research (Fanelli, 2009); other studies report even higher percentages (John et al., 2012; Agnoli et al., 2017). Yet, due to the need to follow the rules, resources go overwhelmingly to investigating misconduct. Thus, while such rules bestow quite modest protections to research, they require significant time, energy and money (Michalek et al., 2010), and simultaneously provide a false sense of security that problems are being resolved – when in fact they are not (Yarborough, 2014b). Suggestions to help build cultures and climates that assure deserved trust If we can find a way to shed these erroneous beliefs, we could become more proactive in showing how we deserve the public’s trust. We would not need to start de novo. There are already some proven solutions, as well as promising new recommendations and reforms, that can make inroads on many of the problems identified above. We highlight just a few of them below. Broad implementation of such initiatives could pay valuable dividends. For instance, 3 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research If authors felt safe bringing honest errors to the attention of others, it would encourage much-needed openness about the mistakes that inevitably occur within fields as complex as biomedical research. rather than expend extraordinary resources on investigations of misconduct after it has caused damage (Michalek et al., 2010), we might instead fund empirical studies of both existing and proposed reforms. In consequence, we could determine which reforms are most capable of strengthening the overall health of biomedical research (Ioannidis, 2014). We recognize that the solutions that we highlight below do not do justice to them as a class, but we do believe they constitute a reasonably representative group. Nor do we mean to suggest that they are without controversy. The main point of our essay, however, is not to provide a thorough review of current and proposed reforms and their individual merits. To do so would focus readers’ attention on what changes need to be made in research; our purpose is to explore erroneous beliefs that may prevent sufficient focus on why changes are needed in the first place. Publishing reforms: underway but they could be more ambitious It is encouraging to see that many journals have begun to implement important reform measures. Among the most encouraging is that some now perform rigorous statistical review of appropriate studies, or make such reviews available to peer reviewers or associate editors who request them. Some journals have also modified their instructions to authors in order to improve the reporting of research results. The improved instructions bring transparency to research and aid reproducibility efforts. Recent studies of these modified instructions show that they improve published preclinical study reports, suggesting that even modest journal reforms can work to good effect (The NPQIP Collaborative group, 2019; Minnerup et al., 2016). It should be noted, though, that the benefits of such Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 reforms might be small. A recent study showed that a checklist designed to improve compliance with the ARRIVE guidelines had a quite limited effect (Hair et al., 2018), showing that having helpful tools is no guarantee that they will be used. Thus, it remains unclear what the ultimate impact of such reform measures might be. With this evidence in mind, it would be nice if journals were even more ambitious and took on some more novel recommendations. One example is to consider expanding the taxonomy for correcting and retracting publications so that authors can avoid the current stigma around correcting the scientific record (Fanelli et al., 2018). This would make it possible to take up a 2016 recommendation to reward authors for self-corrections and retractions (Fanelli, 2016). If authors felt safe bringing honest errors to the attention of others, it would encourage muchneeded openness about the mistakes that inevitably occur within fields as complex as biomedical research. Researcher practices: plentiful recommendations with too few takers Publisher reforms can only accomplish so much. Most of the improvements that are required to demonstrate how the research community deserves the public’s trust need to arise from how research is conducted. A wealth of thoughtful recommendations are already in place, but too many are awaiting widespread adoption. Among the most notable are a set of recommendations for increasing value and reducing waste in biomedical research that appeared as part of a series of articles in The Lancet in 2014. Those recommendations center around several needs: to carefully set research priorities; improve research design, conduct and analysis; improve research regulation and management; reduce incomplete or unusable reports of studies; and make research results more accessible (Macleod et al., 2014; Chalmers et al., 2014; Ioannidis et al., 2014; Salman et al., 2014; Glasziou et al., 2014; Chan et al., 2014). The series has not gone without notice, with more than 46,000 downloads of articles in the series within the first year of publication (Moher et al., 2016) and over 900 citations (as of early 2019) in PubMed Central registered articles. Early evidence suggested that the series placed the issues that it addressed on the radar screens of research sponsors, regulators and journals. Disappointingly, academic institutions initially did not seem to pay them much notice (Moher et al., 2016). This reinforces our concern 4 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research that we need to identify what it is about the mindset of so many in the research community that is currently stifling interest in reform. So long as this lack of interest persists, there is little hope that what we consider the highest impact changes will occur anytime soon. We have two such changes in mind that researchers themselves need to take more of the lead on. We need to improve research design and its reporting Researchers need to pay more attention to research methodology, given its central role in establishing the reliability of published research results. Some journals now encourage this behavior by, for instance, requiring that authors complete checklists to indicate whether or not they have used study design procedures such as blinding, randomization and statistical power analysis. Depending on the journal and type of study, modest to substantial gains in reporting prevalence of study design details are achieved when researchers can complete these requirements (The NPQIP Collaborative group, 2019; Hair et al., 2018; Han et al., 2017). Such improved reporting allows for better assessment of the published literature. Better still would be researchers routinely using universally accepted basic procedures. For example, it is widely acknowledged that for animal studies, randomly allocating animals to groups and blinding experimenters to group allocations is required for sound statistical inference (Macleod, 2014). We need to increase data sharing Routine sharing of data should be the new default for researchers, unless there are compelling reasons not to share. Data sharing can, among other things, promote reproducibility, improve the accuracy of results, accelerate research, and promote better risk-benefit analysis in clinical trials (Institute of Medicine, 2013). Despite the growing consensus about the value that data sharing brings to research, we must acknowledge that when and how data sharing should occur remains controversial. As recently noted, “[s]ome argue that the researchers who invested time, dollars, and effort in producing data should have exclusive rights to analyze the data and publish their findings. Others point out that data sharing is difficult to enforce in any case, leading to an imbalance in who benefits from the practice – a problem that some researchers say has yet to be satisfactorily resolved” (Callier, 2019). Given such issues, it comes as no surprise that compliance with Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 journal data sharing policies can be lackluster (Stodden et al., 2018). Taking these difficulties into consideration, realistic suggestions to encourage data sharing include: 1) that all journals implement a clear data sharing policy (Nosek et al., 2015) that allows reasonable flexibility to take into account cases when data cannot be shared because of ethical or identity protection concerns, or that allow ‘embargo’ periods during which data are not shared (Banks et al., 2019); 2) that journals systematically require data sharing during the review process, to help reviewers to evaluate the results (this would have the additional benefit of meaning that no additional effort is required afterward to make the data public); 3) that training courses in Responsible Conduct of Research (RCR) include methods to de-identify study participants and aggregate their results (a major prerequisite to data sharing [Banks et al., 2019]); and 4) the creation of awards for researchers who promote data sharing (Callier, 2019). Finally, we need to know whether improved methodology and increased data sharing are really leading to reproducible research. Unfortunately, we could not locate studies that have addressed this question, making this an important line of future research. Institution level practices: promising and proven remedies looking for suitors When it comes to institutional practices that could strengthen the trustworthiness of research, surely the holy grail would be to better align researcher incentives with good science (Ware and Munafò, 2015). This would be a heavy lift since it would involve changes to how institutions collectively approach recruitment, tenure and promotion. Rather than relying upon current surrogates such as bibliometrics for assessing faculty productivity and success (McKiernan, 2019), they would need to use more direct measures of good science. A workshop involving research quality and other experts was convened in Washington DC in 2017 to explore what such measures might be and how they might be used. It identified six key principles that institutions could embrace to effect such a transition (Moher et al., 2018), but their effectiveness remains untested as they have yet to be implemented. It is worth noting, however, that at least one institution – the University Medical Center Utrecht – has tried to reengineer how it assesses its research programs and faculty in order to better align incentives 5 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research with good science. In the words of the champions of that change initiative, they are learning how to better “shape the structures that shape science. . .[to] make sure that [those structures] do not warp it” (Benedictus et al., 2016). There are smaller scale reforms that institutions could also embrace to help ensure high quality standards in research. For example, there are many innovative practices that institutions could currently use to prevent problems, but are not. Perhaps the most obvious one is a research data audit. Akin to a finance audit, a research data audit is meant to check that published data are “quantifiable and verifiable" by examining “the degree of correspondence of the published data with the original source data” (Shamoo, 2013). First proposed at scientific conferences in the 1970 s, (Shamoo, 2013) and later in print in Nature in 1987 (Dawson, 1987), such audits “would typically require the examination of data in laboratory notebooks and other work sheets, upon which research publications are based” (Glick, 1989). Advocates argue that data audits should be routine in as many settings as possible. This would provide a double benefit; it would help to deter fraud on the one hand and promote quality assurance on the other (Shamoo, 2013). The FDA and the United States Office of Research Integrity currently conduct such audits ‘for cause’ when misconduct or other misbehaviors are suspected. The FDA also uses them for certain new drugs deemed to be potentially ‘high risk.’ Although most current audits typically review the proper use of specified research procedures, there is no reason that they could not also be used to encourage the proper generation and use of actual data (Shamoo, 2013). Critical incident reporting (CRI) is another promising prevention practice. It can be used to uncover problems, that, if left unchecked, might prove detrimental to a group’s research or reports about their research. Open software exists for implementing such a system. Accessed anonymously online, the system prompts users to report in their own terms what happened that is of concern to them. Experts can then promptly analyze incidents to see what systems changes might prevent future recurrences. The first adopters of such a system report that it “has led to the emergence of a mature error culture, and has made the laboratory a safer and more communicative environment” (Dirnagl et al., 2016). The same opportunity pertains to two other successful problem reduction methods: root cause analysis (RCA) and failure modes and Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 effects analysis (FMEA) (Yarborough, 2014a). RCA examines past near misses and problems in order to identify their main contributors. FMEA anticipates ways that future concerns might occur and prioritizes the severity of negative consequences if they do occur (for example, in aviation one might compare increased fuel consumption by a plane versus the catastrophic failure of a wing). The most critically needed preventive measures can then be targeted to avoid severe problems occurring in the first place. RCA and FMEA have both been used to good effect across a wide spectrum of industries and endeavors, including the pharmaceutical industry and clinical medicine. Their track record clearly shows that they can be used to reduce medication, surgical and anesthesia errors, and ensure quality in the drug manufacturing process. Both these methods lend themselves most easily to manufacturing and engineering settings, but their successes suggest they also warrant testing for use in research. In particular, they may improve the human factors that can lead to avoidable problems, especially in teambased science settings where geographic dispersion and distributed expertise are the norm (Yarborough, 2014a; Dirnagl et al., 2016). It seems clear that data audits, CRI, RCA, and FMEA each have tremendous potential for improving research: potential that, like the above publishing reforms and researcher practices, has gone largely untapped to this point. We worry that the four erroneous beliefs that we have highlighted are blunting curiosity about the health of biomedical research, and are thereby preventing the adoption of a more proactive stance toward quality concerns. Hence, a critical next challenge is learning how to erode the appeal of these beliefs. One strategy that we think is particularly worth considering is education. A wider appreciation of evidence that demonstrates the range and extent of quality concerns in research, combined with evidence about how few of them stem from research misconduct, should diminish belief that a few bad apples are our biggest problems. A placeholder for this education is already in place. RCR education is now firmly ensconced in many graduate and postgraduate life sciences courses and could naturally incorporate modules that tackle the erroneous beliefs head on. We should note, however, that this strategy is far from perfect, given longstanding concerns about the effectiveness of RCR curricula 6 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research There are plenty of thoughtfully tailored recommendations that have not yet resulted in the improvements to research they are surely capable of producing (Antes et al., 2010; Presidential Commission for the Study of Bioethical Issues, 2011) and the fact that sponsors who mandate RCR instruction, like the National Institutes of Health (NIH) and the National Science Foundation (NSF) in the United States, often stipulate content that needs to be covered by it. The latter challenge need not be insuperable, though, since both NIH and NSF also encourage innovation and customization of RCR learning activities. Using RCR education as a vehicle for fostering improved quality in research may also help to make such instruction appear more relevant to the careers of learners. As an example, RCR sessions could examine the scientific record on self-correction. The aforementioned cancer and psychology replication projects would surely warrant consideration, but we think that an equally relevant and highly illustrative case study showing how this might be done is a recently published study (Border et al., 2019) about the lasting detrimental impact of a 1996 study about the SLC6A4 gene on depression research (Lesch et al., 1996). This publication spurred at least an additional 450 published ones, consumed millions of dollars, and controversy about it continues to this day (Yong, 2019). Such case studies can drive home multiple lessons because they simultaneously show how science cannot be relied upon to self-correct in a timely or efficient way and that regulations often fail to touch upon matters critical to the health of research. Conclusion Readers may be tempted to dismiss the foregoing analysis of erroneous beliefs as mere personal observations. They may prefer instead either hard data about how research measures up against metrics that contribute to deserving trust. Or they may wish for yet another round of study design and data analysis Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 recommendations capable of solving the broad range of ills currently diminishing the quality of research. The recommendations would plot the path to progress while the data would make our pace of progress apparent to all. As we have tried to make clear, there are plenty of thoughtfully tailored recommendations that have not yet resulted in the improvements to research they are surely capable of producing – simply because there has been too little uptake of them. Nor, for that matter, is there any shortage of calls to arms and manifestos, including those from some of the most eminent scholars and leaders in biomedical research (Alberts et al., 2014; Munafò et al., 2017). Since these have had such little effect so far, especially at the institutional level, it is not clear why we would expect yet more recommendations to enjoy a better reception. Besides, many questionable research practices are hidden from view. For example, inconvenient data points, or even entire experiments, are at times ignored (Martinson et al., 2005); data are added to experiments until desired p-values are obtained (Simmons et al., 2011); and unreliable methods are used when randomizing animals in studies (Institute for Laboratory Animal Research Roundtable on Science and Welfare in Laboratory Animal Use, 2015). Because these behaviors are hidden, traditional metrics are unlikely to capture their extent or their influence on the trustworthiness of research. These behaviors notwithstanding, ‘open science’ practices would be one way to increase confidence in research results that could also provide metrics of trustworthiness. For example, some questionable research practices, such as p-hacking (Head et al., 2015), could be detected more easily by requiring that data and analysis code be publicly available in all but the most exceptional circumstances. Indeed, one group has called for traditional institutional performance metrics such as impact factor and number of publications to be replaced with open science metrics (Barnett and Moher, 2019). Although measurable open science would not eliminate questionable research practices, it would move biomedical research toward increased accountability. Open science practices are still no panacea, however, for all the quality concerns we have highlighted here. What is most needed at this juncture is a collective focus on deserving trust. Such a focus could make researchers and the leaders of research institutions more receptive to reform efforts. The four erroneous beliefs we 7 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research have discussed surely hinder that collective focus, and thus deter the research community from adopting reforms that can secure the public’s trust – which is vital to biomedical research. Mark Yarborough is in the Bioethics Program, University of California, Davis, Sacramento, CA, United States mayarborough@ucdavis.edu https://orcid.org/0000-0001-8188-4968 Robert Nadon is in the Department of Human Genetics, McGill University, Montreal, Canada David G Karlin is an independent researcher based in Marseille, France Author contributions: Mark Yarborough, Robert Nadon, David G Karlin, Conceptualization, Writing— original draft, Writing—review and editing Competing interests: The authors declare that no competing interests exist. Received 17 January 2019 Accepted 25 July 2019 Published 29 July 2019 Funding The authors declare that there was no funding for this work References Agnoli F, Wicherts JM, Veldkamp CL, Albiero P, Cubelli R. 2017. Questionable research practices among Italian research psychologists. PLOS ONE 12: e0172792. DOI: https://doi.org/10.1371/journal.pone. 0172792, PMID: 28296929 Alberts B, Kirschner MW, Tilghman S, Varmus H. 2014. Rescuing US biomedical research from its systemic flaws. PNAS 111:5773–5777. DOI: https://doi.org/10. 1073/pnas.1404402111, PMID: 24733905 Allchin D. 2015. Correcting the “self-correcting” mythos of science. Filosofia E História Da Biologia 10: 19–35. Altman DG. 1994. The scandal of poor medical research. BMJ 308:283–284. DOI: https://doi.org/10. 1136/bmj.308.6924.283, PMID: 8124111 Antes AL, Wang X, Mumford MD, Brown RP, Connelly S, Devenport LD. 2010. Evaluating the effects that existing instruction on responsible conduct of research has on ethical decision making. Academic Medicine 85:519–526. DOI: https://doi.org/10.1097/ACM. 0b013e3181cd1cc5, PMID: 20182131 Banks GC, Field JG, Oswald FL, O’Boyle EH, Landis RS, Rupp DE, Rogelberg SG. 2019. Answers to 18 questions about open science practices. Journal of Business and Psychology 34:257–270. DOI: https://doi. org/10.1007/s10869-018-9547-8 Banobi JA, Branch TA, Hilborn R. 2011. Do rebuttals affect future science? Ecosphere 2:art37. DOI: https:// doi.org/10.1890/ES10-00142.1 Bar-Ilan J, Halevi G. 2017. Post retraction citations in context: A case study. Scientometrics 113:547–565. Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 DOI: https://doi.org/10.1007/s11192-017-2242-0, PMID: 29056790 Barnett AG, Moher D. 2019. Turning the tables: A university league-table based on quality not quantity [version 1; peer review: 1 approved]. F1000Research. Benedictus R, Miedema F, Ferguson MW. 2016. Fewer numbers, better science. Nature 538:453–455. DOI: https://doi.org/10.1038/538453a, PMID: 2778621 9 Border R, Johnson EC, Evans LM, Smolen A, Berley N, Sullivan PF, Keller MC. 2019. No support for historical candidate gene or candidate gene-by-interaction hypotheses for major depression across multiple large samples. American Journal of Psychiatry 176:376–387. DOI: https://doi.org/10.1176/appi.ajp.2018.18070881, PMID: 30845820 Callier V. 2019. The open data explosion. The Scientist. https://www.the-scientist.com/careers/theopen-data-explosion-65248 [Accessed July 18, 2019]. Chalmers I, Bracken MB, Djulbegovic B, Garattini S, Grant J, Gülmezoglu AM, Howells DW, Ioannidis JPA, Oliver S. 2014. How to increase value and reduce waste when research priorities are set. The Lancet 383: 156–165. DOI: https://doi.org/10.1016/S0140-6736(13) 62229-1 Chalmers I, Glasziou P. 2009. Avoidable waste in the production and reporting of research evidence. The Lancet 374:86–89. DOI: https://doi.org/10.1016/ S0140-6736(09)60329-9 Chan A-W, Song F, Vickers A, Jefferson T, Dickersin K, Gøtzsche PC, Krumholz HM, Ghersi D, van der Worp HB. 2014. Increasing value and reducing waste: Addressing inaccessible research. The Lancet 383:257– 266. DOI: https://doi.org/10.1016/S0140-6736(13) 62296-5 Couzin-Frankel J. 2013. Complete. Repeat? Initiative gets $1.3 million to try to replicate cancer studies. Science. https://www.sciencemag.org/news/2013/10/ complete-repeat-initiative-gets-13-million-try-replicatecancer-studiesJuly 18, 2019]. Dawson NJ. 1987. Ensuring scientific integrity. Nature 327:550. DOI: https://doi.org/10.1038/327550a0 Dirnagl U, Przesdzing I, Kurreck C, Major S. 2016. A laboratory critical incident and error reporting system for experimental biomedicine. PLOS Biology 14: e2000705. DOI: https://doi.org/10.1371/journal.pbio. 2000705 Drew A. 2019. APS replication initiative under way. Observer. Vol 26: Association for Psychological Science 2013. https://www.psychologicalscience.org/ observer/aps-replication-initiative-underway [Accessed July 18, 2019]. Enserink M. 2017. Sloppy reporting on animal studies proves hard to change. Science 357:1337–1338. DOI: https://doi.org/10.1126/science.357.6358.1337, PMID: 28963232 Fanelli D. 2009. How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLOS ONE 4:e5738. DOI: https://doi. org/10.1371/journal.pone.0005738, PMID: 19478950 Fanelli D. 2016. Set up a ’self-retraction’ system for honest errors. Nature 531:415. DOI: https://doi.org/ 10.1038/531415a, PMID: 27008933 Fanelli D, Ioannidis JPA, Goodman S. 2018. Improving the integrity of published science: An expanded taxonomy of retractions and corrections. European 8 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research Journal of Clinical Investigation 48:e12898. DOI: https://doi.org/10.1111/eci.12898 Fang FC, Steen RG, Casadevall A. 2012. Misconduct accounts for the majority of retracted scientific publications. PNAS 109:17028–17033. DOI: https:// doi.org/10.1073/pnas.1212247109, PMID: 23027971 Gazni A, Sugimoto CR, Didegah F. 2012. Mapping world scientific collaboration: Authors, institutions, and countries. Journal of the American Society for Information Science and Technology 63:323–335. DOI: https://doi.org/10.1002/asi.21688 Glasziou P, Altman DG, Bossuyt P, Boutron I, Clarke M, Julious S, Michie S, Moher D, Wager E. 2014. Reducing waste from incomplete or unusable reports of biomedical research. The Lancet 383:267–276. DOI: https://doi.org/10.1016/S0140-6736(13)62228-X Glick JL. 1989. On the cost effectiveness of data auditing. In: Shamoo A. E (Ed). Principles of Research Data Audit. Taylor & Francis. Godfrey MW, German DM. 2008. The past, present and future of software evolution. 2008 Frontiers of Software Maintenance. DOI: https://doi.org/10.1109/ fosm.2008.4659256 Grieneisen ML, Zhang M. 2012. A comprehensive survey of retracted articles from the scholarly literature. PLOS ONE 7:e44118. DOI: https://doi.org/ 10.1371/journal.pone.0044118, PMID: 23115617 Hair K, Macleod M, Sena E, IICARus Collaboration. 2018. A randomised controlled trial of an intervention to improve compliance with the ARRIVE guidelines (IICARus). bioRxiv. DOI: https://doi.org/10.1101/ 370874 Han S, Olonisakin TF, Pribis JP, Zupetic J, Yoon JH, Holleran KM, Jeong K, Shaikh N, Rubio DM, Lee JS. 2017. A checklist is associated with increased quality of reporting preclinical biomedical research: A systematic review. PLOS ONE 12:e0183591. DOI: https://doi.org/10.1371/journal.pone.0183591, PMID: 28902887 Hardin R. 2002. Trust and Trustworthiness. New York: Russell Sage Foundation. Harris RF. 2017. Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions. New York: Basic Books. He X, Zhang J. 2009. On the growth of scientific knowledge: Yeast biology as a case study. PLOS Computational Biology 5:e1000320. DOI: https://doi. org/10.1371/journal.pcbi.1000320, PMID: 19300476 Head ML, Holman L, Lanfear R, Kahn AT, Jennions MD. 2015. The extent and consequences of p-hacking in science. PLOS Biology 13:e1002106. DOI: https:// doi.org/10.1371/journal.pbio.1002106, PMID: 2576 8323 Hines WC, Su Y, Kuhn I, Polyak K, Bissell MJ. 2014. Sorting out the FACS: A devil in the details. Cell Reports 6:779–781. DOI: https://doi.org/10.1016/j. celrep.2014.02.021, PMID: 24630040 Hudson P. 2003. Applying the lessons of high risk industries to health care. Quality and Safety in Health Care 12:7i–12. DOI: https://doi.org/10.1136/qhc.12. suppl_1.i7 Institute for Laboratory Animal Research Roundtable on Science and Welfare in Laboratory Animal Use. 2015. Reproducibility issues in research with animals and animal models workshop in brief October 2015. https://www.nap.edu/read/21835/ #slide1 [Accessed July 18, 2019]. Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 Institute of Medicine. 2013. Sharing Clinical Research Data:Workshop Summary. Washington, DC: Institute of Medicine. Ioannidis JP. 2005. Why most published research findings are false. PLOS Medicine 2:e124. DOI: https:// doi.org/10.1371/journal.pmed.0020124, PMID: 16060722 Ioannidis JP. 2014. How to make more published research true. PLOS Medicine 11:e1001747. DOI: https://doi.org/10.1371/journal.pmed.1001747, PMID: 25334033 Ioannidis JPA, Greenland S, Hlatky MA, Khoury MJ, Macleod MR, Moher D, Schulz KF, Tibshirani R. 2014. Increasing value and reducing waste in research design, conduct, and analysis. The Lancet 383:166– 175. DOI: https://doi.org/10.1016/S0140-6736(13) 62227-8 Jasny BR, Chin G, Chong L, Vignieri S. 2011. Again, and again, and again ... Science 334:1225. DOI: https://doi.org/10.1126/science.334.6060.1225 John LK, Loewenstein G, Prelec D. 2012. Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science 23: 524–532. DOI: https://doi.org/10.1177/ 0956797611430953, PMID: 22508865 Judson HF. 2004. The Great Betrayal: Fraud in Science. Orlando: Harcourt, Inc. Kaiser J. 2018. Plan to replicate 50 high-impact cancer papers shrinks to just 18. Science. http://www. sciencemag.org/news/2018/07/plan-replicate-50-highimpact-cancer-papers-shrinks-just-18 [Accessed August 6, 2018]. Kimmelman J, Mogil JS, Dirnagl U. 2014. Distinguishing between exploratory and confirmatory preclinical research will improve translation. PLOS Biology 12:e1001863. DOI: https://doi.org/10.1371/ journal.pbio.1001863, PMID: 24844265 Lesch KP, Bengel D, Heils A, Sabol SZ, Greenberg BD, Petri S, Benjamin J, Müller CR, Hamer DH, Murphy DL. 1996. Association of anxiety-related traits with a polymorphism in the serotonin transporter gene regulatory region. Science 274:1527–1531. DOI: https://doi.org/10.1126/science.274.5292.1527, PMID: 8929413 Lund H, Brunnhuber K, Juhl C, Robinson K, Leenaars M, Dorch BF, Jamtvedt G, Nortvedt MW, Christensen R, Chalmers I. 2016. Towards evidence based research. BMJ 355:i5440. DOI: https://doi.org/10.1136/bmj. i5440, PMID: 27797786 Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JP, Al-Shahi Salman R, Chan AW, Glasziou P. 2014. Biomedical research: Increasing value, reducing waste. The Lancet 383:101–104. DOI: https:// doi.org/10.1016/S0140-6736(13)62329-6, PMID: 24411643 Macleod MR. 2014. Preclinical research: Design animal studies better. Nature 510:35. DOI: https://doi.org/10. 1038/510035a, PMID: 24899295 Macleod MR, Lawson McLean A, Kyriakopoulou A, Serghiou S, de Wilde A, Sherratt N, Hirst T, Hemblade R, Bahor Z, Nunes-Fonseca C, Potluru A, Thomson A, Baginskitae J, Egan K, Vesterinen H, Currie GL, Churilov L, Howells DW, Sena ES. 2015. Risk of bias in reports of in vivo research: A focus for improvement. PLOS Biology 13:e1002273. DOI: https://doi.org/10. 1371/journal.pbio.1002273 9 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research Martinson BC, Anderson MS, de Vries R. 2005. Scientists behaving badly. Nature 435:737–738. DOI: https://doi.org/10.1038/435737a, PMID: 15 944677 McKiernan EC. 2019. Use of the journal impact factor in academic review, promotion, and tenure evaluations. PeerJ Preprints 7:e27638. Michalek AM, Hutson AD, Wicher CP, Trump DL. 2010. The costs and underappreciated consequences of research misconduct: a case study. PLOS Medicine 7:e1000318. DOI: https://doi.org/10.1371/journal. pmed.1000318, PMID: 20808955 Minnerup J, Zentsch V, Schmidt A, Fisher M, Schäbitz WR. 2016. Methodological quality of experimental stroke studies published in the stroke journal: Time trends and effect of the basic science checklist. Stroke 47:267–272. DOI: https://doi.org/10.1161/ STROKEAHA.115.011695, PMID: 26658439 Moher D, Glasziou P, Chalmers I, Nasser M, Bossuyt PMM, Korevaar DA, Graham ID, Ravaud P, Boutron I. 2016. Increasing value and reducing waste in biomedical research: who’s listening? The Lancet 387: 1573–1586. DOI: https://doi.org/10.1016/S0140-6736 (15)00307-4 Moher D, Naudet F, Cristea IA, Miedema F, Ioannidis JPA, Goodman SN. 2018. Assessing scientists for hiring, promotion, and tenure. PLOS Biology 16: e2004089. DOI: https://doi.org/10.1371/journal.pbio. 2004089, PMID: 29596415 Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, Simonsohn U, Wagenmakers E-J, Ware JJ, Ioannidis JPA. 2017. A manifesto for reproducible science. Nature Human Behaviour 1:0021. DOI: https://doi.org/10.1038/ s41562-016-0021 Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, Buck S, Chambers CD, Chin G, Christensen G, Contestabile M, Dafoe A, Eich E, Freese J, Glennerster R, Goroff D, Green DP, Hesse B, Humphreys M, Ishiyama J, et al. 2015. Promoting an open research culture. Science 348:1422–1425. DOI: https://doi.org/10.1126/science.aab2374 Office of Research Integrity. 2015. Historical background. https://ori.hhs.gov/historical-background [Accessed July 8, 2015]. Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science 349: aac4716. DOI: https://doi.org/10.1126/science. aac4716, PMID: 26315443 Peers IS, South MC, Ceuppens PR, Bright JD, Pilling E. 2014. Can you trust your animal study data? Nature Reviews Drug Discovery 13:560. DOI: https://doi.org/ 10.1038/nrd4090-c1, PMID: 24903777 Peng RD. 2011. Reproducible research in computational science. Science 334:1226–1227. DOI: https://doi.org/10.1126/science.1213847, PMID: 22144613 Presidential Commission for the Study of Bioethical Issues. 2011. "Ethically Impossible" STD Research in Guatemala from 1946 to1948. https://bioethicsarchive. georgetown.edu/pcsbi/sites/default/files/Ethically% 20Impossible%20(with%20linked%20historical% 20documents)%202.7.13.pdfJuly 18, 2019]. Robinson KA, Goodman SN. 2011. A systematic examination of the citation of prior research in reports of randomized, controlled trials. Annals of Internal Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 Medicine 154:50–55. DOI: https://doi.org/10.7326/ 0003-4819-154-1-201101040-00007, PMID: 21200038 Salman RA-S, Beller E, Kagan J, Hemminki E, Phillips RS, Savulescu J, Macleod M, Wisely J, Chalmers I. 2014. Increasing value and reducing waste in biomedical research regulation and management. The Lancet 383:176–185. DOI: https://doi.org/10.1016/ S0140-6736(13)62297-7 Sena ES, van der Worp HB, Bath PM, Howells DW, Macleod MR. 2010. Publication bias in reports of animal stroke studies leads to major overstatement of efficacy. PLOS Biology 8:e1000344. DOI: https://doi. org/10.1371/journal.pbio.1000344, PMID: 20361022 Shamoo AE. 2013. Data audit as a way to prevent/ contain misconduct. Accountability in Research 20: 369–379. DOI: https://doi.org/10.1080/08989621. 2013.822259, PMID: 24028483 Simmons JP, Nelson LD, Simonsohn U. 2011. Falsepositive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22:1359–1366. DOI: https://doi.org/10.1177/0956797611417632, PMID: 22006061 Smith R. 2014. Why scientists should be held to a higher standard of honesty than the average person. The BMJ. https://blogs.bmj.com/bmj/2014/09/02/ richard-smith-why-scientists-should-be-held-to-ahigher-standard-of-honesty-than-the-average-person/ [Accessed July 25, 2019]. Steen RG. 2011. Retractions in the medical literature: How many patients are put at risk by flawed research? Journal of Medical Ethics 37:688–692. DOI: https:// doi.org/10.1136/jme.2011.043133, PMID: 21586404 Steen RG, Casadevall A, Fang FC. 2013. Why has the number of scientific retractions increased? PLOS ONE 8:e68397. DOI: https://doi.org/10.1371/journal.pone. 0068397, PMID: 23861902 Stodden V, Seiler J, Ma Z. 2018. An empirical analysis of journal policy effectiveness for computational reproducibility. PNAS 115:2584–2589. DOI: https:// doi.org/10.1073/pnas.1708290115, PMID: 29531050 The NPQIP Collaborative group. 2019. Did a change in Nature journals’ editorial policy for life sciences research improve reporting? BMJ Open Science 3: e000035. DOI: https://doi.org/10.1136/bmjos-2017000035 Tsilidis KK, Panagiotou OA, Sena ES, Aretouli E, Evangelou E, Howells DW, Al-Shahi Salman R, Macleod MR, Ioannidis JP. 2013. Evaluation of excess significance bias in animal studies of neurological diseases. PLOS Biology 11:e1001609. DOI: https://doi. org/10.1371/journal.pbio.1001609, PMID: 23874156 Twaij H, Oussedik S, Hoffmeyer P. 2014. Peer review. The Bone & Joint Journal 96-B:436–441. DOI: https:// doi.org/10.1302/0301-620X.96B4.33041, PMID: 246 92607 Ware JJ, Munafò MR. 2015. Significance chasing in research practice: Causes, consequences and possible solutions. Addiction 110:4–8. DOI: https://doi.org/10. 1111/add.12673, PMID: 25040652 Williams HL. 2010. Intellectual property rights and innovation: Evidence from the human genome. Journal of Political Economy 121:1–27. DOI: https://doi.org/ 10.1086/669706, PMID: 24639594 Wuchty S, Jones BF, Uzzi B. 2007. The increasing dominance of teams in production of knowledge. 10 of 11 Feature Article Point of View Four erroneous beliefs thwarting more trustworthy research Science 316:1036–1039. DOI: https://doi.org/10.1126/ science.1136099, PMID: 17431139 Yarborough M, Fryer-Edwards K, Geller G, Sharp RR. 2009. Transforming the culture of biomedical research from compliance to trustworthiness: Insights from nonmedical sectors. Academic Medicine 84:472–477. DOI: https://doi.org/10.1097/ACM. 0b013e31819a8aa6, PMID: 19318781 Yarborough M. 2014a. Taking steps to increase the trustworthiness of scientific research. The FASEB Journal 28:3841–3846. DOI: https://doi.org/10.1096/fj. 13-246603, PMID: 24928193 Yarborough et al. eLife 2019;8:e45261. DOI: https://doi.org/10.7554/eLife.45261 Yarborough M. 2014b. Openness in science is key to keeping public trust. Nature 515:313. DOI: https://doi. org/10.1038/515313a, PMID: 25409791 Yong E. 2019. A waste of 1,000 research papers. The Atlantic. https://www.theatlantic.com/science/archive/ 2019/05/waste-1000-studies/589684/ [Accessed July 18, 2019]. Zimmer C. 2011. It’s science, but not necessarily right. International Herald Tribune. https://carlzimmer.com/ its-science-but-not-necessarily-right-293/ [Accessed August 7, 2019]. 11 of 11

Log In

Author response

Related papers

Related papers

Related topics