VIGOROUS debates have erupted across science in recent years. Climate research, psychology findings, medical trials—all have faced questions about whether they’re truly reliable. Now a team of researchers has proposed a framework that could help separate genuinely trustworthy research from the merely well-publicised.
The new system, published today in the Proceedings of the National Academy of Sciences, tackles a surprisingly tricky question. As Brian Nosek at the Center for Open Science and his colleagues put it: “The question ‘what makes research findings trustworthy?’ elicits different answers depending on whether the emphasis is on research integrity and ethics, research methods, transparency, inclusion, assessment and peer review, or scholarly communication.” Each perspective offers insights, they say, but none captures the whole picture.
The framework breaks trustworthiness down into seven distinct components. Research must be accountable, evaluable (can others actually examine it?), and well-formulated; it must control bias and reduce error; and it must have been evaluated by others. Crucially, the claims need to match the evidence. That last one seems obvious, yet it’s surprisingly easy to overstate findings or downplay uncertainties in the final leap from data to conclusion.
Perhaps the most important insight: trustworthiness isn’t the same as correctness. A study can be wrong and still trustworthy, provided it’s conducted and reported in ways that make the errors detectable. That, besides eventually being right, is what matters for scientific progress. As the team notes, “trustworthy research findings are those that contribute productively to scholarly dialogue about evidence and claims.” Because such findings are produced in ways that make errors detectable and correction possible over time, “this ability to detect and correct errors is what allows scientific knowledge to progress.”
The framework assesses trustworthiness at three levels: the research itself, the researchers conducting it, and the organisations supporting it. For instance, whether data is shared publicly affects evaluability at the research level. Whether researchers disclose conflicts of interest contributes to accountability at the researcher level. And whether universities reward rigorous practices rather than just publication counts influences trustworthiness at the organisational level.
Nosek and colleagues—including Kathleen Hall Jamieson at the University of Pennsylvania and Marcia McNutt at the National Academies—argue that current approaches rely too heavily on proxies. Journal prestige, citation counts, even peer review itself can’t fully capture whether research is trustworthy. “The rigor of peer review varies substantially from journal to journal,” they write. When “published in a peer-reviewed journal” becomes a proxy for quality, it creates perverse incentives; predatory journals publish with minimal oversight, paper mills sell authorship positions. High publication counts can offer “the veneer of trustworthiness replacing the need to conduct genuinely trustworthy research.”
The framework prioritises measurable, behavioural indicators instead. Did researchers pre-register their analysis plan? Is the sample size adequate to detect the effect they’re studying? Are conflicts disclosed? These direct indicators, whilst harder to assess than journal names, actually tell you something meaningful about the work.
Some indicators are straightforward to measure—sample size, statistical power, reliability estimates. Others need innovation. How do you assess whether claims are “well-calibrated” with evidence? Whether research properly considers alternative perspectives? The authors acknowledge that translating principles into valid, scalable indicators remains an ongoing challenge.
Not every indicator will suit every field. The framework aims for broad applicability across research domains, from particle physics to ethnography, but the specific indicators will differ. Adequate sample size means something different in neuroscience than in anthropology, for instance, and transparency looks different for sensitive medical data than for astronomy catalogues.
The team’s goal is to stimulate better indicators whilst providing common language for researchers, institutions, funders, and journals grappling with research quality. They also hope clearer indicators might help journalists, policymakers, and the public assess research more accurately. Though perhaps that’s optimistic—it’s one thing to develop sophisticated frameworks, another entirely to get them widely adopted.
Still, the work represents a serious attempt to move beyond reputation-based proxies. As Nosek and colleagues write: “Knowledge production is a hard, slow process. It is even harder and slower when research findings are not trustworthy.” Getting trustworthiness right, then, isn’t just about catching bad actors. It’s about making science itself work better.
Study link: https://www.pnas.org/doi/10.1073/pnas.2536736123
