Research findings posted online as preprints — studies made public before undergoing the review and approval of a panel of peer scientists required by most scholarly journals — often hold up quite well to that scrutiny, according to a new report on COVID-19 studies.
While preprint manuscripts have become popular in many scientific fields since physicists made their arXiv (pronounced “archive”) repository accessible online in 1991, the COVID pandemic pushed new groups of researchers into the habit of posting and consulting fresh experimental results and analyses ahead of peer review.
“Preprints have been broadly accepted in the social sciences, computer sciences, mathematics for quite a long time,” says B. Ian Hutchins, a professor in the University of Wisconsin–Madison’s Information School and leader of the new study of preprints, published today in The Lancet Global Health. “Biomedical research has been more cautious, I think, precisely because people use that information for making health-altering decisions.”
The appearance and speedy global spread of a new virus — as well as the quick response by scientists around the world — forced many to reconsider that caution, weighing it against the cost of a typical delay of many months or longer for newly completed studies to clear the hurdles of a careful journal peer review.
A group of journal publishers decided during the pandemic to require preprint availability of COVID-19-related manuscripts submitted for their consideration, according to Hutchins. Because his own work focused on COVID-19 studies, it too had to be made available as a (deeply meta) preprint.
The UW–Madison researchers chose at random 100 COVID-19 studies that had been posted as preprints and then subjected to peer review and successfully published by journals. They examined how peer review affected 1,606 data points in the manuscripts, representing four types of data common to the COVID study genre: the closely related infection fatality rates and case fatality rates, basic viral reproduction rates (how many people an infected person is expected to infect) and disease incidence (the number of new people infected in a given time period).
“That was a strength of using infectious-disease research for this study,” Hutchins says. “Because when you talk about case fatality rate, there’s an agreed-upon definition of what that is, broadly speaking, and so we could make better comparisons of that data across different labs.”
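As a rough illustration of why shared definitions make such comparisons possible, the sketch below computes three of those measures from the same kinds of counts. It uses the standard textbook definitions, not code from the study; the function names and every number are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Illustrative sketch only: standard textbook definitions, hypothetical numbers.
# Nothing here is taken from the study's data or methods.

def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Share of confirmed (diagnosed) cases that end in death."""
    return deaths / confirmed_cases


def infection_fatality_rate(deaths: int, total_infections: int) -> float:
    """Share of all infections, detected or not, that end in death."""
    return deaths / total_infections


def incidence_per_100k(new_infections: int, population: int) -> float:
    """New infections in a given time period per 100,000 people."""
    return new_infections / population * 100_000


# Hypothetical outbreak: many infections are never confirmed by a test,
# so the same death count gives a higher CFR than IFR.
cfr = case_fatality_rate(deaths=50, confirmed_cases=2_000)
ifr = infection_fatality_rate(deaths=50, total_infections=10_000)
weekly_incidence = incidence_per_100k(new_infections=700, population=1_000_000)

print(f"Case fatality rate:      {cfr:.2%}")
print(f"Infection fatality rate: {ifr:.2%}")
print(f"Weekly incidence:        {weekly_incidence:.1f} per 100,000")
```

Because two labs that report a case fatality rate are dividing the same kinds of quantities, their numbers can be lined up directly, which is the point Hutchins makes about comparing data across labs.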
When the researchers compared preprint manuscripts to the eventual published versions of the individual studies, about 90% of those 1,606 data points were still in the text after peer review. More than 170 were edited out and more than 300 new data points were added across the 100-study sample.
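The "about 90%" figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes exactly 170 removed data points, although the article says only "more than 170," so the result is an upper bound on the retained share rather than the study's exact figure. The function name is hypothetical.

```python
# Rough check of the retained share, using the article's rounded counts.
# Assumption: exactly 170 data points removed (the article says "more than 170"),
# so the true retained share is slightly lower than what this prints.

def retained_share(original_points: int, removed_points: int) -> float:
    """Fraction of preprint data points still present after peer review."""
    return (original_points - removed_points) / original_points


share = retained_share(original_points=1606, removed_points=170)
print(f"Retained after peer review: at most {share:.1%}")
```

With these inputs, 1,436 of 1,606 data points survive, a little over 89%, which is consistent with the rounded "about 90%" reported.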
And while the researchers found that the confidence intervals associated with estimates (“that’s like the margins of error you hear about in polling,” Hutchins says) had tightened about 7% after peer review, changes in the actual estimates were minor and not statistically significant.
“Wild swings between preprint and published versions would be hard to explain,” Hutchins says. “But that’s not what we see. There’s not a whole lot of change in the data reported and the estimates based on that data.”
Quantifying the differences typically seen after studies cross the peer-review finish line can help consumers of the freshest science consider how much weight they give preprint results as they report on discoveries or issue public health guidance.
“Journalists and policymakers should look at the fact that 90% of the data points make it through peer review, should get a sense for how much they usually change, and ask themselves, am I comfortable accepting that degree of change?” Hutchins says. “The answer to that may depend based on the stakes of the decision. If all you’re worried about is your reputation, you might be open to a different amount of risk than if you’re making life-or-death decisions.”
The National Institutes of Health has promoted preprint manuscripts as a way to accelerate the pace of scientific discovery, according to Hutchins, who developed iCite, a curated search tool for COVID-19 research, while working at NIH.
Hutchins co-authored the new study with statistician Honghan Ye, who completed his doctorate at UW–Madison in 2021, and several UW–Madison undergraduate students. He hopes to expand his preprint studies to cover a broader range of scientific fields and to examine how preprint quality has changed over time.