A new study by MIT researchers suggests an alternative approach: Crowdsourced accuracy judgments from groups of normal readers can be virtually as effective as the work of professional fact-checkers.
“One problem with fact-checking is that there is just way too much content for professional fact-checkers to be able to cover, especially within a reasonable time frame,” says Jennifer Allen, a PhD student at the MIT Sloan School of Management and co-author of a newly published paper detailing the study.
But the current study, examining over 200 news stories that Facebook’s algorithms had flagged for further scrutiny, may have found a way to address that problem, by using relatively small, politically balanced groups of lay readers to evaluate the headlines and lead sentences of news stories.
“We found it to be encouraging,” says Allen. “The average rating of a crowd of 10 to 15 people correlated as well with the fact-checkers’ judgments as the fact-checkers correlated with each other. This helps with the scalability problem because these raters were regular people without fact-checking training, and they just read the headlines and lead sentences without spending the time to do any research.”
That means the crowdsourcing method could be deployed widely — and cheaply. The study estimates that the cost of having readers evaluate news this way is about $0.90 per story.
“There’s no one thing that solves the problem of false news online,” says David Rand, a professor at MIT Sloan and senior co-author of the study. “But we’re working to add promising approaches to the anti-misinformation tool kit.”
Intriguingly, when the regular readers recruited for the study were sorted into groups with the same number of Democrats and Republicans, their average ratings were highly correlated with the professional fact-checkers’ ratings. With at least a double-digit number of readers involved, the crowd’s ratings correlated as strongly with the fact-checkers’ ratings as the fact-checkers’ ratings did with each other.
“These readers weren’t trained in fact-checking, and they were only reading the headlines and lead sentences, and even so they were able to match the performance of the fact-checkers,” Allen says.
While it might seem initially surprising that a crowd of 12 to 20 readers could match the performance of professional fact-checkers, this is another example of a classic phenomenon: the wisdom of crowds. Across a wide range of applications, groups of laypeople have been found to match or exceed the performance of experts. The current study shows this can occur even in the highly polarizing context of misinformation identification.
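For readers who want to see the averaging effect in action, here is a minimal Python sketch of the idea, not the researchers' actual analysis. Everything in it is an assumption made for illustration: the 1-to-7 rating scale, the noise levels, the partisan offsets, the pool size, and the function names are all hypothetical, and the ratings are synthetic rather than data from the study. It compares how well a single reader and a politically balanced crowd each correlate with a simulated fact-checker.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def balanced_crowd_rating(dem_ratings, rep_ratings, per_party, rng):
    """Average the ratings of a crowd with equal numbers from each party."""
    crowd = rng.sample(dem_ratings, per_party) + rng.sample(rep_ratings, per_party)
    return statistics.fmean(crowd)


def simulate(n_stories=200, pool=50, per_party=6, seed=0):
    """Return (single-reader correlation, balanced-crowd correlation)."""
    rng = random.Random(seed)
    # Synthetic underlying accuracy per story on an assumed 1-7 scale.
    truth = [rng.uniform(1, 7) for _ in range(n_stories)]
    # Hypothetical fact-checker: underlying accuracy plus modest noise.
    checker = [t + rng.gauss(0, 1.0) for t in truth]

    single_reader, crowd_means = [], []
    for t in truth:
        # Hypothetical lay readers: noisy, with opposite partisan offsets.
        dems = [t + 0.5 + rng.gauss(0, 2.5) for _ in range(pool)]
        reps = [t - 0.5 + rng.gauss(0, 2.5) for _ in range(pool)]
        single_reader.append(dems[0])
        crowd_means.append(balanced_crowd_rating(dems, reps, per_party, rng))

    return pearson(single_reader, checker), pearson(crowd_means, checker)


if __name__ == "__main__":
    r_single, r_crowd = simulate()
    print(f"single reader vs. fact-checker:  r = {r_single:.2f}")
    print(f"balanced crowd vs. fact-checker: r = {r_crowd:.2f}")
```

Under these assumptions, averaging a dozen noisy ratings cancels much of the individual error, so the crowd's mean tracks the simulated fact-checker more closely than any one reader does. The sketch shows only why averaging helps; it says nothing about the correlations the study actually measured.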
The experiment’s participants also took a political knowledge test and a test of their tendency to think analytically. Overall, the ratings of people who were better informed about civic issues and engaged in more analytical thinking were more closely aligned with the fact-checkers.
“People that engaged in more reasoning and were more knowledgeable agreed more with the fact-checkers,” Rand says. “And that was true regardless of whether they were Democrats or Republicans.”
The scholars say the finding could be applied in many ways — and note that some social media behemoths are actively trying to make crowdsourcing work. Facebook has a program, called Community Review, where laypeople are hired to assess news content; Twitter has its own project, Birdwatch, soliciting reader input about the veracity of tweets. The wisdom of crowds can be used either to help apply public-facing labels to content, or to inform ranking algorithms and what content people are shown in the first place.
To be sure, the authors note, any organization using crowdsourcing needs to find a good mechanism for participation by readers. If participation is open to everyone, it is possible the crowdsourcing process could be unfairly influenced by partisans.
“We haven’t yet tested this in an environment where anyone can opt in,” Allen notes. “Platforms shouldn’t necessarily expect that other crowdsourcing strategies would produce equally positive results.”
On the other hand, Rand says, news and social media organizations would have to find ways to get large enough groups of people actively evaluating news items in order to make the crowdsourcing work.
“Most people don’t care about politics and care enough to try to influence things,” Rand says. “But the concern is that if you let people rate any content they want, then the only people doing it will be the ones who want to game the system. Still, to me, a bigger concern than being swamped by zealots is the problem that no one would do it. It is a classic public goods problem: Society at large benefits from people identifying misinformation, but why should users bother to invest the time and effort to give ratings?”
The study was supported, in part, by the William and Flora Hewlett Foundation, the John Templeton Foundation, and the Reset project of Omidyar Group’s Luminate Project Limited. Allen is a former Facebook employee who still has a financial interest in Facebook; other studies by Rand are supported, in part, by Google.