Nature reports that Harold Garner of the University of Texas Southwestern Medical Center in Dallas has been scouring the medical literature using an automated text-matching software package to catch plagiarized articles.
A surprising number have turned up: 181 papers have been classified as duplicates, sharing an average of 85% of their text with a previous paper. One quarter of these are nearly 100% identical to a previous publication.
While it is troubling that anybody would be so brazen, the fact that they have gotten away with it so far says something: there are a lot of journals, and a lot of papers. For a plagiarist to succeed, neither the editor nor any of the referees can have read the original article, even though referees are typically chosen because they are experts in the field the article addresses.
That, I think, is the big news: that it is possible to plagiarize so blatantly.
Incidentally, the Nature news brief suggests that the confirmed plagiarism is usually carried out in obscure journals. This means that the plagiarists are gaining relatively little for their effort, and the original authors are losing little.
Garner’s project has apparently identified 75,000 abstracts that seem highly similar. It’s hard to tell what that means, so we’ll have to wait for the full report.
An abstract is about 200 words long. PsycINFO currently lists 10,098 abstracts that contain the phrase “working memory.” One would assume that, even if all of them describe independent work, many are highly similar just by chance. So I hope to find out more about how “highly similar” is being operationalized in this project.
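To make the question concrete, here is one common way a similarity score between two abstracts could be computed. I do not know how Garner's software actually scores similarity, so this is only a minimal sketch under my own assumptions: it breaks each text into overlapping word sequences ("shingles") and reports the Jaccard overlap between the two sets. The function names and the default shingle length `k=3` are my own choices for illustration, not anything taken from the project.

```python
import re


def shingles(text, k=3):
    """Return the set of overlapping k-word sequences found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a, b, k=3):
    """Share of k-word shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Two made-up abstract fragments on the same topic (illustrative only).
first = "working memory capacity predicts reading comprehension"
second = "working memory capacity predicts fluid intelligence"
score = jaccard_similarity(first, second, k=2)
```

With a measure like this, everything hinges on the shingle length and the cutoff. Short shingles and a low threshold would flag many independent papers that merely share stock phrases, while long shingles and a high threshold would catch only near-verbatim copies.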
While I suspect that plagiarism is not a huge problem, I still think it is fantastic that people are attacking it with these modern tools. I think we will be seeing a lot more of this type of work. (Come to think of it, a professor I had in 2002 used an automated plagiarism-catching program to screen student homework, so this has been around for a while.)