To Catch a Thief: Tools and Tips to Combat Digital Content Plagiarism

Page 2 of 3


The Academics of Plagiarism
Every student knows that turning in plagiarized work is worse than turning in no work at all—if you get caught. But getting caught isn’t certain. About 80% of college students admitted to plagiarizing content at least once, according to The Center for Academic Integrity.

“The problem of plagiarism seems to take root in the academic field,” said Alena Siameshka, head of the marketing department for SearchInform Technologies Inc., which released the plagiarism search tool PlagiatInform in June 2007. “Unlike a few decades back, today it’s next to impossible for teachers to read all the literature and publications on a particular discipline. The internet has granted access to rare sources, libraries, and journals, enabling students to get information from the sources their instructors might not be familiar with.”

As a result, academic institutions from grade schools on up have been looking for ways to stem the tide by arming professors and teachers with digital tools. PlagiatInform allows professors to cross-check student papers against the university’s submission database and returns an estimate of what percent of the work has been lifted from another source. If the estimate is low, the paper is divided into paragraphs for comparison; if the estimate is high, the search highlights any suspicious text and links to possible sources line by line.
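
To make the idea of a "percent lifted" estimate concrete, here is a minimal sketch of one common way to compute such a figure: compare overlapping runs of words (often called shingles) in a submission against a pool of known texts. This is not PlagiatInform's actual algorithm, which the vendor does not describe here. The function names, the five-word window, and the in-memory list of sources are all assumptions made for illustration.

```python
import re


def shingles(text, n=5):
    """Return the set of overlapping n-word sequences found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def percent_lifted(submission, sources, n=5):
    """Percentage of the submission's n-word shingles that appear in any source.

    Hypothetical helper: 'sources' stands in for a database of earlier papers.
    """
    submitted = shingles(submission, n)
    if not submitted:
        return 0.0  # too short to compare
    known = set()
    for source in sources:
        known |= shingles(source, n)
    return 100.0 * len(submitted & known) / len(submitted)
```

A score like this only measures overlap. As with the commercial reports described in this article, deciding whether the overlap amounts to plagiarism is left to a human reader.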

PlagiatInform’s design takes into account the massive body of source material from which cheaters draw. As papers are turned in, they’re added to the master database of student works, while bots automatically mine the internet and add relevant academic papers and sites. As the database expands, the software lets storage and search capacity grow with it, without manual upgrades or overhauls. And as more universities get on board with PlagiatInform, Siameshka said, those schools would be able to network with one another and search each other’s paper databases.

Another option is Turnitin, iParadigms’ anti-plagiarism tool and digital assessment suite, used by more than 7,500 schools so far. It operates either as a standalone application or as an integrated part of the school’s content management and communications systems. When a student submits a paper, it’s checked against three sources: a cache of more than 12 billion internet pages collected and stored by Turnitin’s specially designed spider; a database of 40 million student papers; and licensed third-party content such as newspapers, magazines, and books from educational information providers including Thomson Gale.

With that much content to cross-reference, similar sentence patterns are bound to turn up even on wholly original student work. To avoid career-ruining accusations, Turnitin offers Originality Reports and shows what percent of the flagged copy is suspicious, much like PlagiatInform’s percent-based results. “The report in no way states that a submission is or is not plagiarized,” pointed out Melissa Lipscomb, COO of iParadigms. “Rather, we create an unambiguous, objective report that can be used by our client to make the final decision on whether a paper has been cut and pasted.”

Still, some institutions feel that these tools undermine the trust they show in their students, and several of the most prestigious universities in the country, including Harvard, Yale, and Princeton, have no institution-wide tool to catch plagiarists. If a professor has a hunch, he or she just has to follow it up the “old-fashioned” way: through Google or another search engine.

The Businesses of Plagiarism
Employee plagiarism can tarnish the integrity of any company, whether it’s a front-page story in The New York Times or an ad slogan lifted by a lazy copywriter. On the other end, companies whose protected content is being ripped off lose out on profit from that content.

“The financial liability is tremendous to the infringing business. Regarding businesses whose content is stolen, the impact is obvious: It de-values their intellectual property,” said Lipscomb.

The corporate counterpart to iParadigms’ Turnitin plagiarism-detection tool is iThenticate. The web-based application scans an extensive page cache and content database; if it detects any pattern matches, it flags the content and shows it alongside the source.

In 2005, LexisNexis partnered with iThenticate to create CopyGuard. Available to LexisNexis subscribers, CopyGuard uses the same basic match-and-report system as iThenticate. It broadens the search by combining more than 6 billion LexisNexis digital documents with iThenticate’s web page archive.

As many businesses and publications launch websites or digital publications, they face the possibility that their copyrighted content will wind up on someone else’s site. “Any website with good content or marketing copy is likely to be copied—the problem is that it’s just so easy for someone to copy the text from your site and make a few minor modifications to suit their purposes,” said Gideon Greenspan, co-founder of Indigo Stream Technologies Ltd., which launched the anti-plagiarism web tool Copyscape in 2004.

Copyscape is a search engine that scans the content of a specific URL and runs a pattern-matching algorithm to find potential plagiarized copies across the rest of the visible web. Suspicious sites are returned with the questionable content highlighted.
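
The following sketch illustrates the general idea of finding and surfacing copied passages even after "a few minor modifications." It uses Python's standard `difflib` module to locate runs of words shared by two texts. Copyscape's own pattern-matching algorithm is not described in this article, so this is a stand-in technique; the function name and the four-word minimum are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def copied_passages(original, candidate, min_words=4):
    """Return runs of at least min_words consecutive words shared by both texts.

    Hypothetical helper: a real service would compare against pages it has crawled.
    """
    a = re.findall(r"[a-z0-9']+", original.lower())
    b = re.findall(r"[a-z0-9']+", candidate.lower())
    # autojunk=False keeps common words from being ignored in long texts
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        " ".join(a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]
```

Each returned passage could then be highlighted in the suspect page. Short overlaps are dropped because common phrases turn up in unrelated text.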

Enterprise anti-plagiarism tools such as iThenticate and Copyscape are designed to protect a company from both potential liability and potential loss. They rely heavily on access to a rich source pool, both from across the web and from the protected digital databases of companies like LexisNexis. “The real issue is what you have access to,” said Tom Holt. The more access a tool has, the less likely corporate plagiarists are to get away with their theft.
