Unraveling the NY Times Scandal: Is Information Retrieval Software the Answer?

The fallout from the actions of the New York Times reporter Jayson Blair has been pervasive. Surely Blair will never regain the trust and integrity he sacrificed, whether prosecuted for plagiarism or not. Such are the consequences of his behavior and, according to industry pundits, a fate that is cast in stone.

The likelihood of recovery within the Times and, on a larger scale, the journalism industry is not as predictable. On the one hand, the 152-year-old newspaper has an almost unrivaled reputation as a pillar of reliable and accurate reporting. On the other, the idea that journalistic integrity could be so easily breached from within such a trusted source is, for many, unforgivable.

Herein lies the problem at the Times. Brand has been compromised. Competitive differentiation endangered. Consumer confidence, the foundation of the paper's value proposition, has eroded considerably. Arthur Sulzberger Jr., chairman of The New York Times Company, hit the nail on the head when describing the scandal in relation to the corporate brand: "It's an abrogation of the trust between the newspaper and its readers. It's a huge black eye."

Naturally, the initial shock spawned investigation. Questions of "how did this happen?" transitioned quickly into "how do we prevent this from happening again?" For corporate America, numerous lessons can be learned from the lack of information discovery within this $3.1 billion media company. Risk management, effective collaboration, and combating information overload quickly emerged as ingredients in a formula for counterattacking the "Jayson Blair disease."

The fact that information discovery is at the center of the controversy comes as little surprise to developers and implementers of content-driven technologies. The advent of the information age has made corporations re-think the way content is gathered, shared, processed, and distributed. No sector has been spared from the effects of information overload. Why should the media industry be any different?

The conundrum of "info-glut" has little to do with the availability of information. Rather, it is the difficulty of identifying relevant facts and unearthing the critical relationships among them that prevents clarity. Enter content analysis technologies. For those more technically inclined, this translates into software that provides linguistic and semantic analysis, entity extraction, classification, and automated similarity assessment. Could information retrieval (IR) software have curbed plagiarism and prevented journalistic fraud at the Times? It is safe to speculate that the overhaul of research, fact-checking, and investigative reporting practices in the newsroom could "make or break" the restoration of brand for the company.

The folks at Inxight, Inc., which focuses on Unstructured Data Management (UDM), believe that smarter discovery is part of the remedy. According to David Spenhoff, VP of Marketing, "one of the most powerful advantages of UDM solutions is the ability to do similarity analysis on large numbers of documents to uncover patterns and critical relationships within the context of a specific entity."
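
For readers who want a concrete picture of what similarity analysis can mean in a newsroom, here is a minimal Python sketch. It is an illustration only, not Inxight's method: it assumes a simple, well-known technique (comparing sets of overlapping five-word sequences, or "shingles," using the Jaccard measure), and the function names, the archive layout, and the 0.2 threshold are all hypothetical choices made for this example.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_overlaps(draft, archive, k=5, threshold=0.2):
    """Compare a draft against an archive of published stories.

    archive is a hypothetical dict mapping a story identifier to its text.
    Returns (identifier, score) pairs at or above the threshold, highest first.
    The threshold is an illustrative value, not a tuned one.
    """
    draft_shingles = shingles(draft, k)
    hits = []
    for story_id, text in archive.items():
        score = jaccard(draft_shingles, shingles(text, k))
        if score >= threshold:
            hits.append((story_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

An editor's tool built along these lines would pass each submitted draft and an archive of previously published stories to flag_overlaps and review whatever comes back. A check like this only surfaces verbatim or near-verbatim overlap; paraphrased borrowing and fabricated facts would need other methods and, ultimately, human judgment.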

Perhaps the Times could borrow a page from IR practices in other information-intensive industries, such as life sciences and government, as well as from its cousins in the publishing sector. According to Inxight, the ability to automatically gather, share, and disseminate vast amounts of raw data enables the Department of Homeland Security to turn mere facts into accurate, actionable intelligence. Factiva, a Dow Jones & Reuters company, uses Inxight solutions to protect its reputation as a leader in news syndication by automating the categorization of tens of thousands of news articles per day.

Accuracy and thoroughness in organizations like these are paramount. For reporters inundated with raw data, automating the classification and analysis of information could be critical to discovering relevant facts within a news story and consequently developing a unique and newsworthy angle. For editors charged with ensuring accuracy and sound reporting, UDM technologies could serve as a key validation mechanism within editorial processes.

Automated content analysis might have allowed Blair's editors to see that content he submitted had already been published by another author. While editorial oversight can never be supplanted, leveraging information retrieval technologies may prove useful in the Times' efforts to restore and then maintain its reputation for editorial integrity.

(www.dowjones.com; www.nytimes.com; www.inxight.com)