Company: Thomson Reuters
A worldwide provider of information to businesses and professionals, Thomson Reuters was formed when The Thomson Corp. combined with Reuters Group PLC in the spring of 2008. As a company built around content, Thomson Reuters started its Rapid Source Automation (RSA) initiative to simplify the process by which it collected data.
Thomson Reuters needed to streamline its data monitoring and collection processes so that it could increase productivity and improve the quality of the data it brought in. Content operation groups within the organization had been searching hundreds of websites manually, an ad hoc and unmanaged routine that didn’t easily allow the groups to share best practices. Not only was the process time-consuming, but it also risked producing inaccurate and incomplete data.
Vendor of Choice: Connotate
Founded in 2000, Connotate was developed by researchers at the Rutgers University Computer Science Department in New Brunswick, N.J. Used by global clients including Standard & Poor’s and The Associated Press, Connotate’s web data collection solutions can monitor large volumes of data quickly and efficiently in order to simplify operations for those in the information industry.
The Problem In-Depth
Thomson Reuters wanted to do things differently for its customers, which ultimately meant being able to deliver more content at a faster pace and in a more consistent manner. Pedro Saraiva, product manager of RSA at Thomson Reuters, says that there were far too many different groups around the world performing web grabbing as a means of data collection for the company.
"Sometimes [it was done] manually, sometimes using tools and sometimes using third party utilities," says Saraiva. "We thought that there was enormous opportunity to rationalize the way in which that was done."
The challenges Saraiva faced were not uncommon. According to Matthew Jacobson, VP of sales and client services for Connotate, many companies have taken a manual approach to data collection. Others have even tried to use offshore labor to lower the price point, but both approaches remain very expensive.
"There's also huge costs in terms of accuracy. With that much manual labor, there's always accuracy issues. If you want to expand your coverage, it's very hard to do so because it's a linear cost increase," says Jacobson.
With a clear vision of the direction in which he wanted to take the RSA program, Saraiva says that the challenge was, to an extent, "how can we empower our content specialists who are normally not technical experts, and how can we empower them to automate some of their web content monitoring and sourcing needs?"