Inextricably entwined with the state of Big Data (that growing corpus of multidimensional information made larger each day via cheap sensors, cheap storage, and user-generated content), content analytics and its ability to pull actionable insights from digital content have never been under more pressure to perform. After all, having the highest volume of structured and unstructured data has never been the point; what matters is whether sound business decisions can be made with that content.
Growth projections for the content analytics market bear out its importance. A study from the research firm MarketsandMarkets predicts that the search and content analytics market will grow at a rate of more than 22% between 2016 and 2021, reaching $4.37 billion. It says that the major growth drivers are risk and compliance management, increasing use of advanced analytics and competitive intelligence, and the continued convergence of text analytics with Big Data.
Stephen Malinak, global head of persistent analytics and data science for Thomson Reuters’ Financial and Risk division, sees reason for optimism. “We’re making progress in three major areas of analytics: text, transactions, and the Internet of Things,” he says. “Organizations are getting better at breaking text analytics into classes of events because, just like numbers, text can mean different things in different contexts.” Malinak gives the example of how Thomson Reuters has trained its content analytics model for predicting bankruptcies completely differently from its model for sentiment analysis. “The bankruptcy model is tuned to find the phrase ‘going concern,’ for instance, because of the high correlation of that phrase in a filing and an eventual bankruptcy,” he explains, whereas the phrase has less import for high-frequency traders.
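The idea of tuning a model to a context-specific phrase can be pictured with a minimal Python sketch. Nothing here reflects Thomson Reuters’ actual model: the function name, the sample filings, and the simple case-insensitive phrase count are all assumptions made for illustration. A production model would presumably weigh many signals, not one phrase.

```python
import re

# Hypothetical phrase list; the article mentions only "going concern".
RISK_PHRASES = ["going concern"]


def flag_risk_phrases(filing_text, phrases=RISK_PHRASES):
    """Count how many times each phrase occurs in the filing text."""
    counts = {}
    for phrase in phrases:
        # Tolerate line breaks or extra spaces between the words of a phrase,
        # and match regardless of capitalization.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in phrase.split()) + r"\b"
        counts[phrase] = len(re.findall(pattern, filing_text, flags=re.IGNORECASE))
    return counts


# Invented sample text, not taken from any real filing.
filings = {
    "filing_a": "There is substantial doubt about the company's ability "
                "to continue as a going concern.",
    "filing_b": "Revenue grew across all segments during the fiscal year.",
}

flagged = {name: flag_risk_phrases(text) for name, text in filings.items()}
```

In a sketch like this, the same phrase count could feed a bankruptcy model as a strong feature and be ignored entirely by a sentiment model, which is the contextual distinction Malinak describes.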
That sort of contextual understanding, industry insiders believe, is what will help move content analytics from its current state of providing mostly descriptive and diagnostic analytics to one in which it can deliver on predictive and, ultimately, prescriptive analyses that drive businesses forward.
The Year in Review
One of the biggest changes in the past year has been the growing role of distributed content (that is, the content organizations produce for and share on third-party platforms such as Facebook and Snapchat) and the question of how to analyze it. John Saroff, CEO of web analytics company Chartbeat, says, “Over the last several years, distributed content has come to a crescendo. … Some 80% of users now consume news on mobile devices, and we know that what people consume on a webpage is different from what they consume offsite. We try to help publishers understand that social news gap.” Saroff mentions messaging apps such as WeChat as an increasingly important channel from which companies can glean user insights.
Companies are getting more comfortable with DIY analytics, combining market-specific knowledge with third-party tools and data. Sachin Kamdar, co-founder and CEO of Parse.ly, which provides web analytics and content optimization software for online publishers, says, “In the last year or two, there’s a shift from thinking of data as a commodity to ‘Hey, if we do the right thing in analyzing data, we could have new revenue opportunities.’” Recognizing this, in 2016, Parse.ly launched Data Pipeline, a service that offers real-time user-interaction data, enabling clients to build in-house analytics to leverage knowledge of their own markets.
Futurist Thornton A. May, whose work focuses on how companies create value with IT, sees some publishers as being way ahead in positioning themselves as analytics experts. He says, “Washington Post does spectacular work in this area, and they’ve become a very capable provider of content analytics solutions.” This should come as no surprise, since the newspaper is owned by Amazon founder and CEO Jeff Bezos.
Evolving business intelligence and machine learning tools continue to underpin another key trend: the move to data democratization, which puts data and analytics into the hands of the business user. Kamdar says, “Analytics end users these days can be in marketing, audience development, analysts, even content creators themselves.” That may be by necessity, as Kamdar adds, “In my work, I see that the roles most often unfilled at big publishers are for data scientists and data engineers.”
A Look Ahead
As tools and techniques for content analytics evolve, companies are getting more sophisticated in how they organize their data. Malinak says, “You’re going to see companies using Big Data techniques for reducing unstructured data to something more usable, and then applying traditional analytics.” Along these lines, Thomson Reuters has introduced the Data Fusion database, which uses a company’s Thomson Reuters-assigned Permanent ID to tie together disparate datasets. Malinak describes Data Fusion as “a giant graph database that takes everything you know about a company and allows you to see what connects to what. There are real insights when you can see where data matches up, and doesn’t.”
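One way to picture tying disparate datasets together through a shared identifier is the small Python sketch below. It is not Data Fusion’s design: the IDs, dataset names, and field names are invented for illustration, and a plain nested dictionary stands in for a graph database.

```python
from collections import defaultdict

# Hypothetical records; "perm_id" stands in for a permanent company identifier.
filings = [
    {"perm_id": "P001", "doc": "annual report"},
    {"perm_id": "P002", "doc": "quarterly report"},
]
news = [{"perm_id": "P001", "headline": "Company opens new plant"}]
job_listings = [{"perm_id": "P003", "title": "Data engineer"}]


def link_by_id(**datasets):
    """Group records from every dataset under the identifier they share."""
    linked = defaultdict(lambda: defaultdict(list))
    for name, records in datasets.items():
        for record in records:
            linked[record["perm_id"]][name].append(record)
    return linked


linked = link_by_id(filings=filings, news=news, job_listings=job_listings)

# IDs that appear in more than one dataset (where the data matches up).
matched = sorted(pid for pid, sources in linked.items() if len(sources) > 1)
# IDs that appear in only one dataset (where it does not).
unmatched = sorted(pid for pid, sources in linked.items() if len(sources) == 1)
```

Here `matched` holds the companies seen in several sources and `unmatched` holds those seen in just one, a toy version of seeing “where data matches up, and doesn’t.”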
So much of Big Data is so young—think geolocation data, Twitter feeds, or Internet of Things sensor data from shipping containers. “A lot of new datasets are building track records as we speak,” says Malinak, citing a startup that is building a database of environmental events as well as other databases that track job listings over time. “When those datasets have 5 or 10 years of history, they will find new audiences.”
As investment in content analytics tools continues, there will be pressure to tie those investments back to defined business goals. Kamdar says, “Companies have to go beyond thinking about ‘vanity metrics’ like page views or social media shares. They have to define their business goals, then find the metric that tells that story.”
Properly valuing the contribution of content analytics requires support from the C-level and above, which May sees as a critical challenge. “Most companies have no analytical firepower on their boards,” he says. “They barely even have IT firepower.” But he also believes that “there will ultimately be a breakout over-performer in every market who is best at using Big Data and analytics,” and that will force entire industries to take notice. He cites Major League Baseball as a market in which every team now has an analytics department. Over time, “The MSA (Master of Statistics) will replace the MBA,” says May. “Great leaders will be great analysts.”