Making the Case for Text and Predictive Analytics

This is the third in a series of columns assessing the readiness of text and social analytics to play a productive role in enterprise solutions. In this column, I assess approaches to acquiring analytics applications when dealing with a common institutional barrier: the chief information officer (CIO).

A recent MIT Sloan Management Review and SAS study showed that in the past 3 years, the number of those believing that data analytic capabilities confer competitive advantages has increased from about one-third to more than two-thirds. This refers to hard, structured data, which in my experience has always had an advantage in project funding competitions over softer content tools such as search. CIOs are generally comfortable with structured database information. So-called unstructured information is sometimes seen as a necessary evil to be supported and managed in the likes of SharePoint or document management systems. But times have changed.

Big Data includes all data across the spectrum. How do you make the case to CIOs that existing corporate search systems are inadequate when dealing with Big Data and its exponentially increasing three V's: volume, variety, and velocity? Getting funding for new technology solutions has never been easy. Twenty years ago, it took me a year of project planning on the IT side and evangelization with business units to gain approval for an enterprise search system. Today's environment is far more challenging.

For a moment, consider the CIO's dilemma. Budgets are always tight. Old systems never seem to die, and the CIO must support each of them. New applications, from mobile to mainframe, require up to one-quarter of resources just to test them and ensure they both work and do no harm to other applications. CIOs have their own visions and 5-year plans to keep the information infrastructure competitive and cost-effective. Most likely, the CIO you work with is thinking, "Why on earth can't we just make do with the search systems we already bought and maintain?"

Let's contrast text analytics and search based on my discussions with innovative vendors such as Lexalytics, Semantria, and Recorded Future. Traditional search is, essentially, a tool that lets you find documents containing keywords. Most searches produce thousands of results, which is not useful for contemporary sentiment analysis. Narrowing those results to the relevant ones requires complex Boolean acrobatics, a skill few businesspeople have mastered and fewer still have the time to perform. Moreover, differentiating between table columns and magazine columns is nearly impossible, since that requires semantics and an understanding of context. Hand-tagging documents can provide an edge but is too costly and today is unlikely even to be possible. Existing search systems also have trouble coping with the three V's. According to Matt Kodama at Recorded Future, text analytics offers a way to scale up searching Big Data affordably and nearly in real time.
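
To make the "table columns versus magazine columns" problem concrete, here is a toy Python sketch. It is only an illustration under stated assumptions, not how any of the vendors named above work. The two sample documents, the cue-word lists, and the function names are all invented for this example, and a real text analytics system would use far richer language models than a word-overlap count.

```python
import re

# Hypothetical mini-corpus, invented for illustration only.
DOCS = {
    "doc1": "Add a column to the customer table and index the new column in the database.",
    "doc2": "Her weekly column in the magazine drew letters from readers.",
}

# Assumed context cues for two senses of the word "column".
SENSE_CUES = {
    "database": {"table", "index", "database", "row", "schema"},
    "publishing": {"magazine", "weekly", "readers", "editor", "newspaper"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(term, docs):
    """Return ids of documents containing the keyword, whatever its sense."""
    return sorted(doc_id for doc_id, text in docs.items() if term in tokenize(text))


def guess_sense(text):
    """Pick the sense whose cue words overlap most with the document's words."""
    words = set(tokenize(text))
    return max(SENSE_CUES, key=lambda sense: len(words & SENSE_CUES[sense]))


# Keyword search alone cannot tell the two kinds of column apart.
hits = keyword_search("column", DOCS)

# A crude look at surrounding words is enough to separate them here.
senses = {doc_id: guess_sense(DOCS[doc_id]) for doc_id in hits}

print(hits)
print(senses)
```

The keyword search returns both documents because each contains the word. Only the second step, which looks at the neighboring words, separates the database column from the magazine column. This context-dependent judgment is what traditional search leaves to the person reading the results.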

CIOs tend to be comfortable with traditional business intelligence (BI) systems and data mining reports from databases. Point out to them that text analytics provides text mining for unstructured content, the other 80% of Big Data. Moreover, text analysis systems often can work with traditional BI systems you may already have in place. Unlike most BI systems, text analytics can lay the foundation for predicting sentiment and performing social media monitoring. Semantria's Scott Van Boeyen says that Lexalytics (a joint venture with Semantria) is integrated with platforms from Oracle, MicroStrategy, and other common enterprise systems. Text analytics could improve systems already in place.

How easy is text analytics to use? Will it simply replace one set of difficult skills (Boolean) with another? Does analytics require finding and hiring rare and expensive data scientists? No, analytics systems can simply be an extension of traditional BI tools. Lexalytics' Seth Redmore asserts that text analytics tools should be invisible, merely distilling information in ways that traditional BI can use. "When integrated with BI tools, text analysis becomes drag and drop. When integrated with social media monitoring, you're just getting a report of how you and your competitors are doing. You don't know there is text analysis happening at all," says Redmore. Besides, you want more than a rearview-mirror insight into what has already happened. Van Boeyen says that 40% to 50% of the predictive analytics and BI market already relies on text analytics for unstructured data. This trend can help you succeed in your analytics initiatives.