The Convergence of Big Data and Social Media

Jun 28, 2012

Once again "Big Data" is all the rage as enterprises struggle to cope with the data deluge spreading across the globe. To illustrate, consider first that, according to Google's Eric Schmidt, five exabytes of data were created between the dawn of civilization and 2003 (one exabyte is equivalent to one million terabytes). Now consider that, according to Cisco, global IP traffic will reach 109.5 exabytes per month by 2016, or about 3.6 exabytes every day. Companies, organizations, and governments are all drowning in data, and the bulk of what is feeding this flood is user-generated content.

The tsunami of user-generated content has created an urgent demand for more sophisticated analytical tools in the social media space. This rising demand is bringing the worlds of big data solutions and social media services (such as CRM, marketing, and sales) closer together than ever before. In the past two months, the software behemoth Oracle acquired the social media marketing platform Vitrue, and Salesforce acquired Buddy Media. These will certainly not be the last such moves. In the coming months, other vendors such as IBM, EMC, and HP will likely make similar acquisitions. As a consequence, social media specialists, especially analytics experts, will need to become far more savvy about data technology.

The merging of big data solutions and social media marketing platforms is not the only development that will require social media analytics experts to raise their technology competency. Increasingly, the rudimentary manual methods of the past are proving inadequate for harnessing Big Data to provide strategic insights for social media campaign planning and measurement, as well as for social CRM, crowdsourcing, and sales. At present, the most common approach to analyzing social media data is manual and protracted. It usually starts with a listening tool, such as Radian6, to gather the data, followed by dozens of hours of labor-intensive data cleaning. The resulting dataset yields insights only about the relevant conversation: the major topics, the primary channels where the conversation is taking place, and the identities of the individuals engaged in it. Understanding the structure of the social network that the conversation constitutes requires additional analysis.
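To make that additional step concrete, here is a minimal sketch of how a cleaned set of posts might be turned into a map of who addresses whom. It assumes each record is a simple (author, text) pair and that an @-mention counts as a directed link; the sample posts, the `mention_edges` helper, and the field layout are all hypothetical, and real listening-tool exports will differ.

```python
import re
from collections import Counter

# Hypothetical cleaned posts as (author, text) pairs.
posts = [
    ("alice", "@bob have you seen the new release?"),
    ("bob", "@alice yes, and @carol wrote about it"),
    ("carol", "Thanks @bob!"),
    ("dave", "@bob @alice following this thread"),
]

MENTION = re.compile(r"@(\w+)")

def mention_edges(posts):
    """Count directed edges (author -> mentioned account) across all posts."""
    edges = Counter()
    for author, text in posts:
        for target in MENTION.findall(text):
            target = target.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges

edges = mention_edges(posts)

# Number of distinct accounts that mention each person.
in_degree = Counter(target for (_, target) in edges)
```

In this toy data, `in_degree` shows that bob is mentioned by three distinct accounts. An edge list like `edges` is the kind of input that network analysis needs, which a topic or channel summary alone does not provide.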

However, for those who have the requisite technology skills or can enlist the support of those who do, there is good news. Recent advances in network science provide two models that enable the user to identify the specific individuals within a network who are driving the conversation (the "influencers"), provided the user can identify the contours of the target social network and map its structure. These two models, developed by Yang-Yu Liu and Tamás Nepusz, respectively, identify influencers with methods rooted in network physics rather than in computational linguistics or platform interactions (the approach taken by Klout and Kred). Dr. Nepusz has even written a software program that implements both models and can therefore provide deep insights into the structure of a target social network, again provided the user has already defined and mapped that network.
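For readers curious about what such an analysis looks like in code, here is a minimal sketch. It assumes the node-based model reduces to the maximum-matching formulation from the network controllability literature, in which nodes left without a matched incoming link are the "driver" nodes. This is an illustration only, not Dr. Nepusz's program, and it does not cover the edge-based model. The `driver_nodes` function, the edge direction ("source influences target"), and the toy graph are all assumptions.

```python
def driver_nodes(nodes, edges):
    """Return one minimum set of driver nodes for a directed graph.

    edges are (source, target) pairs, read here as "source influences target".
    Maximum matchings are not unique, so other equally small sets may exist.
    """
    successors = {n: [] for n in nodes}
    for source, target in edges:
        successors[source].append(target)

    matched_by = {}  # target node -> source node whose edge matches it

    def try_match(source, visited):
        # Augmenting-path step of a simple bipartite matching (Kuhn's algorithm).
        for target in successors[source]:
            if target in visited:
                continue
            visited.add(target)
            if target not in matched_by or try_match(matched_by[target], visited):
                matched_by[target] = source
                return True
        return False

    for source in nodes:
        try_match(source, set())

    unmatched = [n for n in nodes if n not in matched_by]
    # If every node is matched, a single input is still required.
    return unmatched or [nodes[0]]

# Toy network: one hub broadcasting to three followers.
star_nodes = ["hub", "x", "y", "z"]
star_edges = [("hub", "x"), ("hub", "y"), ("hub", "z")]
star_drivers = driver_nodes(star_nodes, star_edges)

# Toy network: a simple chain of influence.
chain_nodes = ["a", "b", "c", "d"]
chain_edges = [("a", "b"), ("b", "c"), ("c", "d")]
chain_drivers = driver_nodes(chain_nodes, chain_edges)
```

On the chain, only the first node comes out as a driver. On the star, the hub and all but one of its followers do, which shows that this notion of a driver differs from simply counting connections. The recursive matching is adequate for small graphs; a real social network would call for a graph library.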

Defining and mapping social networks can be extremely challenging given the tools currently available to social media analytics experts. Tools that rely solely on Boolean keyword strings, such as Radian6, often yield massive quantities of data that, as mentioned previously, require dozens of hours of scrubbing. By the time the validation process is complete, there is usually little time left for further analysis, given the tight deadlines of the real-time world of social media. Fortunately, there are now tools that automate the validation process using technology that goes well beyond Boolean keyword search strings (Crimson Hexagon is one such tool, based on the research of Daniel Hopkins and Gary King at Harvard University). However, even the dataset produced by such a tool would still need further analysis and treatment before it would be ready for the network physics analysis described earlier.
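A small sketch shows why Boolean keyword strings produce so much noise. The `matches` helper, the query, and the sample posts below are hypothetical and do not reflect the query syntax of any particular listening tool; they only mimic an "all of / any of / none of" keyword filter.

```python
import re

def tokens(text):
    """Lowercase a post and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def matches(text, all_of=(), any_of=(), none_of=()):
    """Boolean keyword filter: AND over all_of, OR over any_of, NOT over none_of."""
    words = tokens(text)
    return (
        all(w in words for w in all_of)
        and (not any_of or any(w in words for w in any_of))
        and not any(w in words for w in none_of)
    )

# Hypothetical posts collected for a car brand.
posts = [
    "Loving my new Jaguar, the engine sounds amazing",
    "Saw a jaguar at the zoo today",
    "Jaguar dealership service was terrible",
    "The jaguar is the largest cat in the Americas",
]

# Query: must contain "jaguar", must not contain "zoo".
hits = [p for p in posts if matches(p, all_of=["jaguar"], none_of=["zoo"])]
```

The exclusion term removes the zoo post, but the post about the animal still slips through because it shares the keyword and none of the excluded words. Every such false positive has to be found and removed by hand, which is where the hours of scrubbing go.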

This is one of many Big Data challenges awaiting solutions, but one thing is certain: in the present environment, and for the foreseeable future, social media analytics experts need to strengthen their technology competency to remain effective. Beyond identifying the best tools, they will also need to recognize opportunities to use custom coding and development for deeper analysis and insight. Moving forward, harnessing Big Data with technology that enables strategic insight will drive the planning, execution, and measurement of the most effective social media programs.