Email: The Other Content Management


Email began as a simple communication method, but has evolved into a mission-critical business application. Today, crucial business is conducted via email, including market analysis, project management, purchase information, customer complaints, internal memos, and even letters of agreement, to name but a few. With all this information inaccessible to the rest of the enterprise, locked inside applications sitting on desktops or corporate email servers, email has emerged as a content and knowledge management issue. Beyond that, the regulatory burden around email continues to grow, with many companies (especially in financial services) required by the government to save emails for a fixed period of time.

All of these factors (and others, such as automated response systems that send email alerts) have resulted in steadily increasing email volume, creating storage and management problems for enterprise IT departments. According to a September 2002 Gartner, Inc. report, email has grown to become the second most popular communications channel after voice. According to Gartner, 11 billion emails were sent per day worldwide in 2001. That number is expected to more than triple to 36 billion per day by 2005 (and IDC reports even higher estimates of 31 billion emails per day in 2002 and up to 60 billion per day by 2006). While not all of this is business email, these numbers project a staggering volume, and if the trend continues, IT departments will be forced to handle mounting storage and content management needs related to email.
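The two forecasts above can be put on the same footing by converting each to an implied annual growth rate. The short calculation below is a back-of-the-envelope sketch, not something either analyst firm published; the function name `annual_growth` is invented for illustration, and it assumes smooth compound growth between the two years quoted.

```python
def annual_growth(start, end, years):
    """Compound annual growth rate implied by two data points."""
    return (end / start) ** (1 / years) - 1

# Figures quoted above, in billions of emails per day.
gartner = annual_growth(11, 36, 2005 - 2001)
idc = annual_growth(31, 60, 2006 - 2002)

print(f"Gartner forecast: {gartner:.1%} per year")
print(f"IDC forecast:     {idc:.1%} per year")
```

Under that assumption, the Gartner numbers imply growth of roughly 35 percent per year and the IDC numbers roughly 18 percent per year, so IDC starts from a higher base but projects a gentler curve.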

Meanwhile, IT departments have other email issues to worry about, including spam, denial of service attacks, and virus control, as well as the profanity, liability, and sexual harassment issues that emails can raise. Put this all together and enterprises are facing an ever-growing problem around controlling email on every level. This article attempts to define the scope of the email problem facing enterprises today and provides some examples of possible solutions.

Where's the Content?
John Blossom, vice president and lead analyst at Outsell, calls email "the neglected stepchild of information." "Everybody says it's important," says Blossom, "but very few people see it as a knowledge management source." Blossom says that the information inside emails has great potential that is vastly under-mined. Research by Gartner, Inc. bears him out. A December 2000 report called Sizing the Emailbox: Less is More states, "Email has a central role beyond communication and coordination which has yet to be recognized by most enterprises—it is the forum for a high percentage of an enterprise's knowledge exchange. Estimates of the total knowledge exchange occurring via email run as high as 75 percent."

Dan Carmel, vice president of marketing and business development of iManage, a content management software company, goes even further saying that email has evolved into the default collaboration platform inside organizations. Carmel says that email is being used to build group consensus and get commitments, which leads to what he calls "the email trap," where email is used to support collaboration, something he feels it was not designed to do.

A July 2001 IDC report, A Solution to the Tyranny of Email: 80-20 Retriever, supports both men's positions stating that, "Because email straddles the divide between a formal corporate information system and ephemeral messaging, it has something of a second-class citizen status within the enterprise. Organizations usually disregard the email trail that may one day be vital to understanding why decisions were reached, who was involved, and what they thought."

As an example, suppose four people are working on a market analysis and the collaboration is being conducted by email. Content such as documents, presentations, and information embedded in the messages themselves may go back and forth. Once that project is complete, the information and content are stuck inside the email system. Analyst Blossom says that giving the content locked away in the emails a structure and a framework would allow people throughout the enterprise to realize that these four people have a level of expertise that others otherwise wouldn't have known about. iManage's Carmel agrees, stating that you need to "provide a contextual framework, a context for content" such as a common foldering environment where all email, documents, images, etc., always reside.

Retrieving Information from Emails
Email and other documents such as word processing files, presentations, and spreadsheets are referred to as unstructured data because they are not accessible in the way that the structured data found in databases is. Ian Hersey, senior vice president of corporate development and strategy at Inxight, a company that creates enterprise information retrieval software, explains that we are faced with masses of unstructured text that reach us in many ways: not just emails and word processing documents, but premium content such as legal cases from Westlaw or IEEE documents. Hersey says, "There is no markup associated with [this information], no card catalogue [because] people don't anticipate using this information later. The only way to bring order to this information is to create an automated process for marking information for later retrieval."

Outsell's Blossom agrees, saying that in order to take advantage of the information locked inside emails and other unstructured data, you need software that can crawl email, which is just what Inxight's SmartDiscovery product does. It begins by crawling the information, usually at a department level, to create an indexed data set organized into what Inxight calls taxonomies: hierarchical representations of a set of categories, much like those found in an Internet directory such as Yahoo!. End-users search by keyword or category, and SmartDiscovery then displays a graphical representation of the taxonomy. Clicking on a category in the graphic reveals results like those found in an Internet search engine.

iManage takes a different approach by providing what Dan Carmel calls "a single integrated content repository for email and business content." iManage WorkSite combines workgroup collaboration and content management. Carmel says, "you have to get control of this information and put it where it can be searched."

Regulating Email
Blossom calls email "a horrible medium for financial transaction," yet business is being conducted in this fashion, and the government requires financial institutions to save emails involving financial transactions for a minimum of three years. To Carmel, it's just a natural progression. He says, "Regulators have caught up with reality and email needs to be managed and regulated like other content." Financial services companies, however, appear to be having trouble maintaining email records. On August 2, 2002, The Wall Street Journal reported that six securities firms "were fined a total of $10 million for allegedly failing to keep emails and produce them in pending investigations." Carmel speculates that managers might have decided that it was easier to pay the fine than to manage the email. But it's not only financial services firms that face this issue. Any company could be called upon to produce emails as part of a government investigation or as part of a legal discovery process.

The Enron scandal exposed what happens when companies have no archiving policy in place. There were widespread reports of virtual shredding of potentially damaging emails, but it didn't have to be that way. Greg Olson, chairman of SendMail, says that if a company has an archiving and deletion policy in place that it follows, the company would be excused from producing emails as part of an investigation. It therefore behooves companies to establish archiving policies and avoid these problems.

According to Blossom, connecting emails to financial records is not an easy task, given the errant nature of email. However, SendMail's new Content Policy Console provides one way for companies to define policies for the company, a department, or even an individual. Using the Content Policy Console, SendMail's Olson says, it is easy to isolate financial transactions by creating a policy that checks all emails against a customer list and a broker list; anything that moves between two parties on these lists gets archived. Larger companies might work with more sophisticated custom technology that automatically assembles the customer/broker pairs from a database management system (DBMS) and checks them against email messages using the SendMail Copier product.
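To make the customer-list and broker-list rule concrete, here is a minimal sketch of the idea using only Python's standard `email` package. It is not SendMail's implementation or its policy syntax; the addresses, the `CUSTOMERS` and `BROKERS` sets, and the function names are all hypothetical, and a real deployment would presumably load the lists from a directory or database.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical lists, hard-coded here only for illustration.
CUSTOMERS = {"alice@customer.example"}
BROKERS = {"bob@brokerage.example"}

def parties(msg):
    """Return (sender addresses, recipient addresses) as lowercase sets."""
    senders = {addr.lower() for _, addr in getaddresses(msg.get_all("From", []))}
    recipients = {
        addr.lower()
        for _, addr in getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    }
    return senders, recipients

def should_archive(msg, customers=CUSTOMERS, brokers=BROKERS):
    """True when the message moves between a customer and a broker, either way."""
    senders, recipients = parties(msg)
    customer_to_broker = bool(senders & customers) and bool(recipients & brokers)
    broker_to_customer = bool(senders & brokers) and bool(recipients & customers)
    return customer_to_broker or broker_to_customer

raw = "From: Bob <bob@brokerage.example>\nTo: alice@customer.example\nSubject: Trade\n\nConfirming your order."
print(should_archive(message_from_string(raw)))
```

The check looks only at the From, To, and Cc headers, so blind copies, forwarded messages, and aliases would slip past it. That gap is one reason a production policy engine needs more than a pair of lists.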

iManage works with third-party records management products that integrate into its WorkSite product. Carmel says that iManage bases its information archiving approach on metadata. Once an email is declared a "record," you can no longer change it, and it gets assigned a retention policy that indicates how long you will maintain the email record before it gets deleted.
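The two properties Carmel describes, a record that cannot be changed and a retention period attached as metadata, can be illustrated with a small sketch. This is not iManage's data model; the `EmailRecord` class and its field names are invented for the example, and the three-year period is borrowed from the financial services requirement mentioned earlier purely as a sample value.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class EmailRecord:
    message_id: str
    declared_on: date      # day the email was declared a record
    retention_days: int    # how long the record must be kept

    def delete_after(self) -> date:
        """Last day the record must be retained."""
        return self.declared_on + timedelta(days=self.retention_days)

    def is_expired(self, today: date) -> bool:
        """True once the retention period has fully elapsed."""
        return today > self.delete_after()

record = EmailRecord("msg-001", date(2002, 8, 2), retention_days=3 * 365)
print(record.delete_after(), record.is_expired(date(2003, 1, 1)))
```

Trying to assign to `record.retention_days` raises an error, which stands in for the rule that a declared record can no longer be edited. A deletion job would then only need to scan for records whose `is_expired` check returns true.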

Formulating a retention policy is a tricky matter because companies have to balance several competing needs: saving email as content for later retrieval, reclaiming the disk space taken up by high-volume email traffic, and not leaving so much information on the server that it could come back to haunt the company (as did the now infamous "Internet Tidal Wave" memo Bill Gates authored in 1995, which came out in the government antitrust trial against Microsoft). Analyst Blossom sums up this problem by saying, "In business, you're not going to get ahead by being too paranoid, but you have to be paranoid enough to protect company assets."

Storing and Monitoring Email
A July 25, 2002 Gartner, Inc. report predicted, "Through 2004, enterprise mailbox volume will increase by 40 percent per year." According to Carmel at iManage, 80 percent of the content stored on email servers is taken up by attachments and half of those are exact duplicates. Faced with these numbers, it is no wonder IT departments are having increasing difficulty keeping up with the volume.
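Exact duplicates are the easiest part of that volume to reclaim, because identical files produce identical content hashes. The sketch below shows the general technique of content-addressed storage using Python's standard `hashlib`; it is an illustration of the principle rather than a description of how any product mentioned here works, and the function name and sample data are made up.

```python
import hashlib

def dedupe_attachments(attachments):
    """Store each distinct payload once.

    attachments: iterable of (filename, bytes) pairs.
    Returns (store, references): store maps digest -> bytes, and
    references lists (filename, digest) for every attachment seen.
    """
    store = {}
    references = []
    for filename, payload in attachments:
        digest = hashlib.sha256(payload).hexdigest()
        store.setdefault(digest, payload)   # keep only the first copy
        references.append((filename, digest))
    return store, references

mailbox = [
    ("q3-report.doc", b"quarterly numbers"),
    ("q3-report (copy).doc", b"quarterly numbers"),
    ("budget.xls", b"budget figures"),
]
store, references = dedupe_attachments(mailbox)
print(len(references), "attachments,", len(store), "stored payloads")
```

Hashing catches only byte-for-byte copies. Two versions of a document that differ by a single edit hash differently, which is the version-control problem a shared repository is meant to address.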

According to Carmel, iManage's WorkSite product eliminates attachments because all documents are stored in a central project repository. This reduces volume on email servers and, Carmel says, replaces sequential editing (where people edit one at a time) with collaborative editing (where team members can edit together). Users also have the option of using a check-in/check-out approach to documents, thereby ensuring that only one person is working on a document at a time. Either way, this approach saves time and disk space and eliminates many problems associated with version control. Although Microsoft Office provides some functions for distributing a document for editing, Carmel explains that this doesn't solve the problem of multiple versions of the same document floating around an organization, taking up valuable disk space and causing potential confusion about which version is the latest.

SendMail's Olson points out that exploding volume is not the only problem facing IT departments. They also need to control spam, develop strategies for virus prevention, and deal with email related to profanity or sexual harassment. SendMail builds what Olson calls a DMZ (demilitarized zone)—a safe area that exists between the network connections to the outside world and the connections to the company internal network. Olson explains that this gives you a central place where you can isolate problems and prevent a virus from spreading, head off a denial of service (DoS) attack, or route email to the correct place. For example, an email with a sexual harassment flag may be sent to human resources, rather than IT.
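The routing idea Olson describes amounts to a lookup from a problem flag to a destination, applied at one central checkpoint. The sketch below assumes that an upstream scanner has already tagged each message with a header; the header name `X-Policy-Flag`, the flag values, and the destinations are all hypothetical and are not taken from any SendMail product.

```python
from email import message_from_string

# Hypothetical flag-to-destination table.
ROUTES = {
    "virus": "quarantine",
    "spam": "discard",
    "harassment": "human-resources",
}

def route(msg, default="deliver"):
    """Pick a destination from the (assumed) X-Policy-Flag header."""
    flag = (msg.get("X-Policy-Flag") or "").strip().lower()
    return ROUTES.get(flag, default)

flagged = message_from_string(
    "From: someone@example.com\nX-Policy-Flag: Harassment\n\nmessage body"
)
print(route(flagged))
```

Keeping the table in one place is the point of the central checkpoint: changing where a category of problem mail goes means editing one entry rather than reconfiguring every mail server.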

Olson estimates that spam comprises as much as a third of all email. He says, "You need to catch it and take it out before it costs money [in storage costs]." Olson points out that spam is difficult to control because it is not always easy to identify. Olson says that, while everyone hates spam, most people think of a filter as simply on or off, whereas companies need to balance their information needs against the desire to control the volume of spam.

Putting Email to Work
In spite of all the issues raised in this article, Outsell's Blossom says, "We are stuck with email for better or for worse. Email is the lingua franca of the Internet and it gets you in touch with few technical issues." Blossom believes that the next "killer app" will be something better than email. "Right now," he says, "the standard MIME packet gets us from A to B."

While the solutions outlined in this article help resolve the myriad issues related to email, it may take a leap forward to change the way we communicate electronically. For example, in the future, Blossom sees a role for XML, where hooks and metadata tags tell the email application about the contents of the email. Right now, Blossom says, "in its current format it is difficult to transform email into information objects." Blossom says, "Unless you can change the message format, we will continue to do things the way we always have."

Greg Olson explains that over the last one or two years, people have come to realize that email is the way people communicate. "It's no longer just nice to have. It has become mission-critical." As such, Olson says, we need to find a way to manage "this amorphous thing called email."