The Siren Song of Structure: Heeding the Call of Reusability

Page 1 of 3

So you just attended a fantastic content management system software demo that showed you several wonderful approaches to obtaining greater value from your content. With this new CMS, the press release that you compose in Microsoft Word for distribution to the media could automatically be uploaded into your epublishing system and appear on your public Web site in a matter of seconds. The release headline would then be immediately and effortlessly syndicated to partners. And what if the contact person for the press release changes tomorrow? No problem: in your new CMS, all your contact information is stored separately from the body of the documents, so the name of your new PR person would be magically updated in all of your old releases.

If, while envisioning this new CMS-empowered world, you feel like the cavalry has just arrived, it's with good reason. Powerful tools can help you "liberate" your ever-growing store of content from its original context and format. A CMS can enable the kind of reusability that gets you closer to the goal of true multichannel publishing.

Multichannel distribution is certainly music to the ears of any Web publisher. I call this music the "siren song of structure," because all the benefits of reusability depend on being able to structure your content carefully and sustain that structure consistently. If you underestimate the challenges of creating and maintaining structure, you risk getting dashed on the rocks of yet another under-utilized CMS implementation.

Giving Structure Meaning
Definitions of "structure" vary wildly, and just posing the question to a CMS or XML software marketing person may actually result in a rare moment of silence. Nonetheless, most specialists end up defining structure as an abstract notion—something that you superimpose on content to create:

  • consistent organizational patterns
  • addressability
  • predictability

Organizing information into discrete chunks to reveal (and enforce) commonality is what allows you to re-use those pieces in separate channels, formats, and output methods later, thereby obtaining more value from them. For example, you might have a type of content called "press releases" that is fundamentally different from your "case studies." A case study type always contains particular elements, such as "introduction," "product name," and so forth. (Some practitioners use other words for "type" and "element.")

Addressability is critical for both humans and systems to use those discrete elements. The application of labels to elements allows you to say, "I'm talking about the second paragraph," or "Let's only put our headlines into syndication."
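
To make the idea of types, elements, and addressability concrete, here is a minimal sketch in Python. It is not drawn from any particular CMS; the type names, element labels, `ContentItem` class, and sample releases are all hypothetical and chosen only to mirror the examples above.

```python
from dataclasses import dataclass, field

# Hypothetical content types and their permitted element labels.
CONTENT_TYPES = {
    "press_release": ["headline", "body", "contact"],
    "case_study": ["introduction", "product_name", "body"],
}


@dataclass
class ContentItem:
    content_type: str
    elements: dict = field(default_factory=dict)

    def __post_init__(self):
        # Enforce commonality: reject labels the type does not define.
        allowed = CONTENT_TYPES[self.content_type]
        unknown = [name for name in self.elements if name not in allowed]
        if unknown:
            raise ValueError(f"unknown elements for {self.content_type}: {unknown}")

    def get(self, name):
        """Address a single element by its label."""
        return self.elements[name]


def headlines(items):
    """Collect only the headlines, for example to feed syndication."""
    return [item.get("headline") for item in items if "headline" in item.elements]


# Made-up sample content; the body is stored as a list of paragraphs.
releases = [
    ContentItem("press_release", {
        "headline": "Acme ships widget",
        "body": ["First paragraph.", "Second paragraph."],
    }),
    ContentItem("press_release", {
        "headline": "Acme opens office",
        "body": ["Only paragraph."],
    }),
]

# "I'm talking about the second paragraph."
second_paragraph = releases[0].get("body")[1]

# "Let's only put our headlines into syndication."
syndicated = headlines(releases)
```

Because each element carries a label, a person or a program can ask for exactly the piece it needs without parsing the whole document.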

Of course, this all implies a substantial degree of consistency and predictability. "Predictability means different things to different people, especially in terms of detail," according to Lisa Bos, vice president of consulting services at CMS consulting firm Really Strategies, Inc. (RSI). "Many firms make the mistake of defining too much detail," Bos adds. Mandating that the body of your press releases will always contain four paragraphs certainly makes them more predictable, but also less flexible.
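
The trade-off between predictability and flexibility can be sketched as a validation rule. The following is an illustrative assumption, not a feature of any named product: `check_release` and its parameters are hypothetical, and the "four paragraphs" rule comes from the example above.

```python
def check_release(body_paragraphs, min_paragraphs=1, max_paragraphs=None):
    """Return a list of problems; an empty list means the body passes."""
    problems = []
    count = len(body_paragraphs)
    if count < min_paragraphs:
        problems.append(f"too few paragraphs: {count} < {min_paragraphs}")
    if max_paragraphs is not None and count > max_paragraphs:
        problems.append(f"too many paragraphs: {count} > {max_paragraphs}")
    return problems


# A made-up release body with three paragraphs.
body = ["Announcement.", "Supporting quote.", "About the company."]

# Rigid rule: exactly four paragraphs. Predictable, but rejects this release.
strict_problems = check_release(body, min_paragraphs=4, max_paragraphs=4)

# Looser rule: at least one paragraph. Less predictable, but more flexible.
loose_problems = check_release(body)
```

The stricter the rule, the more valid content it turns away, which is the "too much detail" trap described above.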

Many analysts talk about structured content in terms of corporate information presently residing in databases, the traditional repository of "chunked" content. This definition implies that everything else—including the reams of text content on desktop computers and Web servers—is therefore "unstructured data."

However, much of your non-databased content is in fact highly structured, or at least semi-structured. It's just that the tools to leverage the component elements, as well as the general understanding about the value of maintaining that structure, have been relatively immature or limited in application to niche-oriented communities—until recently.

Making a Case for Structure
The CMS vendors who demonstrated all the cool things you could do with your press release weren't just making a pitch. It's true: Being able to decompose your content into constituent elements can indeed multiply its value.

Not surprisingly, traditional publishers have been among the first to reap the benefits. Network World magazine's "Fusion" Web site employs Percussion's CMS package, which manages content as XML documents and fragments. Fusion executive editor Adam Gaffin is clear about the value of breaking down news stories into their atomic bits and sending those elements out to different places. He says, "Granularity is good; it helps us auto-generate syndication feeds, wireless editions, and email newsletters with embedded headlines." Granularity also enables you to put different elements of a page into separate workflows.

Of course, publishers typically work with highly structured content types. Gaffin concedes, "We had it easy—we know what a news story is." For other industries, particularly those new to electronic publishing, the task of divining the inherent structure within a large trove of content can be highly challenging.

Giving Content Structure
If consistency and predictability represent two key goals, then you need to think carefully about the structure you apply to your content, and about which content contributors are responsible for individual elements. "This is perhaps the most important piece of design work you do," argues Dan Ryan, senior vice president of marketing and business development at CMS vendor Stellent.

The practice of content analysis is part science and part art. However, nearly all specialists agree that you need to start with the actual goals of your epublishing effort, then work one-on-one with business users to identify discrete components and the level of detail required to meet specific business objectives.

RSI's Lisa Bos is a nationally renowned content analyst. She offers her technique: "Analyze from two directions at once: the content creation side and the output side. What kind of functionality do I need to support?"

"Structure should be based on purpose rather than what the content inherently is," notes Vernon Imrich, CTO of Percussion. "Don't ask what the content elements are, ask what the individuals and the organization as a whole want to do with that content," Imrich advises. Business drivers usually start with the required outputs: wireless, syndication feeds, Web pages, PDF, and others. You may need to isolate different elements from the same content type for each of those formats. For example, syndication feeds typically require a short description that might not appear on your Web page or PDF version.
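
Working backward from outputs can be sketched as a simple mapping from each channel to the elements it needs. This is a minimal illustration under assumed names: the channel keys, element labels, `render_for` function, and sample release are hypothetical, not taken from Percussion's product or any other CMS.

```python
# Hypothetical mapping of output channels to the elements each one requires.
CHANNEL_ELEMENTS = {
    "web_page": ["headline", "body"],
    "syndication_feed": ["headline", "short_description"],
    "wireless": ["headline"],
}


def render_for(channel, item):
    """Pick out only the elements a given channel needs from one content item."""
    wanted = CHANNEL_ELEMENTS[channel]
    missing = [name for name in wanted if name not in item]
    if missing:
        raise KeyError(f"{channel} needs elements that are missing: {missing}")
    return {name: item[name] for name in wanted}


# A made-up press release stored as labeled elements.
release = {
    "headline": "Acme ships widget",
    "short_description": "Acme's new widget is available today.",
    "body": "Full text of the release.",
}

feed_entry = render_for("syndication_feed", release)
web_view = render_for("web_page", release)
```

Here the short description exists only because the syndication feed requires it; the Web page output never touches it. Listing the channels first shows which elements are worth isolating at all.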

But since it is the authors who really know the substance of their content, you face a delicate balancing act when it comes to granularity. Most organizations tend to initially overcomplicate their structural formats, which can be overwhelming for content contributors and editors heretofore unfamiliar with working with atomic snippets. After soul-searching on what level of granularity is required to achieve a competitive advantage, many organizations inevitably accept some compromises, like allowing the mingling of HTML presentation tags within larger elements. Or maybe there's no business value for you in isolating and storing your press release subheads separately from their headlines. "The sooner you can be honest with yourself about what levels of detail are important, the sooner it becomes a much more manageable task," RSI's Bos concludes.
