Component Content Management

Mar 06, 2007

There has been a buzz lately on the mailing lists of the content management community about "Component Content Management." The discussion was provoked by a 2006 issue of the Forrester Wave on Content-Centric Applications.

During the discussion, information management expert JoAnn Hackos commented that many content management professionals are interested in topic-based authoring using DITA XML, which is not provided in a robust way by the major ECM companies that Forrester analyzed. Ann Rockley, president of the Rockley Group, pointed to a Content Management Technology white paper by Bill Trippe of the Gilbane Group called "Component Content Management in Practice."

In that paper, Trippe reported that, "despite the best efforts of the ECM vendors to provide an all-encompassing platform for enterprises, specialized vendors continue to focus on challenging, specific content management applications."

One important area, according to Trippe, "is the need for some organizations to manage large volumes of content that is used to support complex products before and after the products are sold. Examples include auto and truck manufacturers, airlines, and airplane manufacturers."

Core components of such content, he said, can be reused in many content products. Research shows that as much as half of product support content is redundant and could be reused. For a large organization, reuse could yield significant savings, efficiencies, and quality improvements over time.

What exactly is a core component? It can be something as small as a legal copyright statement, the first few steps in a process that are shared by many processes, or an important branding message like a product name or tagline. Reuse allows the core component to be edited and maintained in one place, and then be assembled into thousands of documents where it is needed.
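The edit-once, assemble-everywhere idea can be sketched in a few lines of Python. This is only an illustration, not how any particular CMS works: the `components` and `documents` dictionaries, the `assemble` function, and the sample text are all hypothetical.

```python
# Hypothetical component store: each reusable component lives in exactly one place.
components = {
    "copyright": "Copyright 2007 Example Corp. All rights reserved.",
    "tagline": "Example Widgets: built to last.",
}

# Documents hold references (component ids) rather than copies of the text.
documents = {
    "user_guide": ["tagline", "copyright"],
    "install_guide": ["copyright"],
}


def assemble(doc_id):
    """Resolve a document's component references into publishable text."""
    return "\n".join(components[cid] for cid in documents[doc_id])


# Editing the component once changes every document that references it.
components["copyright"] = "Copyright 2008 Example Corp. All rights reserved."
user_guide_text = assemble("user_guide")
install_guide_text = assemble("install_guide")
```

Because the documents store only ids, the single copyright edit shows up in both guides the next time they are assembled.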

And what is the topic-based structured authoring standard that Trippe, Rockley, and Hackos are all talking about? It is of course the IBM-developed OASIS standard called DITA (Darwin Information Typing Architecture).

Trippe's research was sponsored by X-Hive, a Netherlands-based firm that provides its Docato content management system to aircraft manufacturers like Boeing and other companies with massive documentation sets. The company is a leader in DITA and in S1000D, a comparable standard for documentation reuse in industry and the military.

Is "component content management" really something new? Some list posters noted that major CMSs have done component content management for years, often calling it "single-source" publishing. Some even did it with a standards-based approach, using SGML or, more recently, XML. Others did not, leaving their data locked up in a proprietary CMS.

For some, "component management" has become synonymous with "single-source" publishing. One industry leader is AuthorIT, which used "component content management" as their slogan on marketing materials for some time.

What's really new in all this is the DITA standard. If you go to the AuthorIT site today, you will see they have embraced DITA as one approach to their reusable components. DITA has standardized the idea of components into four major elements: topics, concepts, tasks, and references. These elements can be arranged in a DITA map, with more than one map corresponding to different output channels. As AuthorIT says, "You can combine the same topics in different ways. For example, the sequence of topics in a tutorial will probably be different than in a reference manual about the same product or service."
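A rough sketch of the map idea, in Python rather than actual DITA XML: the same pool of topics is sequenced differently by two maps. The topic ids, titles, map names, and the `table_of_contents` helper are invented for illustration and are not taken from the DITA specification or from AuthorIT.

```python
# Hypothetical topic pool; "type" mirrors the concept/task/reference distinction.
topics = {
    "what_is_widget": {"type": "concept", "title": "What is a widget?"},
    "install_widget": {"type": "task", "title": "Installing the widget"},
    "widget_options": {"type": "reference", "title": "Widget options"},
}

# Each map is just an ordered list of topic ids for one deliverable.
maps = {
    "tutorial": ["what_is_widget", "install_widget", "widget_options"],
    "reference_manual": ["widget_options", "what_is_widget", "install_widget"],
}


def table_of_contents(map_id):
    """Return the topic titles in the order the given map arranges them."""
    return [topics[topic_id]["title"] for topic_id in maps[map_id]]


tutorial_toc = table_of_contents("tutorial")
manual_toc = table_of_contents("reference_manual")
```

Both deliverables draw on identical topics; only the ordering held in the map differs.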

Another list poster noted that not all content is suitable for "componentization." The best software and standards technology may simply not apply, for example, to the content in a magazine column like this one. The right content for single-source publishing can be broken up into small "chunks" that appear in many places. This is perfect for reducing the costs of translation and localization. If similar chunks can be identified, then rewritten to be identical everywhere, they need be translated just once, with enormous cost and time savings.
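The translation arithmetic is easy to illustrate. The sketch below assumes chunks have already been rewritten to be word-for-word identical wherever they repeat; the sample sentences and the `translation_workload` function are hypothetical.

```python
def translation_workload(documents):
    """Return (total_chunks, unique_chunks) across all documents."""
    all_chunks = [chunk for doc in documents for chunk in doc]
    return len(all_chunks), len(set(all_chunks))


# Three hypothetical procedures that share their opening steps.
documents = [
    ["Turn off the power.", "Remove the cover.", "Replace the filter."],
    ["Turn off the power.", "Remove the cover.", "Replace the fuse."],
    ["Turn off the power.", "Check the gauge."],
]

total, unique = translation_workload(documents)
print(f"{total} chunks in use, {unique} to translate")
```

Here eight chunks appear across the three procedures, but only five distinct chunks would need to be sent to a translator for each target language.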

Rockley noted that the first requirement is a content strategy to develop structured content. If your content has a strong "granularity," it may lend itself well to single sourcing. Although you can do this without a standard like DITA, if you are starting out today, especially with documentation, the best strategy is to adopt a standard.

According to Hackos, the key thing is to differentiate "document management" systems, where the CMS stores whole documents, from systems that store "components" of documents, preferably with XML tagging (metadata) describing the components. She said she wants "to be able to address elements inside a file for reuse or to affect conditional processing and so on."
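What addressing elements inside a file might look like can be sketched with Python's standard XML library. The fragment below is a simplified, DITA-flavored example and not a valid DITA topic; the sample content and the `find_by_id` and `paragraphs_for` helpers are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, DITA-like fragment (hypothetical content, not schema-valid DITA).
SOURCE = """
<topic id="maintenance">
  <p id="warning">Disconnect power before servicing.</p>
  <p audience="technician">Reset the controller with the service key.</p>
  <p audience="operator">Call a technician if the fault light stays on.</p>
</topic>
""".strip()

root = ET.fromstring(SOURCE)


def find_by_id(element_id):
    """Address a single element inside the file so it can be reused elsewhere."""
    return root.find(f".//*[@id='{element_id}']")


def paragraphs_for(audience):
    """Conditional processing: keep untagged paragraphs plus those for this audience."""
    return [p.text for p in root.iter("p") if p.get("audience") in (None, audience)]


warning_text = find_by_id("warning").text
operator_view = paragraphs_for("operator")
```

A system that stores only whole documents has no handle on the warning paragraph or the audience tags; a component-level store can pull out or filter either one.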

Tony Byrne of CMS Watch (and my fellow EContent contributing editor) then announced that apropos of all the list discussions "about what to call the practice of XML/Single-source/Structured/Chunky/Component/Compound Content Management, CMS Watch and The Rockley Group have teamed up to publish an evaluation of tools in that space." To be released in late spring, it will be called the "Content Component Management Report."

CCM it is. Consider it part of the ever-expanding CMS Glossary.
