I Column Like I CM:
Structuring Your Content

Jan 04, 2005


Structured content is easier to manage, but it may prove difficult to find the natural structure of your content. In fact, it is possible that not all content has a natural structure… some content may be doomed to a hopelessly invertebrate existence.

Many CMS vendors tout XML as the ultimate way to store content, especially for reuse of elements in different contexts or through different delivery channels. But what's an editor to do when reusable content elements aren't that obvious? The simple answer is that important structure is, in fact, pretty obvious, usually in the form of extensive repetition of distinct patterns. If you don't see these patterns, a useful structure may not exist, in which case you will gain very little from an XML-based content management system.

Chapter & Verse
The first place to look for structure is in elements of the content that demand different font or style treatment. Something as apparently unstructured as a chapter of a novel or an email message has syntactic elements (distinguished by their arrangement). The chapter may have a title, or at least a chapter number. Different pages have page numbers and possibly decorative elements. Even something as simple as the body of an email has a salutation and a signature block, and the subject line is clearly a separate content element. A structured approach could present these elements with different styles.

Recognizing a syntactic structure will let you design a standard template for your content, with separate blocks for each distinguishable content element. This has been standard practice in desktop publishing for years because it lets the designers focus on the overall look, while the content experts concentrate on getting the words right.
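
As a rough illustration of such a template, the sketch below separates an email into the elements mentioned above and hands each one to its own labeled block. It is written in Python purely for illustration. The function names, the bracketed style labels, and the convention that a line containing only "--" introduces the signature block are all assumptions, not features of any particular CMS.

```python
from string import Template


def split_email(subject: str, body: str) -> dict:
    """Split an email into subject, salutation, message, and signature."""
    lines = body.strip().splitlines()
    # Assumption: the first line of the body is the salutation.
    salutation, rest = lines[0], lines[1:]
    stripped = [line.strip() for line in rest]
    # Assumption: a line holding only "--" separates message from signature.
    if "--" in stripped:
        cut = stripped.index("--")
        message, signature = rest[:cut], rest[cut + 1:]
    else:
        message, signature = rest, []
    return {
        "subject": subject,
        "salutation": salutation.strip(),
        "message": "\n".join(message).strip(),
        "signature": "\n".join(signature).strip(),
    }


# One block per content element; a designer could style each label separately.
LAYOUT = Template(
    "[heading] $subject\n"
    "[greeting] $salutation\n"
    "[body] $message\n"
    "[signature] $signature"
)


def render(elements: dict) -> str:
    """Pour the separated elements into the standard template."""
    return LAYOUT.substitute(elements)
```

The designer owns LAYOUT and the writer owns the words, which is the division of labor described above.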

Deeper and more significant structures are semantic, and they involve complex relationships. These elements may have profoundly different meanings, functions, and uses, yet their differences may not appear in the content as stylistic differences, so they are harder to find.

Something as simple as a press release has a date, perhaps a location, some quotes, a contact for further information, and other details. More importantly, the press release will be about something, a product for example. More often than not, you will know a lot more about a product than can show up in its press release. Products have properties such as features, descriptions, images, prices, special offers, accessories, and warranties, and they are associated with other things such as the place of manufacture, the name of the product manager, and sales contact information.

Your content management system can manage this knowledge and bring in the appropriate information in a context-dependent way. But first you must break up your knowledge, chunk it, make it granular rather than continuous, and identify its associations or relationships, tagging them with metadata where needed. Each element of content is a candidate for "single-source publishing," which means there's just one place to go to make changes when they are needed.
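
To make single-source publishing a little more concrete, here is a minimal sketch in Python. The product, its price, the chunk names, and the two contexts are all invented for illustration, and a real CMS would store and tag its chunks in its own way. The point is only that each fact lives in one place and every context pulls from it.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One granular piece of content plus the metadata that tags it."""
    text: str
    metadata: dict = field(default_factory=dict)


# Hypothetical single-source store: each fact about the product lives once.
chunks = {
    "product_name": Chunk("Widget 3000", {"kind": "name"}),
    "price": Chunk("$49", {"kind": "price", "currency": "USD"}),
    "sales_contact": Chunk("sales@example.com", {"kind": "contact"}),
}

# Hypothetical contexts; each one names only the chunks it needs.
contexts = {
    "press_release": (
        "{product_name} is available today for {price}. "
        "For more information: {sales_contact}"
    ),
    "catalog_page": "{product_name}: {price}",
}


def render(context_name: str) -> str:
    """Assemble one context from the shared chunks at the moment of use."""
    values = {key: chunk.text for key, chunk in chunks.items()}
    return contexts[context_name].format(**values)
```

Changing chunks["price"].text once changes what both render("press_release") and render("catalog_page") return, because neither context keeps its own copy of the price.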

Arts & Sciences
Structuring, chunking, and relating content is not easy work for content creators, especially those who see themselves first as artists and craftspeople, authors, and writers. It's like the yin and yang of art and science: The scientist analyzes, makes the distinctions; "we murder to dissect," as William Wordsworth put it. The artist synthesizes, puts things (back) together, and creates the whole.

Can writers exercise their craft when their content has been broken into a dozen chunks that are reassembled on the fly as needed for a particular page? Structured writing can feel more like programming than writing, but the answer is yes, provided the XML editor allows the content to be edited in context. Writers want to see their words surrounded by the neighboring words and images.

If a piece of content is to be reused, the CMS must let you switch easily among all the contexts where the words will appear. Can one sentence or paragraph function well in all these contexts? If not, derivative versions must be created, but they must remain linked to their originals, which are still in use elsewhere. When changes are required, the original and all derivatives must be reexamined.
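
One way to picture the link between originals and derivatives is sketched below in Python. The class names, the needs_review flag, and the sample wording are assumptions made for illustration, not a description of how any particular CMS works. Each derivative records the item it came from, so a change to any member of the family flags the rest for reexamination.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContentItem:
    item_id: str
    text: str
    context: str                        # where this wording appears
    derived_from: Optional[str] = None  # id of the item it was adapted from
    needs_review: bool = False


class Repository:
    """Hypothetical store that keeps derivatives linked to their originals."""

    def __init__(self) -> None:
        self.items: Dict[str, ContentItem] = {}

    def add(self, item: ContentItem) -> None:
        self.items[item.item_id] = item

    def root_of(self, item_id: str) -> str:
        """Follow derived_from links back to the original."""
        current = self.items[item_id]
        while current.derived_from is not None:
            current = self.items[current.derived_from]
        return current.item_id

    def family(self, item_id: str) -> List[str]:
        """The original plus every derivative that traces back to it."""
        root = self.root_of(item_id)
        return sorted(i for i in self.items if self.root_of(i) == root)

    def change(self, item_id: str, new_text: str) -> List[str]:
        """Edit one item and flag its relatives for reexamination."""
        self.items[item_id].text = new_text
        flagged = [i for i in self.family(item_id) if i != item_id]
        for other in flagged:
            self.items[other].needs_review = True
        return flagged
```

The list returned by change() is the editor's to-do list: every other context where a related version of those words still appears.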

Discovering the structure in your content is a problem for analysis. But if you have the right tools, you don't have to kill your text when you break it into pieces. The root of the word science is to cut, to make a distinction; the sci- is the same as in scissors. But the root meaning of the word art is to join, like the articulation of a human elbow or finger. Your job is to synthesize and rearticulate the text so it fits together smoothly while retaining the underlying structure that allows your content to be managed by your CMS.