DITA Redux



BEST PRACTICES SERIES

Nearly two and a half years ago, the Organization for the Advancement of Structured Information Standards (OASIS) officially approved the Darwin Information Typing Architecture (DITA) Standard 1.0. The approval completed DITA’s transition from many years of software development at IBM, reaching back to the mid-1990s, before the introduction of XML.

How has this standard for structured writing, now at its first point release, become the fastest path to delivering documentation as XML? And what is the business model driving IBM’s continued support of the standard, with several employees providing the expertise to shepherd not only the standard but also its reference implementation, the DITA Open Toolkit, a free and open-source XML publishing system?

XML has established itself as the preferred technology for exchanging data between web applications. Now it has earned pride of place as a document markup language, its original purpose. After Tim Berners-Lee’s simple HTML exploded in popularity in the early 1990s, XML was devised as “SGML for the web.” SGML’s flagship document application, DocBook, was converted to XML in 1999.

Originally, IBM designed DITA for online documentation, which was replacing traditional long printed user manuals written in DocBook or IBM’s proprietary BookMaster. Recently, DocBook has been losing market share to the simpler DITA.

This trend can only accelerate, as DITA 1.1 has introduced a Bookmap specialization of the DITA Map that supports long books. Bookmap has many metadata elements for Front Matter (table of contents, figure and table lists, dedication, etc.), Content Proper (including new parts and chapters), and Back Matter (index, glossary, notices, appendices).
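To make the Bookmap structure concrete, here is a minimal sketch in Python that parses a small bookmap and lists its structural elements. The element names (frontmatter, booklists, toc, part, chapter, appendix, backmatter, and so on) come from the DITA 1.1 bookmap specialization; the book title and every href are hypothetical, the DOCTYPE declaration is omitted for brevity, and the outline helper is an illustration rather than part of any DITA tool.

```python
import xml.etree.ElementTree as ET

# A small bookmap. The title and all href values are made up for illustration.
BOOKMAP = """<bookmap>
  <booktitle><mainbooktitle>Widget User Guide</mainbooktitle></booktitle>
  <frontmatter>
    <booklists><toc/><figurelist/><tablelist/></booklists>
    <dedication href="dedication.dita"/>
  </frontmatter>
  <part href="part_basics.dita">
    <chapter href="installing.dita"/>
    <chapter href="configuring.dita"/>
  </part>
  <appendix href="error_codes.dita"/>
  <backmatter>
    <booklists><glossarylist/><indexlist/></booklists>
    <notices href="notices.dita"/>
  </backmatter>
</bookmap>"""


def outline(bookmap_xml):
    """Return a (depth, element name, href) row for every element below the root."""
    root = ET.fromstring(bookmap_xml)
    rows = []

    def walk(element, depth):
        for child in element:
            rows.append((depth, child.tag, child.get("href")))
            walk(child, depth + 1)

    walk(root, 0)
    return rows


if __name__ == "__main__":
    for depth, tag, href in outline(BOOKMAP):
        print("  " * depth + tag + (f" -> {href}" if href else ""))
```

Note that the map only points at topics; the chapters themselves remain ordinary DITA topics that can be reused in other maps.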

A major automation feature in DITA 1.1 is the alphabetization of structures like glossaries and indexes in multiple languages. At present, this is a very costly step for localization projects, which offer the single largest return on investment in DITA XML technology. Translation can be done piecemeal as DITA topics are completed, without waiting for the complete book.
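Why alphabetization has to be language-aware can be shown with a small Python sketch. It is a deliberately simplified illustration, not how the DITA Open Toolkit or any production collator works: Swedish places å, ä, and ö after z, while German dictionary order files ä with a. The function names and the two-language table are assumptions made for the example; a real publishing system would rely on a full Unicode collation library.

```python
# Swedish alphabet: the letters å, ä, ö come after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

# German dictionary-style folding: umlauts sort with their base letters.
GERMAN_FOLDING = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def swedish_key(term):
    """Sort key: position of each letter in the Swedish alphabet."""
    return [SWEDISH_ALPHABET.index(ch) for ch in term.lower() if ch in SWEDISH_ALPHABET]


def german_key(term):
    """Sort key: the term with umlauts folded to base letters."""
    return term.lower().translate(GERMAN_FOLDING)


# Hypothetical lookup table keyed by language code.
COLLATION_KEYS = {"sv": swedish_key, "de": german_key}


def sort_index_terms(terms, language):
    """Return the index terms in the order expected for the given language."""
    return sorted(terms, key=COLLATION_KEYS[language])


if __name__ == "__main__":
    terms = ["zon", "ärta", "apa"]
    print(sort_index_terms(terms, "sv"))  # ['apa', 'zon', 'ärta']
    print(sort_index_terms(terms, "de"))  # ['apa', 'ärta', 'zon']
```

The same three terms come out in a different order for each language, which is the step that otherwise has to be redone by hand for every localized index or glossary.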

According to Norman Walsh, chair of the OASIS DocBook technical committee, specialization based on the object orientation and inheritance properties of the DITA architecture is the greatest single advantage of DITA over DocBook. It is ironic but perhaps predictable that the major specialization in DITA 1.1 is a direct competitor to DocBook.

Adding to DocBook and Bookmap, Adobe FrameMaker, a tool for long-form publications, has introduced the DITABook. FrameMaker 8 lets DITA authors access the full power of FrameMaker’s built-in print publishing system, with tables of contents, figure and table lists, and indexes, plus pristine output to PDF that competing authoring solutions can only achieve with expensive add-ons. By comparison, the DITA Open Toolkit produces lower-quality PDFs with relatively inflexible formatting.

Besides support for DITA, FrameMaker 8 shows that Adobe is serious about maintaining this 21-year-old desktop publishing software, alongside its InDesign replacement for PageMaker, and its team-based contributor tool InCopy, which together constitute a powerful automated publishing solution with XML.

DITA is now the fastest way for an organization to start delivering digital content as reusable XML content components. Many large organizations have developed their own DTDs and XSLT transforms to deliver XML content to websites on demand, personalized and localized, then assembled using XPath, XQuery, and XInclude techniques.

DITA now delivers that capability without the long time and considerable expense of DTD and XSLT development. And its unique conref (content reference) mechanism not only includes the reused component but also checks that it is valid against the schema, which XInclude does not do.
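The idea behind conref can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DITA Open Toolkit’s implementation: the shared topic lives in a hypothetical in-memory LIBRARY rather than in files, and the validity check is reduced to requiring that the referencing and referenced elements have the same name, a simplified stand-in for DITA’s rule that the target must be of the same type (or a specialization of it). The conref value follows the DITA pattern file#topicid/elementid.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical in-memory "files"; real topics would live on disk or in a CMS.
LIBRARY = {
    "shared.dita": (
        '<topic id="shared"><title>Shared content</title><body>'
        '<p id="brand">Acme Widgets, Inc.</p>'
        '<note id="warn">Unplug the device first.</note>'
        "</body></topic>"
    ),
}


def find_target(conref):
    """Look up 'file#topicid/elementid' in LIBRARY and return that element."""
    filename, fragment = conref.split("#", 1)
    topic_id, element_id = fragment.split("/", 1)
    root = ET.fromstring(LIBRARY[filename])
    if root.get("id") != topic_id:
        raise LookupError(f"no topic {topic_id!r} in {filename}")
    for element in root.iter():
        if element.get("id") == element_id:
            return element
    raise LookupError(f"no element {element_id!r} in {conref}")


def resolve_conrefs(root):
    """Replace every element that carries a conref with a copy of its target."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            conref = child.get("conref")
            if conref is None:
                continue
            target = find_target(conref)
            # Simplified validity check: element names must match.
            if target.tag != child.tag:
                raise ValueError(
                    f"<{child.tag}> cannot pull in <{target.tag}> from {conref}"
                )
            replacement = copy.deepcopy(target)
            replacement.tail = child.tail  # keep surrounding whitespace
            parent[index] = replacement
    return root
```

A paragraph written as `<p conref="shared.dita#shared/brand"/>` is replaced by the shared paragraph, while pointing a `<p>` at the `<note>` is rejected. A plain XInclude would have pasted the note in without complaint.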

I have asked several IBM DITA specialists to explain the corporate rationale behind the sizable investment in personnel needed to maintain and advance DITA through standards development. Yes, many other companies are contributing as well, but IBM stands out. Is it just a proud parent protecting its offspring?

The best reason I have heard is one that might appeal to top corporate management: A major cost in mergers and acquisitions is the expense of converting the intellectual property of the acquired firm—that is to say, all its digital content.

If all that content is in a content management system based on single-source DITA XML, changing the corporate branding is as simple as changing it in the reusable DITA topics that carry it. Then your automated XML publishing engine churns out a totally assimilated new division of your corporate empire.