Get It Together: Integrating Data with XML

Page 1 of 3

Organizations everywhere are beginning to realize that they have useful content stashed away in repositories scattered throughout the enterprise, where it is far less useful than it could be.

Whatever the source of the data, it is becoming increasingly evident that companies must find a way to leverage all of their information and even to get disparate data to work together. Thus, more and more, enterprise IT departments must find a way to integrate data across different platforms from a variety of data sources. Many companies have been using XML in one form or another as a data integration tool for some time, but as we move forward, systems and data integration becomes increasingly important, whether dealing with disparate internal databases, combining systems after a purchase or merger, or communicating with customers, suppliers, and business partners over the Internet. Above all of this hovers the potential of the Web Services model, which uses XML to move data across the Internet with the grand vision of creating a massive, integrated network moving information from company to company without regard to platform.

Microsoft, which has been heavily focused on Web Services development with its .Net initiative, recently announced that the next version of Office would include XML support. It also announced a toolkit that allows programmers to write code that communicates with XML-enabled Word and Excel templates, which offers the potential to bring XML to the desktop. Further, the United States government is poised to undertake one of the largest computer systems integration projects ever attempted with the formation of the Homeland Security Department, combining the computer systems and data from a mind-boggling 22 different departments. All of this is going to require XML to grease the data integration engine. Put it all together and, after years of discussion, XML appears ready to step to the forefront as the data integration method of choice.

The Beauty or the Beast
XML, eXtensible Markup Language, bears some similarity to HTML in that it uses tags. But rather than a fixed set of tags that define how content looks, XML allows you to define your own system of tags (within a defined format) to identify data elements, describing what the content is rather than how it is displayed. This creates a powerful system for moving data across an enterprise and across the Internet because it places the data in a separate layer.
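
As a minimal sketch of what self-defined tags look like in practice, the Python snippet below reads a small XML record using only the standard library. The element names (patient, name, bloodType) and the record itself are invented for illustration; they do not come from any real industry vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the tag names are made up for this example.
RECORD = """
<patient id="12345">
  <name>Jane Doe</name>
  <bloodType>O+</bloodType>
</patient>
"""


def read_patient(xml_text):
    """Return the fields of a <patient> record as a plain dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),                      # attribute on <patient>
        "name": root.findtext("name"),             # text of the <name> child
        "blood_type": root.findtext("bloodType"),  # text of the <bloodType> child
    }


if __name__ == "__main__":
    print(read_patient(RECORD))
```

Because the tags say what each value is, any program that knows the vocabulary can pull out the fields it needs without caring which system produced the record.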

John Halamka, chief information officer for CareGroup Healthcare System, says, "XML provides an abstraction layer. I don't care [which software]; name your vendor. I can use XML however it's represented and put it into my standard interface and expose it to other systems." Halamka says that instead of trying to design a proprietary interface to process the data, you can represent it in a standard way using XML.

Paul Grabscheid, chief information officer at InterSystems, a company that makes database and applications development software, says of XML: "It provides a way, regardless of hardware platform and other issues, for information to be exchanged between systems. XML resolves the battle over which format the communication will take place in and I think that's a valuable thing."

The downside of this flexibility is that each industry needs to develop its own standard way of identifying data. John Evdemon, a digital strategies consultant with Booz Allen Hamilton working with XML integration for government initiatives (notably the Department of Homeland Security) and editor of the XML Journal, says that the flexibility itself can be overwhelming. "One of the positives of XML might also be construed as a negative because it's so flexible. An organization just getting involved might be overwhelmed by the sheer number of vocabularies out there."

"If you're just trying to integrate two systems," says Evdemon, "you can use any representation [tags] you like," but if you are trying to exchange information across companies, both organizations have to come to an agreement about what the tags represent. "Without agreement, it's difficult to communicate and align the data."

CareGroup's Halamka agrees, using his medical industry as an example: "Until the standards get fully refined, we are going to develop one set of tags and Stanford [University researchers] may develop another. When we need to collaborate, we are going to need middleware [software that helps each system understand the other's vocabulary] to communicate."
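
The kind of vocabulary translation Halamka describes might be sketched, in its simplest form, as a tag-renaming step. The Python below is only an illustration under stated assumptions: both vocabularies and the TAG_MAP table are hypothetical, and real middleware would also have to handle differences in structure, units, and meaning, not just names.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one organization's tag names to another's.
TAG_MAP = {
    "patient": "subject",
    "name": "fullName",
    "bloodType": "abo_rh",
}

# Hypothetical source document using the first vocabulary.
SOURCE = "<patient><name>Jane Doe</name><bloodType>O+</bloodType></patient>"


def translate(xml_text, tag_map):
    """Rename every element whose tag appears in tag_map; leave others alone."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = tag_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(translate(SOURCE, TAG_MAP))
```

The mapping table is the agreement Evdemon refers to: without it, neither side can tell that the two sets of tags describe the same data.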

Evdemon explains that another issue is that some initiatives are not quite mature enough for a large-scale multi-enterprise integration. He uses Wall Street, an industry that relies on split-second timing for accurate pricing, as an example. He says, "XML is not concerned with size. If you're modeling a system that depends on system speed or size of device, in those cases you don't want to look at XML because you have to carry additional tags with information about data" (which inflates file size). Yet, in spite of these issues, Halamka says XML "provides a great opportunity to exchange information in a standardized format where no other format exists."
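
To make the size concern concrete, the rough Python sketch below encodes the same record once as XML and once as a bare comma-delimited line, then compares the byte counts. The price quote and its field names are invented for this example, and the comparison ignores compression and binary formats, so treat it as an illustration of tag overhead rather than a benchmark.

```python
import xml.etree.ElementTree as ET

# Hypothetical price quote; field names are invented for this example.
QUOTE = {"symbol": "XYZ", "price": "101.25", "volume": "5000"}


def as_xml(record):
    """Wrap each field in its own tag inside a <quote> element."""
    root = ET.Element("quote")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def as_delimited(record):
    """Emit only the values, relying on field order for meaning."""
    return ",".join(record.values())


def overhead_ratio(record):
    """Size of the XML form divided by the size of the delimited form."""
    xml_bytes = len(as_xml(record).encode("utf-8"))
    flat_bytes = len(as_delimited(record).encode("utf-8"))
    return xml_bytes / flat_bytes


if __name__ == "__main__":
    print(as_xml(QUOTE))
    print(as_delimited(QUOTE))
    print(f"XML is {overhead_ratio(QUOTE):.1f}x the size of the delimited form")
```

The delimited form is smaller only because the meaning of each position is agreed on elsewhere; the XML form carries that description with every message, which is the trade-off Evdemon points to.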
