Keys to KM: Selecting a Search System

In my last column I discussed the importance of selecting a single enterprise search system as part of a corporate Knowledge Management (KM) program. Here I'll look at taxonomies, categorization, product creep, and XML as further differentiators.

Most companies have several search solutions (for their intranet, email, and other systems) because search came bundled with each application. Search was "free," so there was little incentive to purchase a single unified search solution. As the number of information repositories grows, however, searching becomes increasingly difficult. With bundled search systems, you can't query several repositories at once, and users must learn the ins and outs of searching each repository.

Solving these problems requires selecting a single enterprise search system. One decision criterion is the system's basic technology and the inevitable tradeoffs that flow from the vendor's design decisions. Examples include ease of use with self-refining searches (typically statistically based systems) versus precision and control in searches (typically keyword- and thesaurus-based systems).

Technology is only the first consideration, though. The KM specialists who often drive the selection of an enterprise search solution also sponsor taxonomy projects, which complicates the choice. Taxonomies are useful for developing portal and intranet navigation, folder views for content management systems (CMS), and keywords in search systems. Well-designed taxonomies can provide consistency and familiarity across enterprise applications.

So do you pick a self-improving search system or one designed to work with keyword attributes? Product creep makes the question harder. Like fast food vendors supersizing their offerings, search vendors are growing their products beyond mere search to increase revenues and market share. Verity and Autonomy, which use different search technologies, now both offer taxonomy management and categorization services. Categorization services, the flip side of taxonomies, assign documents to sets of folders derived from a taxonomy, add keywords for enhanced attribute searches, or suggest choices for subject matter experts.
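To make the idea of a categorization service concrete, here is a minimal sketch of the keyword-driven variety: it files a document into taxonomy folders and records which keywords matched. The folder names, the keyword lists, and the `categorize` function are all invented for illustration; commercial products from vendors such as Verity or Autonomy work quite differently under the hood.

```python
# Hypothetical taxonomy: folder path -> trigger keywords (illustrative only).
TAXONOMY = {
    "Products/Search": {"search", "query", "index"},
    "Products/Content Management": {"document", "repository", "workflow"},
    "Standards/XML": {"xml", "schema", "dtd"},
}


def categorize(text, taxonomy=TAXONOMY, min_hits=1):
    """Return {folder: sorted matched keywords} for folders with enough hits."""
    # Crude tokenization: split on whitespace, trim punctuation, lowercase.
    words = {w.strip(".,;:!?()\"'").lower() for w in text.split()}
    matches = {}
    for folder, keywords in taxonomy.items():
        hits = words & keywords
        if len(hits) >= min_hits:
            matches[folder] = sorted(hits)
    return matches


doc = ("The repository exposes an XML schema so that each document "
       "can be found by a search query.")
result = categorize(doc)
for folder, keywords in result.items():
    print(folder, "->", ", ".join(keywords))
```

The matched keywords could be stored as document attributes for later attribute searches, or shown to a subject matter expert as suggestions rather than applied automatically.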

Just as search vendors' products are expanding, database and content management vendors are beginning to offer search and categorization services. Documentum, a premier document and content management vendor, offers a product called "Content Intelligence Services." CIS currently works only within Documentum repositories, but could grow into a standalone enterprise service. And Documentum's chief search partner is Verity; how's that for a confusing search choice?

Lastly, you can't avoid dealing with XML, which is becoming both a lingua franca across systems and a type of content to be searched. KM best practices avoid reinventing the wheel by using industry-specific taxonomies, like Medical Subject Headings (MeSH). Taxonomies like MeSH are expressed in XML. Although most organizations today have no XML documents to search, content-centric XML is on the rise. Microsoft gave that movement a boost with its recent announcement that Office 11, the next version of its Office XP productivity suite, will support XML.
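As a rough illustration of why an XML-expressed taxonomy is easy for other systems to consume, the sketch below reads a small taxonomy and turns each term into a folder-style path. The element and attribute names (`taxonomy`, `term`, `name`) and the sample terms are assumptions made up for this example; they are not the actual MeSH schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical taxonomy markup, invented for illustration; not the MeSH schema.
TAXONOMY_XML = """
<taxonomy>
  <term name="Diseases">
    <term name="Infections">
      <term name="Influenza"/>
    </term>
    <term name="Neoplasms"/>
  </term>
</taxonomy>
"""


def term_paths(xml_text):
    """Map each term name to its full path from the taxonomy root."""
    paths = {}

    def walk(node, prefix):
        # Visit only direct <term> children so nesting depth is preserved.
        for child in node.findall("term"):
            path = prefix + [child.get("name")]
            paths[child.get("name")] = "/".join(path)
            walk(child, path)

    walk(ET.fromstring(xml_text), [])
    return paths


paths = term_paths(TAXONOMY_XML)
print(paths["Influenza"])  # prints Diseases/Infections/Influenza
```

The same paths could feed portal navigation, CMS folder views, or search keywords, which is the cross-application consistency that a shared taxonomy is meant to provide.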

Is there any good news in this quest for a single search solution? Even with their diverging system designs, search vendors are filling their product gaps, whether by supporting taxonomies or by adding the ability to categorize existing content. And most are getting XML standards-based religion.

Questions that can help you narrow down a search solution concern your enterprise architecture and the vendor's support for standards. For example, what application does your organization value most: search itself, content management, or databases? If content is king, then you might favor a content management vendor's offering for search and taxonomy services. If you have many heterogeneous repositories to search, you might emphasize a search vendor's product.

As to standards, inquire about vendors' XML standards pedigrees. Are they members of the W3C, OASIS, or similar organizations? If they are, which committees are they on and what is their membership level? How does their product exploit XML and associated standards?

I asked Verity and Autonomy about the need for XML support in search systems. Verity said it is seeing strong demand for XML information retrieval, since many content providers are increasing their XML offerings. Verity also said that its software uses XML technology to enable advanced search. Ron Kolb, director of technical strategy at Autonomy, replied that although most current content is not XML-tagged, he sees "a growing trend to provide a tagging strategy so that applications themselves can do the searching."

Selecting a single search system is akin to solving a Rubik's Cube: the entire problem changes every time you tackle a small part of it. But done right, unified search creates an indispensable layer in a KM foundation.