EContentmag.com Home
Metadata--Think outside the docs!
By Bob Doyle - Posted May 03, 2005

In the age of the intelligent search engine, the importance of metadata is called into question. It seems that Google can find everything we need by sending its robots to crawl around inside all of our documents. Why bother with the hard work of categorizing, classifying, and tagging each document with metadata that is stored outside the document in a database? Or worse, with metadata buried in XML/RDF tag attributes in a stored version of the document that is rarely served as is, so that today's search engines never see the expensive metadata?


The world of librarians (now repositioned as Information Architects) keeps telling us that their categorizing skills are critically important to organizing information as knowledge. In their terminology, alphabetical subject lists (like the Library of Congress Subject Headings) and classification schemes (like the Dewey Decimal Classification) allow for "precoordinate" indexing of all the world's documents. Precoordinate means the search strings these library experts use to help us find what we are looking for are prepared ahead of time by teams of experts.

But in the age of Google our search strings are called "postcoordinate." We are all do-it-yourself reference librarians. This means we assemble our search strings as we think of them and query Google to see what comes back. If we're clever, we use advanced search techniques and combine search terms with Google's version of Boolean logic, for example: mercury planet -car -element -god.
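A query like the Mercury one can be read as two required terms plus three excluded ones. The following is a minimal sketch of how such a postcoordinate query might be evaluated. The function names and sample documents are invented for illustration, and a real search engine does far more (ranking, stemming, phrase handling).

```python
def parse_query(query):
    """Split a query into required terms and excluded terms (prefixed with '-')."""
    required, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            required.add(token)
    return required, excluded


def matches(text, required, excluded):
    """True if the text contains every required word and no excluded word."""
    words = set(text.lower().split())
    return required <= words and not (excluded & words)


def search(query, documents):
    """Return the titles of all documents that satisfy the query."""
    required, excluded = parse_query(query)
    return [title for title, text in documents.items()
            if matches(text, required, excluded)]


# Hypothetical sample documents, one per sense of the word "mercury".
DOCUMENTS = {
    "astronomy": "mercury is the planet closest to the sun",
    "chemistry": "mercury is an element that is liquid at room temperature",
    "mythology": "mercury was the roman god of commerce and the planet is named after him",
    "autos": "the mercury cougar was a car sold for decades",
}

results = search("Mercury planet -car -element -god", DOCUMENTS)
print(results)
```

With these sample documents, only the astronomy entry contains both required words and none of the excluded ones.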

With all kinds of studies showing that postcoordinate searching is retrieving the right information for 80 to 90% of users, it often seems superfluous to invest the kind of money necessary to tag our docs with metadata to reach those last few customers. If you're a $100 million business, adding a few percent to revenues handily covers the cost of the really fancy taxonomy and metadata strategies. But even if you are a small operation, improving your clients' access to the information they need translates into customer satisfaction and quality of service. That may let you keep the business you have before an off-shore competitor spirits it away with a faster, better, metadata-enabled Web site.

MARC, Please Meet your Party on the Web
Metadata lets intelligent computer programs find the meaning of your content, beyond that discoverable by examining the documents themselves. Librarians called it machine-readable cataloging (MARC). Tim Berners-Lee calls it the Semantic Web. The question the financial types are asking is: when will commonly available Web tools exploit that extra meaning to deliver better information to your audience?

The short answer is: not soon enough to provide a measurable ROI, unless your Web site and intranet provide custom retrieval and navigation tools for your users. Don't invest in a huge metadata design and implementation unless you invest a comparable amount in your own search engine, or in a sophisticated adaptive navigation scheme that exploits the costly metadata with a user interface that your customers actually use.

The good news is that these tools can provide measurable results if they include complete logging of all the search and navigation efforts by your users. The bad news is that looking at the performance metrics may reveal that virtually no one uses your fancy new tools.
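As a rough illustration of what such logging makes possible, the sketch below computes two simple metrics from a search-and-navigation log: the share of sessions that touched each tool, and the fraction of searches that returned nothing. The log format, field names, and tool names are assumptions made for this example, not a standard.

```python
def tool_usage(log):
    """Return, for each tool, the fraction of all sessions that used it."""
    sessions_by_tool = {}
    for event in log:
        sessions_by_tool.setdefault(event["tool"], set()).add(event["session"])
    total_sessions = len({event["session"] for event in log})
    return {tool: len(sessions) / total_sessions
            for tool, sessions in sessions_by_tool.items()}


def zero_result_rate(log, tool):
    """Return the fraction of events for a tool that found no results."""
    events = [e for e in log if e["tool"] == tool]
    if not events:
        return 0.0
    return sum(1 for e in events if e["results"] == 0) / len(events)


# Hypothetical log: one entry per search or navigation action.
LOG = [
    {"session": "s1", "tool": "faceted_search", "results": 3},
    {"session": "s1", "tool": "plain_nav", "results": None},
    {"session": "s2", "tool": "plain_nav", "results": None},
    {"session": "s3", "tool": "plain_nav", "results": None},
    {"session": "s4", "tool": "faceted_search", "results": 0},
]

usage = tool_usage(LOG)
misses = zero_result_rate(LOG, "faceted_search")
print(usage, misses)
```

A low usage share for the metadata-driven tool is exactly the bad news described above, and it is much cheaper to learn from a log than from lost customers.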

DIY Metadata
So what about the relatively inexpensive metadata provided for by the <meta> element in the header of every HTML page? Well-motivated Web page designers have augmented the visible part of millions of Web pages (the stuff between the <body> tags is the real content of your document) with invisible <meta> information like keywords and descriptions.
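To make the idea concrete, here is a small sketch that pulls the keywords and description out of a page header using the HTML parser in Python's standard library. The sample page and the MetaExtractor class are made up for illustration.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page: the <meta> elements are invisible to a human reader.
PAGE = """<html>
<head>
<title>Pain relief</title>
<meta name="keywords" content="Tylenol, acetaminophen, pain relief">
<meta name="description" content="Product page for Tylenol.">
</head>
<body><p>Tylenol product information.</p></body>
</html>"""

extractor = MetaExtractor()
extractor.feed(PAGE)

keywords = [k.strip() for k in extractor.meta.get("keywords", "").split(",")
            if k.strip()]
description = extractor.meta.get("description")
print(keywords, description)
```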

You could use the <meta> tag to easily implement one of the most important but overlooked uses for metadata: search term explosion. To do this, you create a synonym ring, a list of terms that are essentially equivalents of the terms explicitly in your document, including abbreviations, acronyms, and even misspellings. When your visitor types in "tilenol," your metadata check tells the search engine to serve the page with Tylenol on it.
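A synonym ring and the search term explosion it enables can be sketched in a few lines. The ring contents, page URLs, and function names below are illustrative assumptions. In practice the ring would be built from your own metadata, and the matching would be done by your own search engine.

```python
# Hypothetical synonym ring: terms treated as equivalent for retrieval,
# including a generic name and common misspellings.
SYNONYM_RINGS = [
    frozenset({"tylenol", "tilenol", "tylenal", "acetaminophen"}),
]


def build_index(rings):
    """Map every term in a ring to the whole ring it belongs to."""
    index = {}
    for ring in rings:
        for term in ring:
            index[term] = ring
    return index


def explode(query, index):
    """Expand each query word into its full synonym ring, if it has one."""
    terms = set()
    for word in query.lower().split():
        terms |= index.get(word, {word})
    return terms


def search(query, pages, index):
    """Return the pages that contain any of the exploded query terms."""
    terms = explode(query, index)
    return [url for url, text in pages.items()
            if terms & set(text.lower().split())]


# Hypothetical site content.
PAGES = {
    "/products/tylenol": "Tylenol relieves headaches",
    "/products/bandages": "adhesive bandages for small cuts",
}

index = build_index(SYNONYM_RINGS)
with_ring = search("tilenol", PAGES, index)
without_ring = search("tilenol", PAGES, {})
print(with_ring, without_ring)
```

With the ring in place, the misspelled query finds the Tylenol page. With an empty index, the same query finds nothing.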

Search term explosion is easy to do and incredibly valuable when it retrieves what your visitors are looking for. Without it, you simply lose the business. But again, this metadata will be wasted unless you are in control of your search technology. Adding synonyms to your <meta name="keywords"> tag does not help with most public search engines. The problem is abuse of the <meta> tag by aggressive Web-page designers who misrepresent the contents of their pages to improve their search engine positioning. They have poisoned the metadata well.

Potentially poisoned or not, the bottom line is that metadata management is an integral part of sophisticated content management, but only if you control the complete user experience.



All Content Copyright © 1998 - 2010, Online: a Division of Information Today Inc.
48 South Main St., Suite 3 · Newtown, CT 06470-2140
(203) 761-1466, (800) 248-8466 · Fax (203) 304-9300 · custserv@infotoday.com