EContentmag.com Home
The Relevance of Recall
By Martin White - May 2008 Issue, Posted May 02, 2008

In early February, I had a most enjoyable lunch with four people whom I have known for most of my professional life. At a rough guess, our combined service in the information profession amounted to the better part of 150 years. We met as the selection committee for the Tony Kent Strix Award.


Tony Kent, who died in 1997, made a major contribution to the development of information science and information services both in the U.K. and internationally, particularly in the field of chemistry. The award (www.ukeig.org.uk/awards/tonykentstrix.html) is given in recognition of an outstanding contribution to the field of information retrieval. Our group reflected on the extent to which the foundations of information retrieval were laid more than 40 years ago. For example, there is currently much interest in faceted navigation from companies such as Endeca and Siperian, the basis for which is the work on library classification schemes by the Indian mathematician S. R. Ranganathan in the 1930s.

At the heart of assessing search engine performance is the concept of relevance, a word that dates from 1733. Much of the early work on relevance in an information retrieval context was carried out in the 1950s, and relevance has been the subject of research ever since. However, if you listen to the assertions of certain search vendors, you wouldn’t think this was the case. When I queried the lack of any indication of relevance in the results from their search engine, one vendor recently told me that I had an old-fashioned view of search.


It is generally recognized that users are unwilling to go beyond 30 results (usually three pages) unless they see a good reason for doing so. The value of a relevance ranking, be it a percentage or a "star" graphic, is that it indicates the point at which the long tail of largely irrelevant search results starts. If the percentage relevance is still around 90% after 30 hits, the result set contains too many strong matches to work through, and the user realizes it’s time to change the search strategy, either by using different keywords in the Basic search box or by using an Advanced Search option. On the other hand, if the percentage has already dropped to 70% by the end of the first page of results, the user can feel reassured that clicking on further pages is not going to be of value.
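The rule of thumb above can be written down as a small decision function. This is only a sketch of the reasoning, not how any particular search engine behaves: the function name `next_step`, the 0-to-1 score scale, and the default thresholds (10 results per page, three pages, 90% and 70%) are assumptions taken from the figures quoted in the paragraph.

```python
from typing import Sequence


def next_step(scores: Sequence[float], page_size: int = 10, pages_viewed: int = 3,
              still_high: float = 0.90, tail_starts: float = 0.70) -> str:
    """Suggest what a user might do next, given relevance scores between 0.0 and 1.0.

    All names and thresholds here are illustrative assumptions.
    """
    # A results list is ranked best first; sort defensively in case the input is not.
    ranked = sorted(scores, reverse=True)
    if not ranked:
        return "no results"

    # Relevance of the last hit on the first page.
    first_page_last = ranked[:page_size][-1]
    if first_page_last <= tail_starts:
        # The long tail has already begun: later pages are unlikely to help.
        return "stop"

    # Relevance of the last hit the user is likely to look at (30 by default).
    depth = page_size * pages_viewed
    if len(ranked) >= depth and ranked[depth - 1] >= still_high:
        # Still highly relevant this deep: the query is too broad, so refine it.
        return "refine"

    return "continue"
```

Without a relevance indication on the results page, a user has none of the inputs this function needs, which is the point of the complaint.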

The usual reaction to my concern about any lack of relevance indication is that Google doesn’t do it. One of the reasons for this (and there are others) is that with such large result sets, a relevance ranking would not be very useful. Not so with enterprise searches. Incidentally, I am also tired of people telling me how quickly a search with Google is completed. There is confusion between the speed with which the results are returned and the time it takes to work through them to find the best information. When you do a Google search on a broad topic, time how long you spend working through the result set to find useful results. You may be unpleasantly surprised at how slow your search for information really is.

Another important facet of relevance is recall. In a web search, recall (the percentage of all the relevant documents in the collection that the search actually returns) is not of great value. However, in an enterprise setting it can be very important to be as certain as possible that you have found all relevant documents. In a court of law, or even in a meeting with your manager, worrying that the search engine may have missed something is not a good feeling. Achieving high recall requires a lot of dedicated work by the search team. And I mean team. Too often search is relegated to someone in IT on a part-time basis. In most sizeable organizations, there needs to be a search manager, someone doing serious and sensible analysis on search logs, another person who understands the formal and informal taxonomies of the organization, and someone on the help desk. That’s a total of four people, and it can’t be done well with fewer. The challenge is especially high with search engines using semantic/statistical search, where tuning can become a nightmare.
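To make the recall measure mentioned above concrete, here is a minimal sketch that computes it alongside precision (the share of returned documents that are relevant). The function name and the document identifiers are hypothetical, and the sketch assumes someone has already judged which documents in the collection are relevant, which in practice is the hard part.

```python
from typing import Iterable, Tuple


def precision_recall(returned: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    returned: IDs of the documents the search engine brought back.
    relevant: IDs of every document in the collection judged relevant.
    """
    returned_set, relevant_set = set(returned), set(relevant)
    hits = returned_set & relevant_set  # relevant documents that were actually found

    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(returned_set) if returned_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: four documents returned, five relevant ones exist.
returned_ids = ["memo-1", "memo-2", "memo-3", "memo-4"]
relevant_ids = ["memo-1", "memo-3", "memo-7", "memo-8", "memo-9"]
precision, recall = precision_recall(returned_ids, relevant_ids)
print(f"precision={precision:.0%}, recall={recall:.0%}")
```

In this made-up case half of what came back is relevant, but three of the five relevant documents were never found, which is exactly the situation that matters in a legal or compliance search.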

It is easy to be dazzled by search technology and by vendors who create the impression they are inventing new approaches to search usability. Instead, go to www.dcs.gla.ac.uk/Keith/Preface.html and read the standard text on information retrieval by Keith van Rijsbergen (published in 1979!) to gain a real insight into the basic principles of effective search. You might also send the link to your search vendor so that it can get a taste of good old-fashioned relevance.


All Content Copyright © 1998 - 2010, Online: a Division of Information Today Inc.