Complex Search @ Your Web Service

Page 2 of 3


A Different Kind of Search
Traditional search produces a set of results, but the search tool looks only at text and cannot manipulate those results. Schireson says, "The first major limitation of traditional search is that it really looks at documents as being text, and most content has some structure associated with it, which is ignored in traditional search engines." This structure is the metadata associated with the result, and by taking advantage of this metadata, users can achieve finer control over the results.

"Traditional search engines," says Schireson, "just return a URL and a snippet of text." He says a Web Services query allows you to be more specific about the query because the markup can be more finely grained, and as a result, the search results can be all that much more precise. "Our server allows you to manipulate and render content to find the most relevant paragraph on a given subject, and return not just the URL of the book where it occurred, but return the finest-grained sub-section around that paragraph," he says. This approach allows the user "to understand the information in context and where it fits in their overall work."

As an example, Schireson says one of his customers, science and medical publisher Elsevier, was looking for a way to better understand its vast repository of data and to target results to the needs of the individual user, presenting them in the context of that need. "Elsevier is the world's largest scientific, technical, and medical publisher. Their big asset is a bunch of content." Elsevier executives wanted to know how "to maximize the value of this content to the company by maximizing the value to a customer in any individual interaction," says Schireson. "There is an imperative for them to develop new products that target users specifically. Well, it turns out that the most time consuming part of that is assembling content behind it. What they've done using our technology is built a repository in XML, which they can then use to develop new products."

Schireson says that Elsevier has built a Web Services layer on top of its content repository, and each new product is essentially an HTML application that talks to that layer. This gives the company a lot of control over the search, which results are returned, and how those results are presented.

By building search as a Web Service in this fashion, search becomes a platform on top of which companies can build HTML applications that provide more concrete and specific ways to get at the data. According to Gilbane, "By using the search engine as a platform, users have a much more lightweight and feasible way to get their hands around large amounts of information. The advantage Mark Logic has," Gilbane continues, "is its XML structure and granularity. So if you are building an application on top that is going to use Mark Logic as a service to feed it, then you can build some sophisticated metadata into the database in advance."
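
The layering described here, a search service that returns structured results and a thin HTML application that decides how to present them, might look something like the following Python sketch. Everything in it is hypothetical: search_service and render_html are stand-ins invented for illustration, the repository is an in-memory list, and a real deployment would put an HTTP or SOAP interface between the two functions.

```python
import xml.etree.ElementTree as ET
from html import escape


def search_service(query, repository):
    """Stand-in for the service layer: return matching records as XML text."""
    results = ET.Element("results", {"query": query})
    for record in repository:
        if query.lower() in record["text"].lower():
            hit = ET.SubElement(results, "hit", {"source": record["source"]})
            hit.text = record["text"]
    return ET.tostring(results, encoding="unicode")


def render_html(results_xml):
    """Stand-in for the front end: turn the service's XML into an HTML list."""
    root = ET.fromstring(results_xml)
    items = [
        f"<li><cite>{escape(hit.get('source', ''))}</cite>: {escape(hit.text or '')}</li>"
        for hit in root.iter("hit")
    ]
    return "<ul>" + "".join(items) + "</ul>"
```

Because the front end sees only the XML the service returns, a publisher could build several differently styled products on the same repository by swapping out render_html alone.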

Search Engines Can Play Too
It's not just vendors like Siderean, Mark Logic, and Ipedo that are using XML to enhance the search experience. With increasing frequency, Web search vendors such as Yahoo! and Alexa are building Web Services interfaces to content and inviting programmers to build corresponding applications. In fact, Allen of Siderean says that whether you are talking about his Seamark product or Alexa exposing its tools to developers, it's all about accessing needed information and finding ways to use that data beyond the initial results. "It's this aspect of treating results as a resource . . . and eliminating the overhead that people had to wrestle to get that search information into a useful form." He says that "things like the Alexa engine that was put out, and the A9 initiative to extend RSS as a carrier for search information . . . make it simple for a developer to grab and augment a given application, where the query is a way to filter down to a broader set of things, something that a developer is trying to build into a workflow or a decision-making process." And, he concludes, "the closer we get to delivering results easily in workable forms, the closer we get to this notion of search as a Web Service."

Jeremy Zawodny, technical lead at Yahoo!, says that although the company had made its APIs available for some time, it did so only through specialized business relationships involving search syndication. About a year ago, Yahoo! began offering access to search features as a Web Service to give developers an idea of what kinds of features were available from Yahoo!. He says, "We didn't have a way to let the general population of Web developers or smaller emerging companies plug into what we're doing and make it available to a broader set of end users and developers." The Yahoo! Developer Network, according to Zawodny, gives developers access to Yahoo! offerings such as its Web Search and Photo Search. Yahoo! also has a term-extraction Web Service that Zawodny says developers have been using to tag content on the Web.

According to Zawodny, developers use these services in many different ways. For instance, a company called Rollyo helps users build a small customized vertical search engine. "The idea is, I can go to Rollyo and build a list of trusted sources, then provide a customized search box and put it on my Web site, and anyone who visits my Web site can conduct a search across those sources. In effect," he says, "they are tapping into my knowledge and the sources that I trust in order to get the information for whatever topics they are looking for.

"There's a whole other class of applications that fall into technology demonstrations, show-off, or mash-up applications where people are taking our search interface and building fun things or new visualization tools, things that aren't standalone products but are still demonstrations of where things could be headed if we decide to go one direction or another with next-generation products," Zawondy continues. "One of those I've seen, someone took some RSS feeds from Yahoo! News and used our search interface to Yahoo! Image Search, and provided a new way to navigate news stories on Yahoo! by building a navigation scheme where you navigate using pictures."

Alexa, a service owned by Amazon that provides Web statistics data, opened up some of this data as a Web Service about a year ago under the name Alexa Web Information Service. Among the services Alexa offers, according to Niall O'Driscoll, VP of engineering at Alexa Internet, is one that provides a way to tag a collection of content and then, based on this, build a set of trusted search results (much like what Zawodny describes for Yahoo!). O'Driscoll says one developer in Germany is running a music site using the Alexa Web Service that enables visitors to search for music by melody. The Alexa search gave the developer access to MIDI files, from which he was able to extract this information, build a database of music files, and present a unique melody-based search engine.
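
The trusted-sources idea behind both Rollyo and the Alexa tagging service can be approximated by filtering ordinary search results against a list of approved hosts. The Python sketch below shows one possible approach under that assumption; it is not how either service actually works, and filter_trusted and the result dictionaries are hypothetical.

```python
from urllib.parse import urlparse


def filter_trusted(results, trusted_hosts):
    """Keep only results whose host is a trusted host or a subdomain of one."""
    trusted = {host.lower() for host in trusted_hosts}
    kept = []
    for result in results:
        # hostname is None for malformed URLs, so fall back to an empty string.
        host = (urlparse(result["url"]).hostname or "").lower()
        if any(host == t or host.endswith("." + t) for t in trusted):
            kept.append(result)
    return kept
```

A site owner would maintain the trusted_hosts list, and visitors searching through that site's search box would see only results from those sources. Matching on the dot-prefixed suffix keeps a look-alike host such as notexample.org from passing as example.org.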
