SourceWare: The Search Engine with Good Intentions



When Doyal Bryant recently took over as CEO of Market Central, Inc., a CRM portal company, he realized that he'd inherited a diamond in the rough. This particular gem was an "intent-based" deductive search engine that, according to Bryant, was "just sitting there" languishing within the company. Now released to the general market under the name SourceWare Search, the product gives him the chance to see whether it shines. The company has begun selling licenses and believes the product will appeal to both the commercial and enterprise markets.

Yes, search is the hot topic this year, so what's so special about SourceWare Search? According to Bryant, by incorporating artificial intelligence, SourceWare Search goes much further toward providing searchers with that Holy Grail of the search industry—relevancy.

The average user enters 1.8 words into an initial query and gets back thousands of hits. Current search engines use various methodologies to try to bring the relevant hits to the top of the list, but none has eliminated the problem of excessive unrelated results. The algorithms that search engines use today essentially guess at what the user had in mind. And while it's going a bit far to state that SourceWare Search can read users' minds, on a simple level, that's exactly what it tries to do. The product uses artificial intelligence (AI) to analyze the user's input for implied intent, and then it adjusts to that input. "It adapts to the user," says Bryant. "It can learn what the user is looking for."

Accepting input in natural language, including full questions or phrases (much like AskJeeves.com), SourceWare Search uses AI to measure "the intent of the questions versus the literal meaning of the individual words," says Paul Odom, Market Central's senior VP of software applications and solutions. The key market "differentiator," according to Odom, is that SourceWare Search uses a statistical algorithm that dynamically interacts with the user to provide a drill-down path that is a semantic network rather than a hierarchy.

With a hierarchical search, the user is presented with a list of topics and picks one, which leads to a list of subtopics, from which he or she again picks one, and so on down the tree. This narrows the field but not nearly enough, according to Odom. With a SourceWare semantic search, however, the user is presented with a list of topics but isn't limited to picking just one. The user can combine topics, which leads to another list of subtopics, which can be combined again. Thus the search proceeds not sequentially but exponentially. Hierarchical searching is like addition; semantic searching is like multiplication.
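The difference between picking one topic and combining several can be pictured with a small sketch. It is only an illustration of the general idea and not Market Central's algorithm, which the company has not described in detail. The documents, the topic tags, and the helper names `matching` and `next_topics` are all hypothetical.

```python
# Hypothetical toy corpus: each document carries a set of topic tags.
DOCS = {
    "doc1": {"printers", "drivers", "windows"},
    "doc2": {"printers", "ink", "pricing"},
    "doc3": {"printers", "drivers", "linux"},
    "doc4": {"scanners", "drivers", "windows"},
    "doc5": {"printers", "windows", "pricing"},
}


def matching(selected):
    """Return the names of documents tagged with every selected topic."""
    return sorted(name for name, tags in DOCS.items() if selected <= tags)


def next_topics(selected):
    """Return topics that co-occur with the selection and could refine it."""
    remaining = set()
    for name in matching(selected):
        remaining |= DOCS[name]
    return sorted(remaining - selected)


# Picking a single topic, as in a hierarchy, leaves four candidates.
one_topic = matching({"printers"})

# Combining two topics in one step cuts the candidates to two.
combined = matching({"printers", "windows"})

# The combined selection yields a fresh list of topics to combine again.
suggestions = next_topics({"printers", "windows"})
```

In this toy model each additional topic acts as another filter on the result set, and the list of follow-up topics is recomputed from whatever documents remain rather than read from a fixed tree.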

But perhaps as important as the way the SourceWare solution uses AI to search is the way it uses its AI to tag data. "It puts information away compactly, in a way that facilitates the search," says Odom, and it does it automatically. "All the customer does is put the tagging technology in gear and let it run," says Bryant. "It does the crawling stuff automatically."

The tool's proprietary XML-like tags are "far more compact and efficient than XML but are completely compatible with XML," according to Odom. The tool can be used with data repositories that contain already-tagged data or with ones that contain completely unstructured data.

SourceWare Search also has advantages from the perspective of the content owner/deployer (a target market), according to Bryant. He claims that content owners who switch from one of the current commercial search engines to SourceWare Search will save "80 percent of the cost of infrastructure build-out" (though the company had not yet finalized its pricing structure as of press time).

"Searching today is server intensive," says Bryant. "The big boys like Google need more than just server farms, they need server counties." But SourceWare's approach purportedly eliminates the need for more and more hardware to continually update and process new information. "Our tagging is one place, one time," says Bryant, no duplication of effort or redundancy. "Because SourceWare is zero latent that means you can add information without re-indexing," adds Odom. "The speed and efficiency of the tagging and the compactness of the tags means you need a lot less hardware," says Odom. "Fewer CPUs are needed to extract the data. You can get fast response with far less equipment." And the SourceWare Search architecture is scalable to the terabyte level, according to Bryant. "We're confident that this will scale as big as needed for quite a bit of the future," he says.
(www.marketcentral.com)