Applied Semantics: Making Meaning Matter
By Marla Misek - June 2002 Issue, Posted Jun 01, 2002

Profiled: Applied Semantics
www.appliedsemantics.com
Co-Founder: Gilad Elbaz
CEO: Jordan Libit
Number of Employees: 34
Founded: 1998


We've all been there: You enter a word or phrase into the "search" field of any major search engine, and instead of netting a targeted list of hits worthy of further exploration, you're inundated with page after page of sites featuring one or more of the words you entered, regardless of their meaning, context, or relevance to the search you're conducting. A common problem, it's one professional searchers have grudgingly learned to live with and work around.

It's also a reality that many content management solutions providers are working to change. Among them is Los Angeles, California-based Applied Semantics (née Oingo), a software developer whose mission is to empower businesses to better organize, manage, and retrieve digital information in Web-enabled, enterprise, and ecommerce environments. The innovation of two California Institute of Technology graduates on a quest to make computers more "human-literate," Applied Semantics today offers a product suite of enterprise solutions that, in the words of Co-Founder Gil Elbaz, "help knowledge managers extract more value from their content and save money" in the process.


Circa 1998
At the heart of the Applied Semantics product line is Conceptual Information Retrieval and Communication Architecture (CIRCA), a communication platform that is scalable, language-sensitive, intelligent, and refreshingly accurate in making information locatable. The proprietary technology is based on an extensive ontology consisting of millions of words, meanings, and their conceptual relationships to other meanings in the human language. Drawing on what is thought to be the world's largest database of general knowledge, with more than 1.2 million words, half a million concepts, and tens of millions of relationships, CIRCA matches words and phrases to its ontology, performs linguistic analysis, disambiguates them into meanings, and weighs those meanings by importance, thus making computers more effective in managing and retrieving information. (For example, the word "java" would be recognized as an alternate name for coffee, an Indonesian island, and a computer language.)

"CIRCA is about figuring out what a document is really about," Elbaz explains. "Unlike typical search engines, like Verity, Google, or AltaVista, which retrieve information based on the exact string of text [the user enters], CIRCA maps words in the document with concepts in our ontology. Once we have a representation of what the document is about, we can then summarize and categorize it."

Indeed, the soul of CIRCA is its ontology, which Applied Semantics has built and updates continuously in three ways. In addition to employing a team of 15 lexicographers and computational linguists who manually add information to the database, the company gathers data through a process called mechanical ontology expansion. "Basically, we crawl significant chunks of the Web [using proprietary algorithms] looking for patterns of repetition," says Elbaz. "You can actually derive the relationships between objects and terms in this manner. Finally, we license data via free public databases and other specialized sources…and purchase data for customers who want specific vertical knowledge bases built into the ontology."
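
To make the "java" example concrete, here is a minimal Python sketch of meaning-based disambiguation. It is not CIRCA's algorithm, which is proprietary and not described in detail here; it swaps in a simple word-overlap score against a tiny hand-built ontology. The `ONTOLOGY` contents and the `disambiguate` function are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical miniature ontology (an assumption, not Applied Semantics data):
# each word maps to candidate meanings, and each meaning to related cue words.
ONTOLOGY = {
    "java": {
        "coffee": {"coffee", "cup", "brew", "roast", "espresso"},
        "Indonesian island": {"island", "indonesia", "jakarta", "volcano", "travel"},
        "programming language": {"code", "compiler", "class", "software", "programming"},
    },
}


def disambiguate(word, document):
    """Rank the candidate meanings of `word` by cue-word overlap with `document`."""
    # Lowercase the text and strip surrounding punctuation from each token.
    tokens = Counter(t.strip(".,;:!?\"'()").lower() for t in document.split())
    scores = {}
    for meaning, related in ONTOLOGY.get(word.lower(), {}).items():
        # Weight a meaning by how often its related words occur in the document.
        scores[meaning] = sum(tokens[r] for r in related)
    # Highest-scoring meaning first; ties keep the ontology's original order.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


doc = "The compiler rejected my Java class, so I rewrote the code."
ranked = disambiguate("java", doc)
print(ranked)
```

For this sample sentence the programming-language meaning ranks first, because "compiler," "class," and "code" all appear in the text. A production system would presumably rely on far richer linguistic analysis than this word count.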

The company itself originated with Elbaz and Co-Founder/CTO Adam Weissman, who launched Oingo in 1998 with the purpose of "focusing on unstructured information," Elbaz recalls. "We were trying to create a meaning-based search engine that would be based on a new way to store and represent knowledge. We did, in fact, successfully launch a search engine that continues to run today." (Oingo.com conducts meaning-based searches across 15 broad categories and hundreds more subcategories, including arts, business, computers, health, news, reference, shopping, and sports. It continues to be powered by CIRCA, and is operated by Applied Semantics' Naming Solutions division.)

"As the market shifted, we wanted to take the CIRCA technology and apply it to specific enterprise solutions," Elbaz continues. Since changing the company's name to Applied Semantics in May 2001 in an effort to better reflect Oingo's altered business model, Elbaz and Weissman's team has targeted the publishing, pharmaceutical/biotechnology, and financial services industries with three principal products:

  • Auto-Categorizer, a plug-in to existing data-management technologies that automatically assigns documents to a predefined or customized directory to improve knowledge mining and retrieval;
  • Page Summarizer, which deciphers the meanings of documents and provides customized, accurate summaries to improve the knowledge discovery process; and
  • Metadata Creator, a plug-in to existing search technologies that adds automated metadata to improve knowledge discovery.

Available separately or in a package solution, Applied Semantics' enterprise tools use the same XML language to communicate results back to the user. Enterprise customers include the Smithsonian Institution and QwestDex Direct, which recently acquired a database of more than 2.3 million businesses from a third party in an effort to increase the list inventory available in its DotComDirectory.com business database. In order to quickly integrate those entries into its existing inventory, QwestDex needed an automated solution that would assign each of the 2.3 million Web sites to one or more of its 4,500 yellow page headings. Enter Auto-Categorizer, which was able to map the 4,500-topic taxonomy in four days.
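
The QwestDex task, assigning each site to one or more headings and reporting the result as XML, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Auto-Categorizer's method or output format: the `HEADINGS` cue words, the `categorize` and `to_xml` functions, and the `document` and `category` element names are all hypothetical, and plain keyword matching stands in for the product's meaning-based analysis.

```python
import xml.etree.ElementTree as ET

# Hypothetical slice of a yellow-pages-style taxonomy: heading -> cue words.
HEADINGS = {
    "Restaurants": {"menu", "dining", "chef", "catering"},
    "Caterers": {"catering", "banquet", "events"},
    "Plumbers": {"plumbing", "pipes", "drain"},
}


def categorize(text, min_hits=1):
    """Return every heading matched by at least `min_hits` distinct cue words."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    matches = []
    for heading, cues in HEADINGS.items():
        hits = len(words & cues)
        if hits >= min_hits:
            matches.append((heading, hits))
    # Strongest match first; ties keep the taxonomy's original order.
    return sorted(matches, key=lambda m: m[1], reverse=True)


def to_xml(url, matches):
    """Serialize the assigned headings as XML (element names are assumptions)."""
    root = ET.Element("document", {"url": url})
    for heading, hits in matches:
        ET.SubElement(root, "category", {"hits": str(hits)}).text = heading
    return ET.tostring(root, encoding="unicode")


site_text = "Family dining with a seasonal menu. Catering for banquet events."
matches = categorize(site_text)
xml_result = to_xml("http://example.com", matches)
print(xml_result)
```

Here a single site description lands under both "Restaurants" and "Caterers," mirroring the "one or more headings" requirement. Returning every qualifying heading rather than only the top one is the design choice that makes multi-heading assignment possible.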

Circa 2002 (and Beyond)
"One of the problems for content managers is getting the content in a format that will allow other people to find it," Elbaz laments. "One of the main ways that is done is by putting the right metadata on it, whether it be through categorization or appropriately summarizing the information. What we're doing, essentially, is providing that metadata for them."

"Early content-management solutions are mostly about the publishing process and the fundamental needs of managing documents," adds Steve Bernstein, general manager of the Enterprise Solutions division. (Bernstein, former vice president of product marketing for Inxight, joined Applied Semantics last October.) "Questions of concern in those days were, 'Where do we keep the documents?', 'Who determines when a document is complete?', and 'How do I manage version control?' The next step in that hierarchy is categorization, or 'How do I create information about the documents that will be relevant in the future?' Once you have a content-management system, the workflow, and the conversion aspects squared away, then it's about retrieval and understanding patterns.

"You can't be content-free," Bernstein continues. "You have to have an understanding of how topics relate to one another culturally. That's the only way, and it's the right approach to solving the vast number of problems content managers face."

Those problems include properly addressing the components of knowledge management that Elbaz says content managers often neglect, namely the "middle steps" that allow them to manage their content more effectively. "We are just in the infancy of trying to get computers to work with language in such a way that they're deriving actual meaning from a document and doing something intelligent with that document," he notes. "Once we get the core technology down, I'm looking forward to improving voice recognition [and its relationship to] data mining, monitoring chat groups, and automatic learning. There are so many interesting applications yet to be explored."


All Content Copyright © 1998 - 2010, Online: a Division of Information Today Inc.