Architecture, Search, Integration: Classification Is the Common Denominator

Page 1 of 2


My earliest recollection of the chemistry laboratory at high school was the mysterious chart that hung high on the wall behind the teacher. Every one of its squares was filled with a two-letter abbreviation, such as Na or Cl. It was, of course, the Periodic Classification of the elements. In due course, I was introduced to the hidden wonders of the classification, with every element and its properties bearing a predictable relationship to others. Eventually I learned about the atomic structure of the elements, and just why each element had its unique position in the table.

It was the beauty and symmetry of the Periodic Classification that eventually led me to major in chemistry, and along the way I learned about other classification systems. One of them was the Library of Congress classification used in the library at the University of Southampton. It was there that I began to realize that a book might not fit neatly into a particular section of a classification; this one struggled to distinguish between physical chemistry and chemical physics. Of course, the problems of classification affect not just chemists but also physicists, biologists, botanists, zoologists...the list is endless.

I seem to have spent the rest of my life trying to classify things. Probably the ultimate challenge is to work with patent classifications, where in theory you are trying to classify inventions that are novel, and that should therefore always fall outside a classification based on past inventions. With intranets, too, I am finding that a substantial part of my work involves classification. You can't go to an intranet conference nowadays without every speaker referring to the importance of taxonomies, thesauri, and classification. The better the classification, the better the intranet. It is as simple as that.

Content Architecture
Let me start at the home page. Whenever I start to evaluate an intranet for a client, I look first at the headings on the home page. Before clicking on anything, I ask the client to tell me what content lies behind the headings. Often there are headings such as Personnel, Forms, and Travel. "So where are the travel request forms? Under Travel?" I ask. The reply is usually that they can be found under Forms. "What about expense claim forms?" They turn out to be under Personnel. You get the picture. Of course, when the intranet was first set up, there was very little content on it, so allocating headings was easy. As the intranet grew, new content was fitted in wherever it seemed best, with the intranet manager relying on a search engine to find it if all else failed.

In my view, one of the most common reasons for low use of an intranet is that the top-level headings are not intuitive. A key rule in intranets is that within three clicks, you should be at least 85% confident (based on past experience) that further clicks will either locate the information you want, or reveal that the information is not on the intranet. Any wasted clicks going to the wrong headings eat into the three you have available to you. Putting content under headings that assist in reducing the number of clicks is essential.

The best way to arrive at an optimum set of headings is to work through the problem on paper. Give staff a set of cards, each with a subheading written on it, and ask them to put each card in the box bearing the top-level heading under which they would expect to find it. Then keep refining the headings until at least 80% of people put cards with the same subheading in the same box, and do so without having to think about it. Not only will you have evolved a workable classification, but staff will already be attuned to the new headings.
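The 80% test in the card-sorting exercise can be tallied by hand, but a short script makes it easier to repeat after each round of refinement. The following is a minimal sketch in Python, not part of any particular tool: the function names, the participant data, and the way results are recorded (one dictionary per participant, mapping each card to the box chosen) are all assumptions made for illustration.

```python
from collections import Counter

def agreement_by_card(sorts):
    """Map each card to (most popular box, share of participants choosing it)."""
    cards = sorted({card for sort in sorts for card in sort})
    results = {}
    for card in cards:
        choices = [sort[card] for sort in sorts if card in sort]
        box, votes = Counter(choices).most_common(1)[0]
        results[card] = (box, votes / len(choices))
    return results

def cards_needing_rework(sorts, threshold=0.8):
    """List cards whose most popular box falls short of the agreement threshold."""
    return [card for card, (_, share) in agreement_by_card(sorts).items()
            if share < threshold]

# Hypothetical results: each dict is one participant's card -> box choices.
sorts = [
    {"Travel request form": "Travel", "Expense claim form": "Forms"},
    {"Travel request form": "Travel", "Expense claim form": "Personnel"},
    {"Travel request form": "Travel", "Expense claim form": "Forms"},
    {"Travel request form": "Forms", "Expense claim form": "Personnel"},
    {"Travel request form": "Travel", "Expense claim form": "Forms"},
]

print(cards_needing_rework(sorts))
```

With this invented data, four of five participants agree on where the travel request form belongs, so it passes; the expense claim form reaches only three of five and is flagged, which suggests the headings around it need another round of refinement.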

In time, these headings will need changing, and this is why content management software is so important. Unless you can redefine the classification and let the software reallocate the pages to the new headings, there will be total chaos.
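One way to picture this reallocation: if each page is tagged once with a subheading, and the top-level heading is looked up from a separate table, then changing the table moves every affected page at once. The sketch below illustrates that idea only; it does not describe how any specific content management product works, and the page titles, tags, and scheme names are hypothetical.

```python
def build_navigation(pages, heading_for):
    """Group page titles under top-level headings via their subheading tag."""
    navigation = {}
    for title, subheading in pages:
        # Pages whose tag is missing from the scheme are surfaced, not lost.
        heading = heading_for.get(subheading, "Unclassified")
        navigation.setdefault(heading, []).append(title)
    return navigation

# Hypothetical pages, each tagged once with a subheading.
pages = [
    ("Travel request form", "travel-forms"),
    ("Expense claim form", "expense-forms"),
    ("Holiday policy", "leave"),
]

# Redefining the classification means editing only this table.
old_scheme = {"travel-forms": "Forms", "expense-forms": "Personnel", "leave": "Personnel"}
new_scheme = {"travel-forms": "Travel", "expense-forms": "Travel", "leave": "Personnel"}

print(build_navigation(pages, old_scheme))
print(build_navigation(pages, new_scheme))
```

The pages themselves are never edited; only the scheme changes, which is what keeps a reorganization from turning into a page-by-page manual exercise.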

Searching an Intranet
This subject is worthy of a book, not just a hundred words. First, some definitions. A taxonomy is a classification, but a classification may not be a taxonomy. This is because a taxonomy is by definition a hierarchy. Implementing a taxonomy enables a search to be quickly broadened or refined by moving up or down within the taxonomy. A thesaurus provides (where appropriate) broader, narrower, or related terms for a given index word. Then there may be a controlled vocabulary, which ensures that specific words or concepts are always indexed under the same term: "magazine," "newsletter," "trade journal," and "periodical," for example, are all indexed under the term "periodical."
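These definitions can be made concrete with a small sketch. The controlled vocabulary below uses the "periodical" example from the text; the taxonomy (with "publication" as a parent term) and the function names are hypothetical, chosen only to show how a search might be broadened or narrowed by moving up or down a hierarchy.

```python
# Controlled vocabulary: every variant maps to one preferred index term.
PREFERRED_TERM = {
    "magazine": "periodical",
    "newsletter": "periodical",
    "trade journal": "periodical",
    "periodical": "periodical",
}

# Hypothetical taxonomy, stored as child -> parent.
PARENT = {
    "periodical": "publication",
    "report": "publication",
    "project summary": "report",
}

def index_term(word):
    """Return the preferred term for a word, or the word itself if unlisted."""
    key = word.strip().lower()
    return PREFERRED_TERM.get(key, key)

def broaden(word):
    """Move one level up the taxonomy; None if the term has no parent."""
    return PARENT.get(index_term(word))

def narrow(word):
    """Move one level down the taxonomy, returning the child terms."""
    term = index_term(word)
    return sorted(child for child, parent in PARENT.items() if parent == term)

print(index_term("Magazine"))
print(broaden("newsletter"))
print(narrow("publication"))
```

A full thesaurus would also carry related terms alongside the broader and narrower ones; that is left out here to keep the sketch short.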

In addition, consider metatagging, so that there is a differentiation between (say) a report, a briefing paper, a memorandum, a case note, and a project summary. Department titles may need to be defined and tagged in such a way that when the department decides to change its name, all the items in the intranet databases that were originally owned by that department can be retrieved by both the old and new department titles, as appropriate.
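One possible approach to the department-name problem is to tag each item with a stable department identifier and keep the names, old and new, in a separate register. The sketch below assumes that design; the identifiers, the department names (including the renaming of Personnel to Human Resources), and the document records are invented for illustration.

```python
# Hypothetical register: a stable ID carries every name the department has used.
DEPARTMENT_NAMES = {
    "dept-07": ["Personnel", "Human Resources"],
    "dept-12": ["Travel Office"],
}

# Hypothetical items, each metatagged with a document type and a department ID.
documents = [
    {"title": "Expense claim form", "type": "form", "department": "dept-07"},
    {"title": "Recruitment briefing", "type": "briefing paper", "department": "dept-07"},
    {"title": "Travel request form", "type": "form", "department": "dept-12"},
]

def department_id(name):
    """Resolve any past or present department name to its stable ID."""
    wanted = name.strip().lower()
    for dept_id, names in DEPARTMENT_NAMES.items():
        if wanted in (n.lower() for n in names):
            return dept_id
    return None

def find_documents(department_name, doc_type=None):
    """Return titles owned by a department, optionally filtered by type."""
    dept_id = department_id(department_name)
    return [doc["title"] for doc in documents
            if doc["department"] == dept_id
            and (doc_type is None or doc["type"] == doc_type)]

print(find_documents("Personnel"))
print(find_documents("Human Resources", doc_type="form"))
```

Because items point at the identifier and not at the name, a renaming only adds an entry to the register, and searches under either title return the same items.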

The creation of classifications is both seriously important and seriously difficult. Many vendors now offer computer-based approaches to the development of taxonomies and classification schemes, including Semio, Quiver, SmartLogik, Inxight, Northern Light, and many more. The art of computer classification has improved substantially over the last few years, in terms of both quality and speed of development, but in my opinion there is still a substantial requirement for the skills of information professionals who have been taught the fundamental principles of classification and who also know the 'language' of the organization.
