Finders Keepers

Jan 03, 2006

Keeping Found Things Found is a multi-year project at the University of Washington Information School by Professors William Jones and Harry Bruce, with Susan Dumais of Microsoft Research. The team is studying the various ways people attempt to make interesting Web pages they've found easily accessible later. With $500,000 in National Science Foundation support, the KFTF team identified techniques like sending yourself an email that includes the URL or adding a link to a personal Web site, in addition to the more obvious methods like bookmarking a page, saving a local copy, or even printing it out and adding it to a huge pile of paper documents (likely never to be read again).

Bookmarking a favorite is the most common technique for making a Web page re-findable. The next most common behavior is to do nothing at all and expect that the page will be found again in the future through a search engine (which is the most common way of finding things in the first place). Bookmarking is not only more reliable, it also allows for additional organization that makes favorite sites easier to find again. With a little bit of bookmark organization, you can arrange things in a taxonomy of folders and subfolders for finding them later. Arranging your stuff is formally known as Personal Information Management (PIM) or, by some, Personal Knowledge Management (PKM). Today we may recognize it as a problem in Personal Content Management (PCM).

When you come across a piece of content, there are things you can do to make it "yours" within copyright and fair-use limitations, in order to improve your ability to do your job or pursue your career or other interests. Top KM gurus like Jerry Ash, Tom Davenport, and David Gurteen all tout the power of PKM. Professor David Karger of MIT has developed Haystack, a powerful personal semantic-Web tool for organizing everything in your information universe.

But let's return to the relatively simple problem of making a Web page re-findable. It seems a problem crying out for tagging with a little metadata, the way social bookmarking services let you tag your bookmarks. Since bookmarking is the leading way to make a page re-findable, the added value of tagging, plus the participation in a community of taggers who share your interests, appears to be an obvious plus.

Even bookmarking has its foibles, though. (Don't worry, tags may help here too.) What if a bookmarked site disappears into a 404 black hole? You could save the page with an online service like LookSmart's FURL. The FURL toolbar can be added to your browser to provide one-click saving of a Web page to your FURL account. FURL offers some limited categorization tools, which let you share documents with others, but you will probably want your own categories. Or you could protect yourself by saving a local copy, using advanced tools like InfoSelect or NetSnippets to capture the page and add organizing metadata. Don't forget to add the original URL and the date you cached a copy so you can get the current version easily, if there is one. Some tools will do this for you. Then you can use desktop search tools from Google, Apple, Microsoft, or others to re-find your stuff.
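If you would rather roll your own than rely on one of those tools, the bookkeeping is small. Here is a minimal sketch in Python, assuming you already have the page's HTML in hand; the function name `save_local_copy`, the file-naming scheme, and the JSON field names are all my own inventions for illustration, not how FURL, InfoSelect, or NetSnippets actually store things.

```python
import json
from datetime import date
from pathlib import Path


def save_local_copy(html, url, tags, folder, captured=None):
    """Write a page's HTML plus a small metadata file next to it.

    Hypothetical helper: the layout and field names are assumptions.
    """
    captured = captured or date.today()
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)

    # Build a file name from the capture date and a sanitized URL.
    safe = "".join(c if c.isalnum() else "_" for c in url)[:80]
    stem = f"{captured.isoformat()}_{safe}"

    # Save the page itself.
    page_path = folder / f"{stem}.html"
    page_path.write_text(html, encoding="utf-8")

    # Save the original URL, capture date, and tags alongside it,
    # so the current version can be looked up later.
    meta = {
        "original_url": url,
        "date_captured": captured.isoformat(),
        "tags": sorted(tags),
    }
    meta_path = folder / f"{stem}.json"
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return page_path, meta_path
```

Because the metadata lives in a plain text file beside the page, a desktop search tool should be able to index the URL, date, and tags along with the page content.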

As I mentioned, search engines are the most common way people find sites of interest in the first place, and often the place they turn to re-find those sites later. In their work on Keeping Found Things Found, Jones and Bruce asked test subjects what they called the "Google Question": "Suppose that you could find your personal information using a simple search (fast, effortless to maintain, secure and private). Can we take away your folders?"

Thirteen of fourteen respondents said "No." And the fact is that not only does Google want to help you organize your content, they want to know how you are doing it. But there must be a better way, especially for companies that want to keep their knowledge and content organization to themselves, locked securely behind their firewalls.

The new Memetic Web initiative proposes that you reuse your existing taxonomies or keyword lists by creating meme IDs with your own memespace and taxospace prefixes. You then simply tag a saved Web page with your meme IDs, and the page can be precisely re-found with any Web or desktop search engine. Tagging with unique IDs from your existing knowledge organization scheme greatly increases the return on investment in such classification schemes. Knowledge workers in a corporate environment with a shared controlled vocabulary can leverage that vocabulary as they categorize found documents.
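To make the idea concrete, here is a rough Python sketch of why a unique, prefixed ID can be re-found precisely by plain text search. The actual meme ID format is not spelled out in this column, so everything here is an assumption for illustration: the helper names `make_meme_id` and `find_tagged`, the choice to squeeze the prefixes and term into one punctuation-free token, and the sample prefixes.

```python
import re


def make_meme_id(memespace, taxospace, term):
    """Join two prefixes and a term into one punctuation-free token.

    Hypothetical format: lowercase letters and digits only, no
    separators, so a search engine treats the ID as a single word.
    """
    parts = (memespace, taxospace, term)
    return "".join(re.sub(r"[^a-z0-9]", "", p.lower()) for p in parts)


def find_tagged(documents, meme_id):
    """Return names of documents whose text contains the ID as a whole word."""
    pattern = re.compile(rf"\b{re.escape(meme_id)}\b")
    return [name for name, text in documents.items() if pattern.search(text)]


# Example with made-up prefixes and a term from an existing keyword list.
tag = make_meme_id("Acme", "KM-Topics", "Personal Info Mgmt")
saved = {
    "a.html": f"Notes on keeping found things found. {tag}",
    "b.html": "A page about personal info mgmt with no tag added.",
}
matches = find_tagged(saved, tag)
```

The point of the sketch is only that an ordinary phrase like "personal info mgmt" matches loosely, while a unique token built from your own prefixes matches just the pages you tagged.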

The folks behind Memography (I'm a principal, not incidentally) plan to develop tools for capturing Web pages and automatically inserting meme IDs, original URL, and date captured, to keep the found things as richly valuable as the original documents.

In the meantime, unless you are heavily armed with corporate and personal content management tools, you will just have to print out favorites--like this column--or send yourself an email with the URL to keep your content found.