Anyone who has been involved with replacing a content management software (CMS) application knows that what lies beneath the tip of the iceberg is the problem of migrating content. No matter how much preparation you do, it's not until the software is in and the templates and style sheets are written that the scale of the migration problem starts to become obvious. And intractable: Content authors are very unwilling to learn a new CMS and republish all the content they have created over many years in the space of a few weeks. To them, there is no business benefit; the pages look the same in the new CMS.
By comparison, a search application can be installed in a few hours, or less if you are using a Google appliance. The initial crawl may take a while to complete, but within perhaps a few days users are really pleased about the information they have found, especially the information they should not have found because no one has addressed the issues of secure search. Secure search is the "migration factor" of search implementation. Until the software is installed and all the documents are indexed, the scale of the problem remains unknown. With search, the challenge is even greater than for a CMS, where you can at least tick the box as each page arrives in the new system. In the case of search, you may be faced with several million documents, and you won't live long enough to check each one.
The issues around secure search seem not to have been picked up by the information security community. I recently looked at a report from PricewaterhouseCoopers titled "Trial by Fire," which presented the results of a global survey of 7,200 senior managers responsible for information security. There was not one reference to the problems arising from search security anywhere in the report. What the report does cover is social media: four out of every 10 respondents report that their organization has security technologies that support Web 2.0 exchanges, such as social networks, blogs, and wikis. In addition, approximately one-third (36%) audit and monitor postings to external blogs or social networking sites, and 23% have security policies that address access and postings to social networking sites.
This indicates that the concern is about information going out electronically through the firewall. That matters, but it misses a larger risk: the most important information in your organization could walk out in the pocket of an employee.
Last year, I was working on the development of a global intranet strategy for a blue-chip high-tech company in Europe. To help me, special permission was obtained so that I could be provided with copies of PowerPoint files that summarized the strategic plans of the company. To say that these were highly confidential would be an understatement. But were they? The printouts that I was given had no security classification, no circulation list, and, other than the title page, there was no information given in the footer that indicated just how confidential these documents were.
It seems to me, based on a number of projects in the last 18 months, that information security managers have no understanding of the problems that arise from not having a formal policy about secure search. Ask your information security manager if he or she has a position on early binding versus late binding and see how quickly he or she can come up with a cover story. I've looked through some of the leading books on information security and, though they include pages on cryptography, secure search does not make it to the contents page. The focus seems to be on hackers and security of databases; the management of secure documents is nowhere to be seen, even to the extent of requiring basic circulation and security metadata to be added to each document and spreadsheet.
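For readers who have not met the terms: as they are commonly used in enterprise search, early binding means the access control list (ACL) for each document is copied into the index at crawl time, while late binding means each hit is checked against the source system's current permissions at query time. The Python sketch below is only an illustration of that difference under simplified assumptions. The `Document` class, the in-memory `source_acl` dictionary, and the function names are all hypothetical and do not correspond to any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


def build_index(docs, source_acl):
    """Early binding: snapshot each document's ACL into the index at crawl time."""
    return [(doc, frozenset(source_acl[doc.doc_id])) for doc in docs]


def search_early(index, user, term):
    """Filter hits using the ACL snapshot stored in the index."""
    return [doc.doc_id for doc, allowed in index
            if term in doc.text and user in allowed]


def search_late(docs, source_acl, user, term):
    """Late binding: match on text, then check the live ACL for every hit."""
    hits = [doc for doc in docs if term in doc.text]
    return [doc.doc_id for doc in hits if user in source_acl[doc.doc_id]]


def demo():
    # Hypothetical documents and permissions, for illustration only.
    docs = [Document("plan.ppt", "confidential strategic plan"),
            Document("menu.doc", "canteen menu")]
    source_acl = {"plan.ppt": {"alice", "bob"}, "menu.doc": {"alice", "bob"}}
    index = build_index(docs, source_acl)      # the crawl happens here
    source_acl["plan.ppt"].discard("bob")      # access revoked after the crawl
    early = search_early(index, "bob", "confidential")
    late = search_late(docs, source_acl, "bob", "confidential")
    return early, late


if __name__ == "__main__":
    print(demo())
```

In this toy case the early-bound search still shows `plan.ppt` to a user whose access was revoked after the crawl, while the late-bound search does not. The trade-off is that late binding does a permission check for every hit on every query, which is the kind of question a formal secure search policy would need to settle.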
There are two main reasons why secure search will derail your search implementation. The first is that the RFP you have sent out to vendors will either omit the topic or not provide enough information for the vendor to assess the problem. The second is that, with the installation and initial crawl complete, the scale of the problem becomes very obvious as terms such as "confidential," "relocation," and "redundancy" are tried out.
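The second point can be tested deliberately rather than discovered by accident. The following is a minimal sketch, assuming you have some way to run a query as an ordinary, unprivileged user. The `search` callable, the `toy_search` stand-in, and the sample corpus are hypothetical, and the term list simply reuses the three examples above.

```python
SENSITIVE_TERMS = ["confidential", "relocation", "redundancy"]


def probe_index(search, terms, limit=5):
    """Return {term: first `limit` hits} for every term that returns anything."""
    findings = {}
    for term in terms:
        hits = search(term)
        if hits:
            findings[term] = hits[:limit]
    return findings


def toy_search(query):
    """Stand-in for a real search client running as an ordinary user."""
    # Hypothetical two-document corpus: title -> body text.
    corpus = {
        "Canteen menu": "lunch options for this week",
        "Site closure plan": "confidential relocation of the northern office",
    }
    q = query.lower()
    return [title for title, body in corpus.items() if q in body.lower()]


if __name__ == "__main__":
    for term, titles in probe_index(toy_search, SENSITIVE_TERMS).items():
        print(f"{term}: {titles}")
```

Against the toy corpus this reports "Site closure plan" for both "confidential" and "relocation" and nothing for "redundancy". Against a real index, any hit returned to an unprivileged account is a document someone should look at before the search application goes live.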
Of course, the problem is not just at the document level. The search logs will quickly show who is searching for "workplace bullying." But are they the victim, a friend of the victim, or someone trying to see what they can get away with? Next time you do a search on an issue related to product development, industrial relations, or acquisition targets, just take a few seconds to look at each result and ask yourself if the document really should be on general release.