Counting the Alpha, Omega, and Everything in Between


Not so long ago, a U.K. government department decided that it needed a new office. Over the next two years, specifications were written, architects competed, builders labored, and finally the administration manager was able to work out where everyone was going to sit. Only then did they realize that the department had grown since the specification was written and would no longer fit into the space. Now of course, when designing the specification for your new CMS-powered intranet, you would never underestimate the quantity of information involved or the other problems of migrating legacy content. Sorry, but in my experience a great many people fail to pay sufficient attention to content migration. At a conference in Montreal a couple of years ago, an example was given in which content migration took over eighteen months to complete: the CMS had been available under a corporate license, so none of the due diligence on migration was carried out.

I'm writing this while the Games of the XXVIII Olympiad are being held in Athens, so an athletics metaphor seems appropriate. Every athlete knows that no matter how well they run during the course of a race, it is the final lap that matters most. Preparation for this final lap, and hopefully a place at the top of the winner's podium, starts at the very outset of training. This is how it should be in a CMS project, but in the excitement of the quest for the perfect CMS, the issue of how the content is actually going to get into the system for a site launch is often low on the list of priorities.

There are a number of reasons for this. The first is a misplaced trust in the ability of software to solve the problem. Some CMS vendors offer application software that can migrate content into CM systems, but you need to read the small print very carefully when considering these solutions. The same goes for third-party migration software, such as that offered by Vamosa and Nahava. These products can make a very significant impact on the speed and ease of migration, but they come at a price and only work under certain conditions. There are other products on the market and more will emerge, but, regardless of their evolution, they will, like taxonomic software, continue to require human intervention.

The second problem comes in quantifying this human intervention. If the assumption is made that 200,000 pages need to be migrated to the new system, and that on average it takes two minutes to paste the content into a new template and then add metadata, then this equates to roughly three person-years of work. In fact, two minutes a page may well be an underestimate, especially when it comes to adding metadata (which may have been notably absent in the original intranet), checking links, and, above all, checking the content. This work cannot easily be handed to contractors brought in to provide some keyboarding capability, because it requires knowledge of the organization and the content.
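The arithmetic above is easy to rerun with your own numbers. The following is a minimal sketch rather than a planning tool: the function name and the figure of 2,000 working hours per person-year are my assumptions, not something any CMS vendor will give you, and the three- and five-minute cases are there only to show how quickly the estimate grows.

```python
def migration_person_years(pages, minutes_per_page, hours_per_person_year=2000):
    """Estimate manual migration effort in person-years.

    hours_per_person_year is an assumed figure; adjust it to local
    working hours, holidays, and so on.
    """
    total_hours = pages * minutes_per_page / 60
    return total_hours / hours_per_person_year


if __name__ == "__main__":
    pages = 200_000  # page count used in the example above
    for minutes in (2, 3, 5):  # two minutes may well be an underestimate
        effort = migration_person_years(pages, minutes)
        print(f"{minutes} min/page: {effort:.1f} person-years")
```

At two minutes a page this comes out at a little over three person-years, in line with the figure above; at five minutes a page it is more than eight.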

The third problem is that most organizations have very little idea of what content should actually be on the intranet and who owns it. One of the reasons for implementing a CMS is often to get a better fix, through administrative metadata, on the age and ownership of content. One client of mine had developed estimates from various sets of assumptions that put the page count of their intranet anywhere between 200,000 and 800,000 pages!

The management of migration presents a particular problem in a CMS implementation, as most of the work cannot be undertaken until the CMS is stable and all the templates have been developed. Unfortunately, a launch that is only a couple of months away is not the time to start working in person-years. The solution is not to wait until the end but to begin at the beginning. Two pieces of work need to be undertaken in parallel. The first is to develop a very clear view (and I find personas a great help here) of what content will be needed to support the key information-based business processes at launch. The second is to carry out a detailed (and certainly time-consuming) audit of the content on the site. (A very good case study by Chiara Fox of PeopleSoft, and several other published papers, will be helpful in this process.)
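To make the audit concrete, here is one way the two pieces of work might be brought together in a simple tally. It is a sketch under stated assumptions: the record fields (owner, last_updated, needed_at_launch), the three buckets, and the sorting rules are all hypothetical illustrations of mine, and a real audit would record far more about each page.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PageRecord:
    """One row of a hypothetical content inventory."""
    url: str
    owner: Optional[str]      # None when nobody claims the page
    last_updated: date
    needed_at_launch: bool    # supports a key business process at launch


def audit_summary(pages, stale_before):
    """Count pages per bucket using illustrative, assumed rules."""
    summary = {"migrate": 0, "review": 0, "drop": 0}
    for page in pages:
        orphaned_or_stale = page.owner is None or page.last_updated < stale_before
        if orphaned_or_stale:
            # Needed but unowned or old: someone must look at it first.
            bucket = "review" if page.needed_at_launch else "drop"
        elif page.needed_at_launch:
            bucket = "migrate"
        else:
            # Current and owned, but not needed for launch: decide later.
            bucket = "review"
        summary[bucket] += 1
    return summary


# Made-up sample records, purely for illustration.
sample_inventory = [
    PageRecord("/hr/leave-policy", "HR", date(2004, 3, 1), True),
    PageRecord("/projects/old-bid", None, date(2003, 11, 10), True),
    PageRecord("/finance/1999-budget", "Finance", date(1999, 6, 30), False),
    PageRecord("/it/printer-faq", "IT", date(2004, 5, 20), False),
]

if __name__ == "__main__":
    print(audit_summary(sample_inventory, stale_before=date(2002, 1, 1)))
```

Only the pages in the migrate bucket, plus whatever survives review, need to be counted when estimating the migration effort.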

The reason for adopting this approach is that it will help you decide how many pages you actually need to migrate and may help make this onerous process more manageable. The content audit itself may well reveal 20% of content that is old or relates to departments that have long since had their locks changed. A laborious migration may turn out to be a total waste of time. Any failure to quantify the effort involved in legacy content migration could be career-threatening. So count carefully. It counts.