Great Expectations for Gale's Nineteenth Century Collections Online

May 18, 2012



BEST PRACTICES SERIES

Curation is one of the hot content buzzwords these days. From RSS feeds to Twitter and Pinterest, gathering valuable collections of content is recognized as one of the best ways to attract an audience. However, curation itself is hardly something new.

Whether in the form of a private collection of rare manuscripts or the massive assortment of images, documents, letters, and much more gathered at the grandest public institutions, curation -- organizing and maintaining a collection -- has been around as long as there have been things of value and places to keep them safe. Yet many of these collections sit in isolation, available to only a privileged few. They are also at risk of damage, deterioration, and even destruction.

It is with these issues in mind that Gale, part of Cengage Learning, set about creating Nineteenth Century Collections Online (NCCO). NCCO brings together rare primary source materials - monographs, newspapers, pamphlets, manuscripts, ephemera, maps, photographs and more - from more than 100 individual collections.

Working With the Collection Curators

According to Ray Abruzzi, associate publisher at Gale, a digitization project of this range and size posed many challenges, but he found the individual collection curators to be incredibly receptive and cooperative. As Gale began to explore which collections it wanted to include, Abruzzi says he knew that "somewhere, there's a curator in love with each collection." Once that individual was identified, Gale staff would ask, "What would you like to see happen with this collection?" They found that in many cases the curator's primary hope was to get the collection out into the world, where it would be accessible to a global audience.

This was certainly the case for The National Archives. As Caroline Kimbell, head of licensing, puts it, "By digitizing these records, The National Archives and Gale have opened the gateway for historians, undergraduates, post-graduates, and others worldwide to access the records, not just those previously who were able to visit The National Archives in person."

Increasing Information Access and Functionality

The majority of the content that Gale partners with archives and libraries to digitize is rare and often unique, according to Abruzzi. In the past, gaining access to documents such as these required not only travel, but also justification of a need to work with the materials. "With digitization," he points out, "there is a democratization of research through broadened access."

In the past, use of the documents was not only closely monitored; requests also took a great deal of time to process, and access was limited to one researcher at a time. Through digitization, documents can be accessed by multiple users at once from anywhere in the world.

The collection has also been enhanced with textual analysis tools, including term clusters and graphics, which allow users to analyze individual texts as well as groups of texts.

NCCO allows researchers and students to store documents, create annotations for their own personal use, and create and share tags for themselves and for other users. It has also been optimized for Zotero, a content and citation tool.

The Digitization Process

A combination of individual care, expertise, skill, and technological advancements was required to make this collection possible. As Abruzzi points out, the collections that Gale and its global advisory board -- made up of leading international scholars and bibliographers of nineteenth-century studies -- primarily sought out were "already well organized and complete unto themselves." However, human subject indexing was combined with software and machine-aided indexing tools to enhance discovery. The collection includes over 2 million searchable terms that were keyed in, for example for the handwritten documents.

Abruzzi says customers started asking for a 19th century collection right after the 18th century collection, Eighteenth Century Collections Online (ECCO), was released in 2003. However, he says that "the sheer volume of publishing in the 19th century as compared to the 18th century was simply too massive to approach with the same comprehensiveness as ECCO."

"It wasn't until new technologies were developed and we explored new product platforms did we feel ready to take on NCCO and the 19th century," says Abruzzi. For example, Gale developed more robust metatdata that would work across the broad range of content types in the collection. It also improved usability of such a vast collection including the development of an "image viewer" that enables researchers to create their own customized view at the page level. Gale also added new ways to browse the collection including preview and flip through, so that users can more easily narrow long lists of results.

As Abruzzi describes it, "An undertaking of this size requires not just a large investment and lots of resources, but also many minds, sharing a vision." Gale partners with scanning companies to work onsite at the libraries, managing the scanning operations and coordinating the flow of materials within the library.

Kimbell says that, luckily, much of the National Archives content was already on microfilm held by Gale, which is quick and simple to scan. For the remainder, however, she points out that "the variety of formats and conditions for other material makes the scanning process a 'craft,' with settings re-configured for different size items and different conditions." Her team has a great deal of experience and uses special equipment and handling techniques to avoid damage. They also have project conservators on site to work with scanner operators.

The Importance of Digital Archiving

For some of the institutions involved in this project, conservation and repair were the primary reasons they chose to participate, as they lack the resources to do this work on their own. Abruzzi says that "Gale is funding the repair and conservation of hundreds of documents -- maybe thousands. The library gets back a document in better shape than when it came out of storage -- and they get back a digital image of that document, meaning that in future when it is requested by a researcher they can serve up the digital image, saving wear and tear on the original and putting that document at less risk for deterioration."

The investment promises to be well worth it. As Kimbell says, "Providing records to a worldwide audience will encourage greater analysis and debate of the era in history that these records relate to. The greater number of individuals from a myriad of backgrounds can access the records, the greater the breadth of debate."

While Abruzzi says it's early to call out specific discoveries from the collection, he thinks that "the possibilities are mind-boggling. With over 10 million pages coming out each year in this program, researchers and students will not only make discoveries but create whole new areas and approaches to studying these topics. We saw this with ECCO and expect these materials to have a dramatic effect on the depth and volume of nineteenth century studies, teaching, and scholarly publishing."