Preserving the Digital Past with the Open Data Project



What happens if the lights go out? It's a question that strikes fear into the hearts of content providers everywhere--especially those that have transitioned from preserving hard copies to digital storage. Decisions about how to protect content--in this case, digital content--should be based on both an assessment of the purpose and value of that content and the development of a policy that outlines what is stored and how it is stored.

Margaret Hedstrom is a professor in the School of Information at the University of Michigan, where she heads two projects: the Open Data project, which provides graduate training for data sharing and reuse in e-science, and the Sustainable Environment-Actionable Data project, which explores how scientists can bring together massive, wildly varying data archives to work on environmental research.

Hedstrom points out that safeguards can be established to protect digital content from even the most significant disasters. "The current best practice is to replicate digital content in at least three places, preferably on different continents under different types of governance structures," she says. She does not recommend printing as a safeguard unless content is already printed and stored on paper as part of normal operations. "Printing and storing 'hard copy' usually is not a cost effective option."
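For readers who want to test their own setup against that rule of thumb, here is a minimal sketch in Python. It is an illustration only, not a tool Hedstrom describes: the `Replica` record, the `check_replication` function, and the site names are all hypothetical, and the check treats "different continents" and "different governance structures" as meaning at least two of each.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One stored copy of the content. All fields are illustrative."""
    location: str    # hypothetical site name
    continent: str
    governance: str  # e.g. "university", "commercial", "government"


def check_replication(replicas, min_copies=3):
    """Return a list of gaps against the three-copies rule of thumb."""
    problems = []
    if len(replicas) < min_copies:
        problems.append(
            f"only {len(replicas)} copies; at least {min_copies} recommended"
        )
    # Assumption: "different continents" means at least two distinct ones.
    if len({r.continent for r in replicas}) < 2:
        problems.append("copies are not spread across continents")
    # Assumption: "different governance" means at least two distinct types.
    if len({r.governance for r in replicas}) < 2:
        problems.append("copies are not spread across governance structures")
    return problems


# Hypothetical archive with three copies on two continents.
replicas = [
    Replica("site-a", "North America", "university"),
    Replica("site-b", "Europe", "commercial"),
    Replica("site-c", "Europe", "government"),
]
print(check_replication(replicas) or "meets the rule of thumb")
```

An empty list means the setup satisfies the guideline as interpreted here; a stricter reading, such as requiring every copy to sit on a different continent, would need a tighter check.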

While those "in the business" emphasize the many benefits that digital storage provides--including low cost, ease of accessibility, and the ability to share and append or annotate information easily--there are still some who are hesitant to give up the security of a hard copy.

Anne Nicolai is a writer and editor and has been a public relations consultant and executive speechwriter for more than 20 years. She says: "I doubt I'll ever rely exclusively on the cloud or automated backup or even external hard drives to keep my archives secure. I make paper backups of everything critical. Paper documents can be lost or destroyed too, of course, so regardless of regulations, I'll use both digital and hard copy solutions for documents."

Andrew Schrage is editor-in-chief at Money Crashers, based in the Chicago area. It's a 100% digital publication, and Schrage says that preserving the accessibility and integrity of the information is critically important to him. "I am very concerned with the relative fragility of typical content storage methods such as CDs and external hard drives," he says. "It is much easier for such storage devices to become corrupted over time compared to paper documents." In addition, he notes, "Our reliance on servers and massive storage devices is greater than ever and information is very susceptible to device corruption or external hackers."

There is no question, says Adam Denenberg, that access is a major issue with the rapidly expanding amount of information that we are all exposed to on an ongoing basis. "I'm not sure anybody wants to go to the basement and sift through file cabinets to find that one document," he notes. The same considerations apply to video content, which is the focus of Denenberg's work with HuffPost Live, a new video platform recently launched by The Huffington Post. But, he adds, the larger concern may be security. "There's still that sense of physical security that you get with a hard document," he says, noting: "I don't think there's any bit of digital security that can 100% guarantee the security of your content. They're all crackable in some way or other."

Still, for most, the benefits of digital storage seem to outnumber the potential risks. And, say the experts, there are ways of minimizing risk and increasing the ability to experience benefits in a world where the cost of storage continues to decline.

The nostalgic notion that documents were well organized and well managed in the paper and print world is no more true than the idea that everybody today carefully organizes their electronic materials and makes backups of everything, says Hedstrom.

The same issues and challenges exist in the digital world as in the traditional world. What has primarily changed, she says, is the sheer volume of information that we now have access to. "There are millions and millions of data collection devices that can stream data into a repository or into a place where it can be stored and looked through later, and selected." And, while there are clearly a multitude of issues around storage, data formats, and technology options, Hedstrom says those are not the place to start.

The most important thing that content providers can do to protect their digital information, says Hedstrom, is to have a policy and to stick to that policy consistently. "Be clear on who is responsible for making sure that the material persists and who will pay for it," she says. In addition, she notes, considerations about who owns rights to the material should be part of the process. "Not just the owner, but sometimes in order to preserve material you have to change its format, so having the rights to also make copies and possibly make changes that are necessary for preservation are important pieces of any kind of retention or archiving policy," she says.

"There seems to be a strong relationship between the visibility and use of information and the likelihood that someone is going to be concerned if it disappears," notes Hedstrom. "So making sure that people know this material exists and what they can do with it and how to find it is important."

When working with third parties, regardless of platform, Hedstrom recommends that content providers make sure the service provider can explain exactly how the content is replicated and how it will be restored in the event of an interruption of service.

"There are tremendous possibilities that we have for digital information that we didn't have in the paper world," Hedstrom stresses. "That is in particular the ability to pull different sources of information together, to present it to different audiences in different ways, to be able to reuse information and, now, to make it very easy for people to comment and redistribute it--it really is a different world." It is not a world without risk, but it is certainly one that holds ample opportunities and the potential for innovation for content providers around the world.