Digital materials are inherently fragile and need to be managed from the outset if they are to remain retrievable, identifiable and usable for the community that needs to access, use and reuse the information they contain. The set of activities required to manage data, known collectively as digital curation, aims to ensure not only that the bit-stream is maintained but also that the data can be discovered and rendered throughout its lifecycle. Such lifecycle management ensures that documented policies and processes are developed, that roles and responsibilities are defined, and that the technical framework is in place to create, store and manage research data collections while delivering user access.
Drivers for lifecycle management of data
The science of archives and records management has long adopted a lifecycle approach to managing information. The imperative of an archivist is to ensure that information created in the course of a business or activity is adequately managed so that it can be identified, located and used when required to support future activities. In the analogue world, paper or photographic materials may deteriorate over time through careless handling or poor storage conditions that expose them to damp, mould, insect infestation or vermin damage. The context of their creation may be lost through separation from their original environment or through poor documentation. The ability to find them may be hampered by inadequate cataloguing or misplacement. Capable management throughout the lifecycle ensures that these problems are minimized. The threats to digital material and the techniques for managing them may differ, but the underlying principles and the underpinning processes and policies, originally developed for dealing with the mountains of paper created in the pre-digital world, remain applicable to the digital paradigm.
The necessity of adopting the lifecycle approach to the management of data is discussed by Pennock (2007). Digital materials rely on a combination of hardware, software and storage media to create, store, access and render them. From the moment they are created they are vulnerable both to the speed with which technology advances and to the possible failure of these technologies, so data can rapidly become inaccessible or even be lost completely. Additionally, the ease with which data can be moved, copied, edited and deleted may cast doubt on its integrity, reliability and provenance, with consequent repercussions for reuse. A lack of metadata may make data unidentifiable, irretrievable and unusable.