Published online by Cambridge University Press: 09 June 2018
It is common these days to speak of sustainable resources in areas such as energy, forests, fisheries or water. Sustainability is crucial because these resources are the raw materials that fuel economic growth and prosperity. Libraries, archives and other stewardship organizations also manage resources that we hope will be sustainable over the long term. These resources are the raw materials of research, learning and creative expression. The long-term sustainability of information resources is not a new problem, yet it is one that is complicated by the fact that an increasing proportion of the scholarly and cultural record is now manifested in digital form.
Digital research data, along with other digital products of the research process (e.g. digital laboratory notebooks), has emerged as a significant component of both the process and output of scientific inquiry. Microsoft researcher Jim Gray identified the development of a ‘fourth paradigm’ of scientific discovery, marked by the application of intensive computing resources and sophisticated computational techniques to massive datasets (Hey et al., 2009). Support is gathering around the proposition that creating and sharing an important dataset – one that could catalyze new strands of inquiry – should be considered a first-order scientific contribution, on a par with a published article or book. The importance of datasets is evident even in the realm of commerce: Hal Varian, Chief Economist for Google, has observed that successful businesses will be those that can mine their data for useful intelligence to inform business decision making (Levy, 2009).
As discussed in the previous chapter, the importance of research datasets is reflected in the emergence of policy measures designed to secure their long-term persistence. For example, the National Science Foundation (NSF) has imposed a requirement that grant applications include a data management plan, the merits of which are considered as part of the overall evaluation of the application. In addition, a number of repositories, such as the UK Data Archive, the Inter-university Consortium for Political and Social Research (ICPSR) and Dryad, have been established to provide long-term curation capacity for valuable research data.