13 - Very large data collections

from PART III - MANAGING METADATA

Published online by Cambridge University Press:  08 June 2018

Summary

Overview

This chapter concentrates on aspects of retrieval and management that are particular to big data. This book originally set out to consider metadata about documents and document collections, using a wide definition of documents to include images, sound, museum objects, broadcast material, as well as text-based resources such as books, journal articles and web pages. Social media activity has been included in this, because it involves a permanent (usually text-based) record of social interactions or online behaviour. The type of metadata associated with each of these types of big data will vary considerably, as will the use to which it is put. Transactional data has largely been excluded from this scope, unless those transactions relate to documents. This chapter also describes linked data, an approach that expands the scope of data sets enormously, because it provides a mechanism for combining data sets from different repositories or collections – mediated by the internet.

The move towards big data

The move towards big data has been driven by increasing storage and processing capacity, the establishment of standards for the exchange of data and the requirement of funders to make research data more widely available. This last factor is based on the idea that publicly funded researchers should make their data available for further exploitation. The move is also driven by regulatory factors such as those that apply to the pharmaceutical industry. Criticism of clinical trials data focuses on the selective nature of publication: some pharmaceutical research companies tend to publish only data that favours their products, the phenomenon of ‘missing trials data’ documented by Ben Goldacre (2013) in his book Bad Pharma. The US government now requires all clinical trials to be registered under Section 801 of the Food and Drug Administration Amendments Act; the final rule implementing this section came into force in 2017. The registration includes details of documents and data sets arising from the clinical trial, including:

  • Type
    Definition: The type of data set or document being shared.
      • Individual Participant Data Set
      • Study Protocol
      • Statistical Analysis Plan
      • Informed Consent Form
      • Clinical Study Report
      • Analytic Code
      • Other (specify)
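
The ‘Type’ element above is an example of a metadata element constrained by a controlled vocabulary. As a minimal sketch only, it could be modelled as follows. The class and function names (`SharedItemType`, `SharedItem`, `parse_shared_item`) and the rule that ‘Other’ must carry a free-text description are illustrative assumptions, not part of the registration system itself; only the vocabulary terms come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SharedItemType(Enum):
    """Permitted values for the 'Type' element, copied from the list above."""
    INDIVIDUAL_PARTICIPANT_DATA_SET = "Individual Participant Data Set"
    STUDY_PROTOCOL = "Study Protocol"
    STATISTICAL_ANALYSIS_PLAN = "Statistical Analysis Plan"
    INFORMED_CONSENT_FORM = "Informed Consent Form"
    CLINICAL_STUDY_REPORT = "Clinical Study Report"
    ANALYTIC_CODE = "Analytic Code"
    OTHER = "Other"


@dataclass(frozen=True)
class SharedItem:
    """One data set or document shared from a trial (hypothetical record)."""
    item_type: SharedItemType
    other_description: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: 'Other (specify)' needs a description,
        # and no other type may carry one.
        is_other = self.item_type is SharedItemType.OTHER
        if is_other and not self.other_description:
            raise ValueError("'Other' requires a description")
        if not is_other and self.other_description:
            raise ValueError("a description is only allowed for 'Other'")


def parse_shared_item(label: str, other_description: Optional[str] = None) -> SharedItem:
    """Map a label to the vocabulary; unknown labels raise ValueError."""
    return SharedItem(SharedItemType(label.strip()), other_description)


if __name__ == "__main__":
    print(parse_shared_item("Study Protocol"))
    print(parse_shared_item("Other", "Data dictionary"))
```

Restricting the element to an enumerated set means that a value outside the vocabulary is rejected when the record is created, rather than surfacing later as an inconsistency at retrieval time.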

Chapter in: Metadata for Information Management and Retrieval: Understanding metadata and its use, pp. 203-220
Publisher: Facet
Print publication year: 2018
