5 - Enabling Reproducibility in Big Data Research: Balancing Confidentiality and Scientific Transparency

Published online by Cambridge University Press:  05 July 2014

Chapter author: Victoria Stodden (Columbia University)

Book editors: Julia Lane (American Institutes for Research, Washington DC), Victoria Stodden (Columbia University, New York), Stefan Bender (Institute for Employment Research of the German Federal Employment Agency), Helen Nissenbaum (New York University)

Summary

Introduction

The 21st century will be known as the century of data. Our society is making massive investments in data collection and storage, from sensors mounted on satellites down to detailed records of our most mundane supermarket purchases. Just as importantly, our reasoning about these data is recorded in software, in the scripts and code that analyze this digitally recorded world. The result is a deep digitization of scientific discovery and knowledge, and with the parallel development of the Internet as a pervasive digital communication mechanism, we have powerful new ways of accessing and sharing this knowledge. The term “data” even has a new meaning. Gone are the days when scientific experiments were carefully planned prior to data collection. Now the abundance of readily available data creates an observational world that itself suggests hypotheses and experiments to be carried out after collection, curation, and storage of the data have already occurred. We have departed from our old paradigm of collecting data to resolve research questions; nowadays, we collect data simply because we can.

In this chapter I outline what this digitization means for the independent verification of scientific findings from these data, and how the current legal and regulatory structure helps and hinders the creation and communication of reliable scientific knowledge. Federal mandates and laws regarding data disclosure, privacy, confidentiality, and ownership all influence the ability of researchers to produce openly available and reproducible research. Two guiding principles are suggested to accelerate research in the era of big data and bring the regulatory infrastructure in line with scientific norms: the Principle of Scientific Licensing and the Principle of Scientific Data and Code Sharing. These principles are then applied to show how intellectual property and privacy tort laws could better enable the generation of verifiable knowledge, facilitate research collaboration with industry and other proprietary interests through standardized research dissemination agreements, and give rise to dual licensing structures that distinguish between software patenting and licensing for industry use and open availability for open research.

Type: Chapter
Information: Privacy, Big Data, and the Public Good: Frameworks for Engagement, pp. 112–132
Publisher: Cambridge University Press
Print publication year: 2014
