Stronger together: the case for cross-sector collaboration in identifying and preserving at-risk data

mayernik@ucar.edu; rdowns@ciesin.columbia.edu; Ruth Duerr; hou@ucar.edu; natalie@cos.io; nancy.ritchey@noaa.gov; Andrea Thomer; yarmel@rpi.edu

doi:10.6084/m9.figshare.4816474.v1

journal contribution

posted on 2017-04-04, 18:24 authored by mayernik@ucar.edu, rdowns@ciesin.columbia.edu, Ruth Duerr, hou@ucar.edu, natalie@cos.io, nancy.ritchey@noaa.gov, Andrea Thomer, yarmel@rpi.edu

In the past few months, a range of grassroots initiatives to duplicate US government agency data have gained significant momentum. These initiatives are inspired by recent reports that scientific data and documentation have been removed from government websites, and by concerns over US budget proposals that slash scientific budgets [1]. National media outlets have reported on numerous "data rescue," "data refuge," and "guerrilla archiving" events held around the US and in Canada during this period [2]. Many of these events have focused on creating copies of Earth science data generated and held by US federal agencies, and they have attracted hundreds of volunteers who have spent considerable time and energy duplicating federal data.

Early connections have been made between the rescue volunteers and the federally funded data community; these conversations have highlighted some of the different perspectives and opportunities regarding agency data. This document has two goals: first, to provide the perspective of Earth science data centers holding US federal agency data on this issue, and second, to provide guidance for groups who are organizing or taking part in data rescue events. This paper is not a how-to document, and does not take a position on the political aspects of these efforts. Given the extent of the US government data holdings in the Earth sciences and other domains, any grassroots data rescue effort will inevitably have to make strategic choices about where to invest its time and energy. This document is intended to describe considerations for data rescue activities in relation to the day-to-day work of existing federal and federally funded Earth science data archiving organizations.

We use the ‘data rescue’ terminology throughout this text to connect with the stated goals of the grassroots ‘data rescue’ communities, though we do wish to push back on the assumption that the data being targeted by these efforts are necessarily in need of ‘rescue.’ As we discuss below, many of these data are, in fact, well managed and safe, though sometimes in ways that are less than obvious to someone new to the domain. We look forward to working with these communities to develop a shared sense of risk for federal data.