Big Gridded Data: The Transition from Legacy to Next Generation

This session, held in July 2019 at the Earth Science Information Partners (ESIP) Summer Meeting in Tacoma, Washington, explored several dimensions of the technology and operational systems that support archiving, cataloging, distributing, subsetting, and processing large structured data. For the session, large structured data was defined as any data with well-structured spatial, temporal, band, scenario, or ensemble dimensions and associated variables that exceeds the practical size constraints of commodity internet and personal computing resources. Typical examples are very high-resolution geospatial grids; outputs from ocean, landscape, weather, and climate models; and multispectral remote sensing archives. Use cases for such data range from meta-analyses and reanalyses that require run-time access to entire datasets at once to ad hoc investigations that require small subsets along one or more dimensions. For example, a local science project may need a small spatial subset of an ensemble climate projection, or a remote sensing research project may need to sample 100 point locations from all scenes of a multispectral remote sensing product. Supporting this range of use cases with data formats and computing infrastructure, from access to terabytes or more of data to extraction of small custom subsets, presents a great challenge, especially as technology changes and what was once a sound implementation and investment becomes dated and unable to meet modern expectations.

This session featured speakers who manage the operations and maintenance of archives of large structured data, speakers who build software and standards designed to meet the needs of a wide range of large structured data use cases, and researchers working to evaluate and demonstrate the potential of next-generation technical solutions.

All presentations have been compiled into one file.