Rich Context: providing support for cross-agency data stewardship, and measuring dataset impact on public policy

posted on 13.01.2020 by Paco Nathan
This talk explores the Rich Context project based in the Coleridge Initiative at NYU Wagner, a public-private partnership, which leverages advanced machine learning to support cross-agency data stewardship and measure dataset impact on public policy. In particular, we'll focus on perspectives from industry, such as open source projects based in Silicon Valley that are finding close parallels and applications in government data management. The project also hosts a public machine learning competition that has engaged top AI research teams worldwide to address semantic harmonization problems in scientific communications.

The Coleridge Initiative produces the ADRF platform, currently used by 15 federal, state, and local agencies in the US, to provide a FedRAMP-compliant environment on GovCloud for data analytics. On the one hand, this helps analysts use sensitive data without having to work within an air-gapped data facility. On the other hand, it helps data stewards at agencies monitor data usage and provide support to their customers. The team partners with Deutsche Bundesbank for similar cross-agency work in the EU, where they have pioneered a "data impact factor" metric for use with economic datasets (banking microdata) associated with the German central bank.

The Rich Context project ingests metadata from the agencies involved with ADRF to build a knowledge graph of metadata about dataset usage. Our focus for 2020 is working with NOAA to apply this knowledge graph work for the agency. Specifically, the work focuses on coastal communities that use NOAA data for resiliency planning. We leverage machine learning to identify linkages between Earth science data and socioeconomic policy impact within local communities. The collaboration with NOAA is intended as a case study for other agencies to reuse, in support of the Federal Data Strategy and its Year-1 Action Plan.

This presentation was given at the Earth Science Information Partners (ESIP) Winter Meeting held in Bethesda, MD in January 2020.