Date: October 25th, 2019
Time: 1:00-2:00 pm ET
Location: GoToMeeting
With the growing emphasis on FAIR (Findable, Accessible, Interoperable, Reusable) science, science gateways need to ensure that the scientific workflows conducted on them comply with the FAIR principles. In practice, scientific workflows still involve a mix of non-reusable code, desktop tools, and intermediate data handling, which presents significant challenges to ensuring this compliance. In Earth sciences, the complexity of using diverse, large quantities of data from remote repositories compounds these challenges. For example, hydrologists and agricultural economists in interdisciplinary research need not only to use diverse data sets for their specialty domain but also to connect their computational models through the exchange of data. Data sources can range from repositories managed by NASA, USGS, etc., to sensor arrays in smart cities and crowdsourcing. Due to the inherent massive volume, high dimensionality, heterogeneous formats, and variability in access protocols, researchers often spend much of their time manually collecting and processing data using custom code, instead of focusing on scientific questions.
GeoEDF, an extensible geospatial data framework, aims to lessen and possibly remove such barriers by creating seamless connections among platforms, data, and tools, making large scientific and social geospatial datasets directly usable in scientific models and tools. GeoEDF is designed to abstract away the complexity of acquiring and utilizing data from diverse data providers. Extensible data connectors will implement a variety of common data query and access protocols, supporting both static and streaming data. Extensible data processors will implement common and domain-specific geospatial data processing such as resampling, format conversion, or a scientific simulation model. A plug-and-play workflow composer will allow users to string together data connectors and processors into reproducible workflows that can be executed in heterogeneous environments. Automated metadata extraction and annotation will be integrated into such workflows, supporting FAIR science through ease of data discovery and reproduction. By bringing data to the science, GeoEDF will accelerate data-driven discovery. This presentation will provide an overview of the project, including its objectives, system design, scientific use cases, and anticipated outcomes.
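As a rough illustration of the connector, processor, and workflow-composer idea described above, the following Python sketch chains one data source with two processing steps and records the steps taken as simple metadata. It is not GeoEDF code: every class and method name here (`Connector`, `Processor`, `Workflow`, `InMemoryConnector`, `Resample`, `Scale`) is hypothetical and chosen only for illustration, and an in-memory list stands in for a remote repository.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class Connector(ABC):
    """Hypothetical interface: acquires data from some provider."""

    @abstractmethod
    def get(self) -> Any:
        ...


class Processor(ABC):
    """Hypothetical interface: transforms data and returns the result."""

    @abstractmethod
    def process(self, data: Any) -> Any:
        ...


class InMemoryConnector(Connector):
    """Stand-in for a remote repository; simply returns a copy of a list."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    def get(self) -> List[float]:
        return list(self.values)


class Resample(Processor):
    """Toy resampling step: keeps every `step`-th value."""

    def __init__(self, step: int) -> None:
        self.step = step

    def process(self, data: List[float]) -> List[float]:
        return data[::self.step]


class Scale(Processor):
    """Toy conversion step: multiplies every value by a constant factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def process(self, data: List[float]) -> List[float]:
        return [value * self.factor for value in data]


class Workflow:
    """Runs one connector followed by an ordered list of processors."""

    def __init__(self, connector: Connector, processors: List[Processor]) -> None:
        self.connector = connector
        self.processors = list(processors)

    def run(self) -> Tuple[Any, Dict[str, List[str]]]:
        data = self.connector.get()
        # Record each step's class name as minimal provenance metadata.
        steps = [type(self.connector).__name__]
        for processor in self.processors:
            data = processor.process(data)
            steps.append(type(processor).__name__)
        return data, {"steps": steps}


workflow = Workflow(InMemoryConnector([1, 2, 3, 4, 5, 6]), [Resample(2), Scale(10)])
result, metadata = workflow.run()
print(result)
print(metadata)
```

Because each step sits behind a small common interface, a different source or processing step can be swapped in without changing the workflow itself, and the recorded step list gives a minimal trace for reproducing a run.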