A Toolkit for Reproducible Big Data Analytics in the Cloud

poster
posted on 10.01.2022, 20:02 by Xin Wang, Jianwu Wang
We present RPAC, our open-source toolkit, which supports 1) on-demand provisioning of distributed hardware and software environments, 2) automatic storage of data and configuration for each execution, 3) flexible client modes based on user preferences, 4) querying of execution history, and 5) simple reproduction of existing executions in the same or a different environment. This presentation was given at the ESIP January 2022 meeting, which was held virtually.

Funding

National Science Foundation (NSF) Grant No. OAC-1942714

National Aeronautics and Space Administration (NASA) Grant No. 80NSSC21M0027

U.S. Army Grant No. W911NF2120076

History