Using Docker images to promote cross-language (Python, R) collaboration across diverse user platforms for cloud computing
In any organization, the time available to install complex toolchains for scientific computing is limited. Not all science staff who could use these toolchains have the expertise or IT support to install them successfully. One result in our organization is that complex geospatial computing set-ups are effectively limited to advanced users who are able to install the necessary toolchains, often by a process of trial and error and debugging. Users spend excessive amounts of time upgrading toolchains, yet remain out-of-date and out-of-sync with other users. IT staff time is limited, and too many requests for installation help overwhelm IT support ticket systems. Toolchain installation also presents a major barrier to learning new languages (e.g., moving from R to Python). To deal with these issues, we are exploring containerization of computing environments. With containerization, a single image of a complex compute environment is developed once, and users can run it in many different ways. Installation time is the same for one user or 1000 users. For our agency, containerization has the potential to accelerate adoption of more complex and multi-language (R-Python) toolchains, reduce the installation burden on scientific and IT staff, and promote reproducibility.