Editor’s note: In January 2023, the White House Office of Science and Technology Policy launched the Year of Open Science to advance national open science policies across the federal government. During the year, ARM is publishing a series of stories on work to advance open and equitable research. Max Grover of Argonne National Laboratory and Monica Ihli of Oak Ridge National Laboratory provided the following post.
On April 26, ARM hosted the first webinar in a series to educate the ARM community on how to access and use its new computational resources.
The ARM Data Center has been working to launch the ARM Data Workbench, an interactive computing environment that connects to state-of-the-art computing resources so users can work with ARM’s 30 years of climate research data.
Monica Ihli, a software developer at Oak Ridge National Laboratory, and Max Grover, a software developer at Argonne National Laboratory, presented at this webinar, covering Jupyter Notebooks, how to log on to the Data Workbench, and how researchers can use free open-source software installed on the workbench to execute their science workflows.
The webinar was recorded and is available on ARM’s YouTube channel. For those interested, this blog post also summarizes the key points!
The Jupyter Notebook: A Community Standard
The first topic of discussion was an overview of what a Jupyter Notebook is.
Jupyter Notebooks have become the standard interactive computing format within the open science community. Much like physical notebooks one might use in a classroom, these notebooks contain text notes and equations, as well as an execution environment, which allows users to run code next to their notes. This enables users to document what they are working on and reproduce the exact code they used to obtain their scientific results!
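To make the "notes next to code" idea concrete, here is a minimal sketch of what a notebook looks like on disk. A notebook file is a JSON document holding a list of cells, and the example below builds a two-cell notebook by hand using only the Python standard library. The helper names (`markdown_cell`, `code_cell`) and the cell contents are made up for illustration; in practice, the Jupyter interface creates and saves this structure for you.

```python
import json


def markdown_cell(text):
    """Build a notebook cell that holds text notes or equations (hypothetical helper)."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Build a notebook cell that holds runnable code (hypothetical helper)."""
    return {
        "cell_type": "code",
        "execution_count": None,  # filled in by Jupyter once the cell has run
        "metadata": {},
        "outputs": [],  # results are stored here after execution
        "source": source,
    }


# A notebook is a list of cells plus some metadata and a format version.
notebook = {
    "cells": [
        markdown_cell("# My analysis\nNotes on what I am testing, e.g. $T = T_0 + \\Delta T$."),
        code_cell("print('code runs right next to the notes')"),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Serialize to the JSON text that would be saved in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Because the notes, the code, and (after running) the outputs all live in one file, sharing that file is often enough for a colleague to see what was done and rerun it.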
Rather than building its own custom interactive environment, the ARM Data Center uses free open-source software from the Jupyter ecosystem. This includes JupyterHub, a place where people can build and execute Jupyter Notebooks. JupyterHub is part of the Data Workbench.
Monica walked people through how to use their ARM login to access JupyterHub. Once on JupyterHub, she opened a Jupyter Notebook and executed a few cells, showcasing how easy it is to start exploring scientific questions related to ARM data on the workbench.
Elevating Your Experience on the Data Workbench
While all ARM users can access the Data Workbench, users can gain additional privileges by applying for elevated resources. The key benefits include:
- persistent project space
- scalable resources, including more computational power
- integration with the Data Discovery interface, allowing users to order data to the workbench
- access to the full archive of ARM data, which includes 30 years of observations.
If you would like to apply for this elevated Data Workbench experience, you can do so using these instructions in ARM’s Knowledge Base.
Open-Source Software: Installed and Ready for Science!
ARM supports not only the computational environment and data used in this webinar, but also free open-source software that empowers our scientific community. Two key packages detailed in the webinar are the Python ARM Radar Toolkit (Py-ART), which focuses on analyzing weather radar observations, and the Atmospheric data Community Toolkit (ACT), which helps users work with meteorological time-series observations and supports a variety of ARM data sets.
While Max did not give a tutorial on how to use these packages, he did cover how to execute Jupyter Notebooks using these tools on the workbench. The integration of the software, data, and computational resources enables users to easily create and reproduce complex scientific workflows.
The default computational environment on the workbench includes Py-ART, ACT, and a suite of other useful packages. For a full list of the software installed on the system, be sure to look at this Knowledge Base article.
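If you want to confirm what is available from inside a notebook, a quick check like the sketch below can help. It uses only the Python standard library to report whether a package can be imported and, where possible, which version is installed. The import names (`pyart`, `act`) and distribution names (`arm_pyart`, `act-atmos`) are assumptions based on how these packages are commonly installed, and `describe_packages` is a hypothetical helper, not part of the workbench.

```python
from importlib import metadata, util


def describe_packages(module_to_dist):
    """Map each import name to its installed version, or None if not importable.

    module_to_dist maps an import name (e.g. "pyart") to the name of the
    distribution that provides it (e.g. "arm_pyart").
    """
    report = {}
    for module_name, dist_name in module_to_dist.items():
        # find_spec returns None when the module cannot be found.
        if util.find_spec(module_name) is None:
            report[module_name] = None
            continue
        try:
            report[module_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but no version recorded under this distribution name.
            report[module_name] = "unknown"
    return report


# Assumed import and distribution names for Py-ART and ACT.
PACKAGES = {"pyart": "arm_pyart", "act": "act-atmos"}

for name, version in describe_packages(PACKAGES).items():
    status = "not installed" if version is None else f"version {version}"
    print(f"{name}: {status}")
```

Running this in a cell on the workbench should list both packages if the default environment is active; on another machine, it simply reports which ones are missing.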