This ENG will continue to capture and finalize the requirements for the proposed ARM Data Workbench (ADW) concept. Based on the initial needs gathered from a set of users and stakeholders, ARM data service staff developed a data workbench concept and presented the idea during the 2019 and 2020 ARM PI meetings (https://asr.science.energy.gov/meetings/stm/posters/abstract/2509). The ADW is a revolutionary way to interact with the vast amount of data ARM offers. Utilizing parallel processing frameworks, data stores, and NoSQL technologies, ADC developed an initial version of the backend workbench infrastructure and successfully used it in the LASSO bundle browser and a few other use cases. As part of this proof of concept, over 50 selected datastreams are now readily available on this platform for use in test cases.
Once developed, the ADW architecture will provide a set of tools for users to select data, retrieve measurement values, visualize and analyze data, and even create their own data bundles. Some of the initial requirements gathered include the ability to filter data and to apply user-developed, on-the-fly equations for conditional querying, which will allow users to create unique data products for their needs. Other ARM initiatives, such as data epoch and data tagging, are expected to leverage this architecture.
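As an illustration of the conditional-querying requirement above, the following is a minimal sketch in Python and not the ADW implementation. It assumes measurements arrive as simple records and that the user supplies a condition and an equation as text. The function names (evaluate, build_product), the field names (temp_c, rh), and the sample values are all hypothetical, and only the standard library is used.

```python
import ast
import operator

# Operators a user expression may use (an assumption made for this sketch).
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_CMP_OPS = {ast.Lt: operator.lt, ast.LtE: operator.le, ast.Gt: operator.gt,
            ast.GtE: operator.ge, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def evaluate(expression, record):
    """Evaluate a user-supplied expression against one record (a dict).

    Only numbers, field names, arithmetic, comparisons, and and/or are
    accepted; anything else (for example a function call) is rejected.
    """
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return record[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Compare):
            left = walk(node.left)
            for op, comparator in zip(node.ops, node.comparators):
                if type(op) not in _CMP_OPS:
                    raise ValueError(f"unsupported comparison: {type(op).__name__}")
                right = walk(comparator)
                if not _CMP_OPS[type(op)](left, right):
                    return False
                left = right
            return True
        if isinstance(node, ast.BoolOp):
            values = [walk(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


def build_product(records, condition, derived=None):
    """Keep records that satisfy `condition`, then add user-defined fields."""
    derived = derived or {}
    product = []
    for record in records:
        if evaluate(condition, record):
            row = dict(record)
            for name, formula in derived.items():
                row[name] = evaluate(formula, record)
            product.append(row)
    return product


# Made-up sample values; the field names are illustrative, not ARM variables.
samples = [
    {"time": 0, "temp_c": 21.5, "rh": 40.0},
    {"time": 1, "temp_c": 25.0, "rh": 85.0},
    {"time": 2, "temp_c": 30.0, "rh": 90.0},
]

warm_humid = build_product(
    samples,
    "temp_c > 22 and rh >= 85",
    derived={"temp_k": "temp_c + 273.15"},
)
```

Here the condition selects the records to keep and the derived equation adds a new field to each one, which together stand in for a small user-defined data product. Parsing the expression rather than passing it to eval is a design choice for the sketch, since on-the-fly equations would come from users.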
Relevant background details, posters, and abstracts are attached to this ENG. This concept is also included in the latest ARM decadal vision, and this ENG will be the first step in addressing this activity.
It is important to note that the architecture will have to be built with the goal of leveraging relevant tools and software that ARM staff are currently building. For example, the workbench is expected to utilize the parallel processing frameworks, NoSQL data stores, and visualization and analysis software published on the ACT. This architecture will also integrate with the current ARM Computing Environment (ACE). The workbench will be made available on both the ADC resources and ARM HPC clusters.
The primary scope of this ENG is to identify potential users and stakeholders and to collect their requirements.
This ENG will have the following tasks:
- Identify remaining user groups and stakeholders and gather requirements
- Review and finalize the requirements
- Evaluate current capabilities and develop preliminary wireframes/prototypes for demonstrating the concept
- Determine the effort needed to leverage ACT, ACE, and other related tools as part of the overall ADW architecture
The implementation and design reviews will be tracked in a separate ENG, with the actual implementation expected to begin in early FY22.