Operational Data Analytics
May 23, 2023
Michael Ott
Kadidia Konaté
Melissa Romanus
Rachel Palumbo
Woong Shin
Torsten Wilde
Abstract
Many HPC sites around the globe have significantly improved their monitoring
capabilities over the last couple of years by leveraging established
software tools from the big data domain to collect, stream, and store
operational data at unprecedented granularity and detail. Vast amounts of
data from the different domains of HPC operations (infrastructure, system
hardware, software, applications) are now readily available to be utilized
for improving HPC operations. Consequently, the focus is now shifting
towards analyzing the data to obtain actionable knowledge for daily
operations. While dashboards remain a valuable tool for this task, there is
clearly a trend towards AI-based methods to help harvest the humongous
amounts of data. As data analytics is not necessarily the core expertise of
the data centers that operate the HPC systems, there is a unique
opportunity for collaboration with the data analytics community to identify
methods and develop tools to monitor and make use of this data treasure.
Event
Location
Congress Center Hamburg
Hamburg
Session Overview
At ISC 2023 in Hamburg, the ODA BoF returned to pick up where SC22 left off: with sites now collecting operational data at unprecedented scale, the community’s focus had shifted to analyzing it. The session combined two short 10-minute presentations on the state of the art with an extended open discussion, structured around beginner, intermediate, and advanced content in roughly equal proportions.
Key Discussion Themes
- From dashboards to AI. Dashboards remained valuable, but AI-based methods were emerging as the scalable response to the “humongous amounts of data” operators now held.
- Collaboration across communities. Data analytics was not typically the core expertise of HPC data centers, creating a clear opening for partnership with the broader data-analytics research community.
- Operators, monitoring specialists, researchers. The session targeted HPC system operators, monitoring specialists, and data-analytics researchers together, reflecting that ODA only works when these three groups share practice.
Outcome
The BoF’s discussion of open data and standardization continued in the ODA team’s monthly meetings under the Energy Efficient HPC Working Group (EEHPCWG) and set up the data-standardization focus of the SC23 ODA BoF six months later.