July 17, 2025 Minutes

Jul 17, 2025 · Natalie Bates

Michael provided an update on the Birds of a Feather (BoF) session submission for the SC25 conference. The BoF proposal centers on bridging the gap between those who operate HPC machines and researchers by making operational data available for analysis. The planned session will be split 50/50 between presentations and interactive discussion, and to facilitate that discussion we hope to use Mentimeter. Last year a conference-wide ban was placed on external tools like Mentimeter, so this year we have proactively contacted the conference chairs to get clarification. While we have not yet received an official answer, we are hopeful that the chairs are leaning towards being more permissive and will allow Mentimeter to be used again. The notification date for BoF submissions is August 22nd.

Michael then gave an update on the brownbag discussion, an initiative to run a webinar or meeting series where different HPC sites present on their software stacks for operational data collection. The goal is to make these discussions available to a wider audience beyond the ODA team. We may recycle recent brownbags, refresh older ones, and schedule new ones, and we still need to decide on the format, specifically whether to continue with an open discussion. A possible start date is late November or early December, after the SC25 conference, which would also serve as a good opportunity to promote the series. Woong clarified that the motivation behind the brownbag discussion is to establish and showcase the ODA community, especially after a workshop proposal was rejected. He emphasized the importance of having a web presence so that the group and its work on operational data analytics are discoverable through search engines, and he volunteered to help with the web presence and follow up on this task.

Michael described a new Europe-wide project called Synergies that officially started on June 1st. The project’s goal is to build a complete software stack for operational data analytics, from monitoring and analysis to influencing system schedulers to improve throughput and energy efficiency. A key part of this project is making operational data available to outside parties, including researchers and universities, to foster further research. Michael noted that this initiative faces significant challenges concerning data privacy and the anonymization of information to prevent the identification of specific applications or commercial partners. The project is funded by EuroHPC, a major European funding entity, which will likely require its own data centers to share similar data in the future. Additionally, the Synergies project will work on standardizing data, including naming schemes and metadata for monitoring data, a topic that has been discussed in the ODA team. Michael hopes to present a first draft of this standardization work to the ODA team at a future meeting to get feedback.

Nil Mu from Arizona State University shared a dashboard solution developed at ASU that is a self-contained web application. It runs on Open OnDemand, an open-source graphical interface for HPC, and gets its data exclusively from the Slurm API. The dashboard displays metrics like power, memory, and allocation, and can be installed for different clusters. A single developer created the entire application. Nil highlighted that the dashboard is very simple and self-contained, requiring only the Slurm API and a web host. Jeff Hanson added that the Slurm REST API is a modern and powerful tool for obtaining this kind of detailed data, an improvement over older command-line tools.
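As a rough illustration of what pulling data from the Slurm REST API can look like, the sketch below builds a request for the job list and counts jobs by state. It is not the ASU dashboard's code. The host, port, and API version string are assumptions that depend on each site's slurmrestd deployment, and the helper names are hypothetical. The sketch assumes JWT authentication via the X-SLURM-USER-NAME and X-SLURM-USER-TOKEN headers and a response with a top-level "jobs" list.

```python
import json
import urllib.request
from collections import Counter

# Assumptions: both values depend on the site's slurmrestd deployment.
BASE_URL = "http://localhost:6820"  # hypothetical host and port
API_VERSION = "v0.0.40"             # hypothetical; use the version your site exposes


def build_jobs_request(user: str, token: str) -> urllib.request.Request:
    """Build a GET request for the job list, authenticated with a Slurm JWT."""
    url = f"{BASE_URL}/slurm/{API_VERSION}/jobs"
    headers = {
        "X-SLURM-USER-NAME": user,
        "X-SLURM-USER-TOKEN": token,
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def count_jobs_by_state(payload: dict) -> Counter:
    """Count jobs per state from a decoded jobs response."""
    counts = Counter()
    for job in payload.get("jobs", []):
        state = job.get("job_state", "UNKNOWN")
        # Some API versions report job_state as a list of strings,
        # others as a single string; handle both.
        states = state if isinstance(state, list) else [state]
        for name in states:
            counts[name] += 1
    return counts


def fetch_job_counts(user: str, token: str) -> Counter:
    """Query slurmrestd and summarize the job list (requires a reachable server)."""
    request = build_jobs_request(user, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return count_jobs_by_state(json.load(response))
```

A dashboard along these lines would call something like fetch_job_counts on a timer and render the counts. Metrics such as power or memory would come from other fields or endpoints, whose names vary by Slurm version and site configuration.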

Woong shared an update on a new standardization effort that stems from a “Digital Twin” Birds of a Feather (BoF) session. This new working group is focused on standardization, but not by abandoning existing efforts. Instead, it aims to create tools and scripts that reduce the manual effort of data conversion and processing. The general idea is to take use cases from the Digital Twin project and use them to develop an application-level standard for the operational data analytics (ODA) community. Woong Shin sees himself and Jeff as bridging the two communities, gathering use cases from the ExaDIGIT side and bringing them back to the ODA group to inform its standardization work.

Michael wrapped up the meeting by discussing future agenda items. He plans to dedicate the next meeting to the brownbag series, intending to meet with the sub-team beforehand to finalize a plan and invite speakers. He confirmed this plan with the team, who agreed to a preliminary meeting the following week to discuss details before bringing it back to the larger group.

Authors
Natalie Bates
EE HPC WG Technical and Executive Lead
Natalie has been the technical and executive leader of the EE HPC WG since its inception in 2010. The group disseminates best practices, shares information (peer-to-peer exchange), and takes collective action.