Operational Data Analytics

Nov 17, 2021
Alessio Netti
Michael Ott
Rachel Palumbo
Torsten Wilde
Keiji Yamamoto
Abstract
Many HPC sites are developing and deploying systems for operational data analytics (ODA) to help them understand and optimize their HPC operations. The complexity and sophistication of those systems, as well as the components of the HPC operations they cover, vary significantly, and there is ambiguity in the terminology used. In this BoF we want to discuss the current state of the practice in ODA and investigate future developments. We will introduce and leverage a new conceptual framework that establishes common terminology and scope, which will help to fuel the discussion with the audience and paint a meaningful picture of ODA.
Event
Location

America’s Center Convention Complex (hybrid)

St. Louis, Missouri

Session Overview

This BoF took place at SC21 in St. Louis, the first SC conference held in a hybrid format. It was also the first ODA BoF to use the newly published conceptual framework as its backbone. Alessio Netti (LRZ) and co-organizers from LRZ, ORNL, HPE, and RIKEN introduced the framework and structured the audience discussion around its two dimensions.

The Framework

The framework was published by Netti, Shin, Ott, Wilde, and Bates as “A Conceptual Framework for HPC Operational Data Analytics” (IEEE, 2021). It combines two dimensions:

  • Scope, following the 4-Pillar Framework for Energy-Efficient HPC Data Centers (building infrastructure, system hardware, system software, applications).
  • Capability, a staged model of data analytics maturity (descriptive, diagnostic, predictive, prescriptive).

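Crossing the two dimensions yields a 4x4 grid of 16 cells, one for each pairing of a pillar with a capability level. Sites can place their ODA systems and tools on this grid to see how much of the data center they cover (scope, or comprehensiveness) and how advanced their analysis is (capability, or sophistication). This gives sites a shared vocabulary for describing what they have built and for planning what to build next.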
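The grid can be sketched as a small data structure. The following Python example is only an illustration of the idea, not code from the framework paper or the BoF: the names `Pillar`, `Capability`, `empty_grid`, `place`, and `uncovered` are hypothetical, as are the two example tools, and the assumption that one tool may occupy several cells at once is ours.

```python
from enum import Enum
from itertools import product


class Pillar(Enum):
    """Scope dimension: the four pillars named in the text."""
    BUILDING_INFRASTRUCTURE = "building infrastructure"
    SYSTEM_HARDWARE = "system hardware"
    SYSTEM_SOFTWARE = "system software"
    APPLICATIONS = "applications"


class Capability(Enum):
    """Capability dimension: the four analytics stages named in the text."""
    DESCRIPTIVE = "descriptive"
    DIAGNOSTIC = "diagnostic"
    PREDICTIVE = "predictive"
    PRESCRIPTIVE = "prescriptive"


def empty_grid():
    """Return the 4x4 grid as a dict mapping (pillar, capability) to a tool list."""
    return {cell: [] for cell in product(Pillar, Capability)}


def place(grid, tool, pillars, capabilities):
    """Record a tool in every cell formed by the given pillars and capabilities."""
    for cell in product(pillars, capabilities):
        grid[cell].append(tool)


def uncovered(grid):
    """List the cells that no tool occupies yet."""
    return [cell for cell, tools in grid.items() if not tools]


# Hypothetical site inventory, for illustration only.
grid = empty_grid()
place(
    grid,
    "power dashboard (hypothetical)",
    [Pillar.BUILDING_INFRASTRUCTURE, Pillar.SYSTEM_HARDWARE],
    [Capability.DESCRIPTIVE],
)
place(
    grid,
    "job runtime predictor (hypothetical)",
    [Pillar.APPLICATIONS],
    [Capability.PREDICTIVE],
)

for pillar, capability in uncovered(grid):
    print(f"gap: {pillar.value} / {capability.value}")
```

Listing the empty cells is one simple way to turn the grid into a planning aid: each gap names a pairing of scope and capability that the site does not yet address.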

Outcome

The BoF used the 4x4 framework to gather structured feedback from the audience about leading-edge ODA deployments and to identify trends, requirements, and common pitfalls. The framework also serves as a common reference for comparing ODA deployments across sites.