<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>SC19 | HPC Operational Data Analytics</title><link>https://hpc-oda-org.pages.dev/tag/sc19/</link><atom:link href="https://hpc-oda-org.pages.dev/tag/sc19/index.xml" rel="self" type="application/rss+xml"/><description>SC19</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 20 Nov 2019 17:15:00 +0000</lastBuildDate><image><url>https://hpc-oda-org.pages.dev/media/logo.svg</url><title>SC19</title><link>https://hpc-oda-org.pages.dev/tag/sc19/</link></image><item><title>Operational Data Analytics</title><link>https://hpc-oda-org.pages.dev/events/2019-sc19-oda-bof/</link><pubDate>Wed, 20 Nov 2019 17:15:00 +0000</pubDate><guid>https://hpc-oda-org.pages.dev/events/2019-sc19-oda-bof/</guid><description>&lt;h2 id="session-overview">Session Overview&lt;/h2>
&lt;p>The SC19 BoF brought together HPC sites that had already deployed comprehensive high-resolution monitoring with those still planning their own rollouts. Short presentations from three leading-edge sites across the US, Asia, and Europe framed the discussion, followed by open exchange on lessons learned, use cases, and challenges.&lt;/p>
&lt;h2 id="why-this-matters">Why This Matters&lt;/h2>
&lt;p>The Energy Efficient HPC Working Group (EE HPC WG) and others had long argued that fine-grained instrumentation and monitoring of HPC systems are required to understand, control, and optimize HPC operations. By 2019, leading sites had deployed sophisticated frameworks that could collect, store, and retrieve telemetry data from thousands of devices at high resolution, covering everything from power provisioning and cooling infrastructure down to individual compute nodes and applications. The next challenge, and the focus of this BoF, was making use of that data.&lt;/p>
&lt;p>Obvious use cases included fault detection and diagnostics and the performance optimization of data center infrastructure. More sophisticated scenarios involved feeding data back to facility control systems or batch schedulers to optimize energy performance and utilization.&lt;/p>
&lt;h2 id="organizer">Organizer&lt;/h2>
&lt;p>The BoF was organized by the EE HPC WG Operational Data Analytics team. Supporting resources are available on the session&amp;rsquo;s companion page at the EE HPC WG site: &lt;a href="https://eehpcwg.llnl.gov/conf_sc19.html" target="_blank" rel="noopener">eehpcwg.llnl.gov/conf_sc19.html&lt;/a>.&lt;/p></description></item></channel></rss>