Operating End-to-End HPC Data Center Observability at Scale

Presenters: Melissa Romanus and Basil Lalli (NERSC/LBNL)

This tech talk focused on Omni, NERSC's sophisticated "all-seeing eye" telemetry architecture. As NERSC prepares for its next-generation system, Doudna, the team shared how they manage millions of metrics and logs per second across a diverse, multi-vendor environment while maintaining high availability and deep historical archives.