Datadog on Kubernetes Monitoring

With many blog posts published and talks given on the topic, it’s no secret that Datadog is running Kubernetes at scale. We currently run dozens of clusters, some of them with thousands of nodes. Additionally, we have clusters running in multiple clouds. How are we monitoring all of that, ensuring we can scale up quickly and safely?

In this session Ara Pulido, Technical Evangelist, will chat with Celene Chang and Charly Fontaine, both software engineers on the Container Integrations team at Datadog. This team is responsible for deploying and running the Datadog Agent in our Kubernetes clusters. We’ll cover how we are running the Datadog Agent in our clusters, which metrics we care about, and the monitors we have set up. By the end of the session you will have new ideas and best practices on monitoring Kubernetes with Datadog that you can apply in your own environment.

Links mentioned in the talk:
ExtendedDaemonset GitHub: https://github.com/DataDog/extendedda...
Watermark Pod Autoscaler GitHub: https://github.com/DataDog/watermarkp...
How to monitor Kubernetes audit logs: https://www.datadoghq.com/blog/monito...
Explore Kubernetes resources with Datadog Live Containers: https://www.datadoghq.com/blog/explor...

00:00 - Intro
02:01 - Main discussion
06:19 - Kubernetes Monitoring 101
14:36 - Best practices: Agent deployment
18:28 - Best practices: Platform monitoring
29:23 - Best practices: Audit logs
31:08 - Best practices: Workload monitoring
34:36 - Best practices: Tagging
38:45 - Autoscaling
48:48 - KubeCon discussion
53:29 - Q&A