Datadog On Caching

Caching (and cache invalidation!) is often cited as one of the hardest problems in computer science. While caching can bring substantial performance improvements, reasoning about cached data can be extremely difficult, because caching fundamentally means that you are no longer reading from your source of truth.

Even so, many teams at Datadog needed to build distributed caches to scale their services and keep latency low. As Datadog grew in size and complexity, having each team design and operate its own cache solution became a bottleneck and added to that complexity. Based on that experience, a team was created to design, build, and maintain a managed service for distributed in-memory caching, providing an easy way for over 2,000 engineers at Datadog to add fast caching to their systems in a scalable, reliable, and consistent manner.

In this session, Ara Pulido, Staff Developer Advocate, will chat with Mitch Ward and Jessica Cordonnier, engineering managers on the Caching team at Datadog. They will explain how they used the learnings from prior cache implementations and distributed system principles to design the caching platform at Datadog. They will cover the various components that make up the platform, including the storage system, data structures, and scaling solutions.

By the end of the session you will better understand caching systems, their potential pitfalls and how to mitigate them, and how to run a cache infrastructure as an internal platform as a service. Unfortunately, we can't offer any help naming your internal caching platform; that's another difficult computer science problem for another time!

00:00 - Introduction
04:20 - Introduction to caching
10:23 - History of caching at Datadog
16:44 - Datadog's Caching team
19:45 - Designing Ephemera
26:05 - System Architecture
31:44 - Improving data persistence
35:33 - Network is hard
39:20 - Internal managed services
47:25 - Ephemera in the future
49:47 - Key takeaways
51:55 - Q&A
