Introduction to Site Reliability Engineering
In this session, we start with the basics of SRE, including common terminology and theory, then move on to practical examples, including lessons learned from our own journey here at Datadog. We discuss the relationship between SRE and DevOps, what success looks like and how to measure it, and how to identify and nurture both internal and external talent in order to build a cross-functional team. SRE is a large, complex topic, so the session ends with a live Q&A and a deeper dive into selected topics.

