Introduction to Site Reliability Engineering

In this session, we start with the basics of SRE, covering common terminology and theory, then dive into practical examples, including lessons learned from our own journey here at Datadog. We discuss the relationship between SRE and DevOps, what success looks like (and how to measure it), and how to identify and nurture both internal and external talent to build a cross-functional team. SRE is a large, complex topic, so the session ends with a live Q&A and a deep dive into selected topics.
