Everything You Need To Know About Agent Observability — Danny Gollapalli & Zubin Koticha, Raindrop

Agent failures do not look like normal software failures. In this workshop, the Raindrop team breaks down what it actually takes to monitor production agents, from explicit signals like tool errors, latency, and cost to fuzzier signals like user frustration, refusals, task failure, and capability gaps. The session covers how to move beyond evals toward real production observability, how to use classifiers, regex, and experiments to catch regressions, and how to instrument self-diagnostics so agents can report their own failures and strange behavior. If you're running agents in production, this is a practical framework for understanding what is going wrong and how to catch it early.

Speaker info: https://x.com/benhylak / zkoticha / joseph-daniel-gollapalli-a371a4138

Timestamps
0:14 Introduction and the problem of agent failures
1:48 Moving from evals to production monitoring
3:33 The two types of signals: explicit and implicit
4:47 Using classifier signals for observability
6:38 Leveraging regex for signal detection
7:30 Using experiments to validate improvements
9:42 Q&A session: Statistical relevance and experimental design
16:07 Introduction to self-diagnostics
20:15 Workshop: Coding agent demonstration
24:01 Live demo: Triggering and handling tool failure
30:26 Best practices for self-diagnostic implementation
32:20 Q&A: Real-world use cases and triage
40:02 Q&A: Managing fast-paced experimentation
44:21 Q&A: Trace visualization and data export

Full Walkthrough: Writing & Using Skills — Nick Nisi and Zack Proser

The Agent Development Lifecycle: Build, Test, Deploy, Monitor | Interrupt 26

Agent Optimization with Pydantic AI: GEPA, Evals, Feedback Loops — Samuel Colvin, Pydantic

TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google

Claude Agents Tutorial: Free 2-Hour Masterclass by Anthropic

Demand-Driven Context: A Methodology for Coherent Knowledge Bases Through Agent Failure

How to Monitor, Debug, and Trust Agentic AI Systems - Observability in Agentic AI

Give Your Agent a Computer — Nico Albanese, Vercel

Claude Agent SDK [Full Workshop] — Thariq Shihipar, Anthropic

Getting Started with Microsoft Agent Framework: Build Practical AI Agents

Spec-driven Development: How AI Changed Everything (And Nothing) by Simon Martinelli @ Spring I/O 26

Anthropic Workshop: Build Agents That Run for Hours — Ash Prabaker & Andrew Wilson

Google Generative AI Leader Certification Course – Pass the Exam!

Unlock Autonomous AI Agents with auth.md, Michael Grinich | MCP Night: Agent Mode Keynote

Caching, harnesses, and advisors: Building on Claude at GitHub scale

Harnesses in AI: A Deep Dive — Tejas Kumar, IBM

The Multi-Agent Architecture That Actually Ships — Luke Alvoeiro, Factory

Agentic Engineering: Working With AI, Not Just Using It — Brendan O'Leary

Tool, skill, or subagent? Decomposing an agent that outgrew its prompt

"Software Fundamentals Matter More Than Ever" — Matt Pocock