The Production AI Playbook: Deploying Agents at Enterprise Scale — Sandipan Bhaumik, Databricks

A retail bank spent £85,000 over six months on a chatbot PoC that never reached production, and no one could explain why it was failing. When Sandipan Bhaumik's team got involved, they picked the model in week seven of an eight-week engagement; the first six weeks went to evaluation data, tracing infrastructure, and a measurement pipeline. Six weeks after launch, when the bank updated its interest rate policy and customer satisfaction dropped, the tracing system caught the cause: the new policy document had not been re-embedded, so the agent was serving stale answers. The talk covers the five pillars he built from that and similar engagements: evaluation (define success numerically before touching code), observability (trace every agent decision; European regulators require it), data foundation (agents do not forgive bad data the way humans do), multi-agent orchestration patterns, and governance (47 PII breaches caught in testing before launch). The evaluation dataset is a living system, not a fixed benchmark. The production incident playbook connects all five. Speaker info: / sandipanbhaumik
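The stale-answer incident described above (a policy document updated but not re-embedded) is the kind of failure a simple freshness check can surface. The following is a minimal sketch, not the speaker's implementation: it assumes each indexed document stores a hash of the text that was embedded, and the names `IndexedDoc`, `content_hash`, and `find_stale` are hypothetical, introduced here only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class IndexedDoc:
    """Hypothetical record kept alongside each embedded document."""
    doc_id: str
    content_hash: str  # SHA-256 of the text that was embedded


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale(current_docs: dict[str, str],
               index: dict[str, IndexedDoc]) -> list[str]:
    """Return ids of documents whose current text differs from the
    text that was embedded, or that were never embedded at all."""
    stale = []
    for doc_id, text in current_docs.items():
        entry = index.get(doc_id)
        if entry is None or entry.content_hash != content_hash(text):
            stale.append(doc_id)
    return sorted(stale)
```

Run on a schedule or on every document update, a check like this would flag a changed policy document for re-embedding before the agent serves answers from the old version.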

The Future of AI Agents with Andrew Ng | Interrupt 26

The Multi-Agent Architecture That Actually Ships — Luke Alvoeiro, Factory

What AI Cannot Replace in Evaluation: The Role of Human Judgment: gLOCAL Evaluation Week 2026

RAG is dead, right?? — Kuba Rogut, Turbopuffer

Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan

Bret Taylor (Sierra, OpenAI): Why the future of enterprise software belongs to vertical agents

Inside YC's AI Playbook

Turing Award Winner: Disagreeing with Google, Postgres, Future Problems | Mike Stonebraker

Harnesses in AI: A Deep Dive — Tejas Kumar, IBM

How I deleted 95% of my agent skills and got better results — Nick Nisi, WorkOS

Skill Issue: Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI

"Software Fundamentals Matter More Than Ever" — Matt Pocock

Inside Anthropic, the $965 Billion AI Juggernaut | The Circuit

Why Eval++ Is the Next Great Compute Primitive — Sunil Pai & Matt Carey, Cloudflare

Yann LeCun: World Models: Enabling the next AI revolution

Data + AI Summit Keynote 2026 | Day 1

Unlock Autonomous AI Agents with auth.md, Michael Grinich | MCP Night: Agent Mode Keynote

Stanford CS153 Frontier Systems | The AI Native Company: How One Founder Becomes a 1000x Engineer

Anthropic Workshop: Build Agents That Run for Hours — Ash Prabaker & Andrew Wilson

Designing Data-intensive Applications with Martin Kleppmann
