Building and evaluating AI Agents — Sayash Kapoor, AI Snake Oil
Is 2025 the year of AI agents? Will reasoning models allow agents to solve challenging open problems? From software engineering to web task automation, it has been claimed that agents will do exactly that. Unfortunately, current agents suffer from many shortcomings that reduce their utility in real-world tasks; look no further than the Rabbit R1 and the Humane Pin. In this talk, we will explore how current agents fall far short of their claimed performance in the real world and understand best practices for improving agent evaluation. Learn how to avoid known pitfalls and build AI agents that actually matter.

Recorded live at the Agent Engineering Session Day from the AI Engineer Summit 2025 in New York. Learn more at https://ai.engineer and purchase tickets to our next event, the AI Engineer World's Fair, in SF June 3 - 5 here: https://ti.to/software-3/ai-engineer-...

Sayash Kapoor is a Senior Fellow at Mozilla, a Laurance S. Rockefeller Graduate Prize Fellow in the University Center for Human Values, and a computer science Ph.D. candidate at Princeton University's Center for Information Technology Policy. He is a coauthor of AI Snake Oil, a book that provides a critical analysis of artificial intelligence, separating the hype from the true advances. He has written for outlets like WIRED and The Wall Street Journal, and his work has been featured in The New York Times, The Atlantic, The Washington Post, Bloomberg, and many others. Kapoor has been recognized with various awards, including a place on TIME’s inaugural list of the 100 most influential people in AI.
