Harnesses in AI: A Deep Dive — Tejas Kumar, IBM

Tejas will be back on stage at the World's Fair next week! See https://ai.engineer/wf and use YOUTUBEPROMO for new tickets only. Join 6000 AI engineers at the "Superbowl of AI"!

---

The agent hit a login page, panicked, reported success anyway, and the upvote never happened. Tejas Kumar's diagnosis: not a prompt problem. A harness problem.

The demo builds a browser agent on GPT-3.5 Turbo (deliberately a very old model, to show how much good harness engineering can improve it) against Hacker News and layers in a harness without touching the prompt once. Guardrails cap iterations and compact context. A verify step reads the tool call history to catch the agent lying about what it did. A login handler watches the browser URL on each loop iteration and injects credentials programmatically when the agent lands on the login page. By the end, the cheap old model reliably logs in and upvotes the post.

Speaker info:
https://x.com/TejasKumar_ / tejasq https://github.com/TejasQ

Timestamps:
0:00 Introduction to Tejas Kumar and AI Harnesses
1:45 Why we use harnesses: Reliability and control
3:00 Defining an agent harness from first principles
4:32 Key components of an agent harness (Tooling, Context, Guardrails)
5:59 Starting the demo: Building a browser agent
7:00 Inspecting the initial agent loop
8:12 The problem: Agent failure and hallucination
10:20 Adding guardrails and context management
11:54 Refactoring into a formal harness
13:02 Implementing a verify step to catch lies
15:36 Implementing a login handler for programmatic access
17:42 Final demonstration: Successful autonomous upvoting
18:34 Summary and the future of dynamic harnesses
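
For readers who want a feel for the pieces named in the description (iteration cap, context compaction, a verify step over the tool call history, and a login handler that watches the URL), here is a rough Python sketch. It is not the code from the talk: `FakeBrowser`, `Harness`, `overconfident_model`, the `/login` path check, the limits, and the credentials are all hypothetical stand-ins, and the "model" is a scripted function rather than a real GPT-3.5 Turbo call.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

LOGIN_PATH = "/login"  # assumption: the login page is recognisable by its URL
MAX_ITERATIONS = 8     # guardrail: hard cap on loop iterations (arbitrary value)
MAX_MESSAGES = 6       # guardrail: compact context beyond this size (arbitrary value)
BASE_URL = "https://news.ycombinator.com"


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser driver."""
    url: str = BASE_URL + "/"
    logged_in: bool = False
    upvoted: bool = False

    def click_upvote(self) -> str:
        if not self.logged_in:
            self.url = BASE_URL + LOGIN_PATH  # site bounces us to the login page
            return "redirected to login"
        self.upvoted = True
        return "upvoted"

    def login(self, user: str, password: str) -> None:
        self.logged_in = True
        self.url = BASE_URL + "/"


@dataclass
class Harness:
    browser: FakeBrowser
    model: Callable[[list[str]], str]  # returns "upvote" or "done"
    messages: list[str] = field(default_factory=list)
    tool_calls: list[tuple[str, str]] = field(default_factory=list)

    def compact(self) -> None:
        # Replace the oldest messages with a one-line summary placeholder.
        if len(self.messages) > MAX_MESSAGES:
            dropped = len(self.messages) - (MAX_MESSAGES - 1)
            summary = f"[summary: {dropped} older messages dropped]"
            self.messages = [summary] + self.messages[dropped:]

    def handle_login(self) -> None:
        # Checked every iteration: credentials are injected by the harness,
        # so the model never has to deal with the login page itself.
        if LOGIN_PATH in self.browser.url:
            self.browser.login("demo-user", "demo-pass")  # placeholder credentials
            self.messages.append("harness: logged in, retry the action")

    def verify(self) -> bool:
        # Trust the recorded tool results, not the model's claim of success.
        return ("upvote", "upvoted") in self.tool_calls

    def run(self, task: str) -> bool:
        self.messages.append(task)
        for _ in range(MAX_ITERATIONS):
            self.handle_login()
            self.compact()
            action = self.model(self.messages)
            if action == "done":
                if self.verify():
                    return True
                self.messages.append("harness: no successful upvote recorded")
                continue
            result = self.browser.click_upvote()
            self.tool_calls.append((action, result))
            self.messages.append(f"tool {action} -> {result}")
        return self.verify()


def overconfident_model(messages: list[str]) -> str:
    """Scripted 'model' that claims success after any tool call, even a failed one."""
    return "done" if messages[-1].startswith("tool ") else "upvote"


if __name__ == "__main__":
    harness = Harness(FakeBrowser(), overconfident_model)
    print(harness.run("Upvote the top post on Hacker News"))
```

The design point mirrors the description: the prompt (here, the scripted model) never changes, and every reliability gain comes from code around the loop. In this sketch, removing `handle_login` makes the run end at the iteration cap with `verify()` returning `False` instead of a false "success".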
