Harnesses in AI: A Deep Dive — Tejas Kumar, IBM

The agent hit a login page, panicked, reported success anyway, and the upvote never happened. Tejas Kumar's diagnosis: not a prompt problem. A harness problem.

The demo builds a browser agent on GPT-3.5 Turbo (deliberately a very old model, to show how much good harness engineering can improve it) against Hacker News and layers in a harness without touching the prompt once. Guardrails cap iterations and compact context. A verify step reads the tool call history to catch the agent lying about what it did. A login handler watches the browser URL each loop and injects credentials programmatically when it hits the login page. By the end the cheap old model reliably logs in and upvotes the post.

Speaker info:
https://x.com/TejasKumar_
/ tejasq
https://github.com/TejasQ

Timestamps:
0:00 Introduction to Tejas Kumar and AI Harnesses
1:45 Why we use harnesses: Reliability and control
3:00 Defining an agent harness from first principles
4:32 Key components of an agent harness (Tooling, Context, Guardrails)
5:59 Starting the demo: Building a browser agent
7:00 Inspecting the initial agent loop
8:12 The problem: Agent failure and hallucination
10:20 Adding guardrails and context management
11:54 Refactoring into a formal harness
13:02 Implementing a verify step to catch lies
15:36 Implementing a login handler for programmatic access
17:42 Final demonstration: Successful autonomous upvoting
18:34 Summary and the future of dynamic harnesses
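
For readers who want a feel for the three harness pieces described above (guardrails, a verify step, and a login handler), here is a minimal sketch in Python. It is not the code from the talk. Every name in it (FakeBrowser, Harness, flaky_model, the "/login" path check, the placeholder credentials) is a hypothetical stand-in, the browser and the model are stubs with no network or LLM calls, and the "compaction" is simple truncation where a real harness might summarize instead.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

LOGIN_PATH = "/login"  # assumption: the login page is recognizable by its URL path


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser driver; it only tracks state."""
    url: str = "https://news.ycombinator.com/"
    logged_in: bool = False
    upvoted: bool = False

    def click_upvote(self) -> str:
        if not self.logged_in:
            # Upvoting while logged out bounces the agent to the login page.
            self.url = "https://news.ycombinator.com/login"
            return "redirected to login"
        self.upvoted = True
        return "upvoted"

    def submit_login(self, user: str, password: str) -> str:
        self.logged_in = True
        self.url = "https://news.ycombinator.com/"
        return "logged in"


@dataclass
class Harness:
    browser: FakeBrowser
    max_iterations: int = 8   # guardrail: hard cap on loop iterations
    max_messages: int = 6     # guardrail: context size before compaction
    credentials: tuple[str, str] = ("demo-user", "demo-pass")  # placeholder values
    messages: list[str] = field(default_factory=list)
    tool_calls: list[tuple[str, str]] = field(default_factory=list)

    def compact(self) -> None:
        """Keep the context bounded by replacing old messages with a marker."""
        if len(self.messages) > self.max_messages:
            dropped = len(self.messages) - (self.max_messages - 1)
            marker = f"[{dropped} earlier messages omitted]"
            self.messages = [marker] + self.messages[dropped:]

    def handle_login(self) -> None:
        """Check the URL each loop; log in from code, never via the model."""
        if LOGIN_PATH in self.browser.url:
            result = self.browser.submit_login(*self.credentials)
            self.tool_calls.append(("submit_login", result))

    def verify(self) -> bool:
        """Trust the recorded tool history, not the model's claim of success."""
        return ("click_upvote", "upvoted") in self.tool_calls

    def run(self, model: Callable[[list[str]], str]) -> str:
        for _ in range(self.max_iterations):
            self.handle_login()
            self.compact()
            action = model(self.messages)
            if action == "click_upvote":
                result = self.browser.click_upvote()
                self.tool_calls.append((action, result))
                self.messages.append(f"tool: {result}")
            elif action == "done":
                if self.verify():
                    return "success"
                self.messages.append("harness: no successful upvote recorded, keep going")
        return "gave up"  # iteration cap reached


def flaky_model(messages: list[str]) -> str:
    """Stub model that claims success after any tool result, even a redirect."""
    if messages and messages[-1].startswith("tool:"):
        return "done"
    return "click_upvote"


harness = Harness(FakeBrowser())
outcome = harness.run(flaky_model)
print(outcome, harness.browser.upvoted)  # prints: success True
```

In this sketch the stub model declares success right after being redirected to the login page. The verify step rejects that claim because the tool history contains no successful upvote, the login handler has already signed in from code, and the next attempt goes through. The model stub and its "prompt" never change; only the loop around it does.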