Why Your AI Coding Agent Keeps Writing Bad Code (It's Not the Prompt)

Most AI coding problems don't start during implementation; they start in the spec. In this episode, Dex (HumanLayer) and Vaibhav Gupta (Boundary, creator of BAML) walk through a new approach for building with AI: separating product design from technical design before a single line of code gets written. Instead of treating specs as a formality, they show how detailed product discussions, technical design docs, and program design dramatically improve the quality of AI-generated code. The goal isn't to review more code; it's to make code review easier by making better decisions earlier in the process.

Along the way: how HumanLayer evolved its workflow from "research, plan, implement" into separate product and technical design phases, how mockups and rendered artifacts help surface problems before implementation, and why moving cognitive burden upstream creates tighter feedback loops and more reliable outcomes.

KEY TAKEAWAYS
Better specifications lead to better implementations.
Separating product design from technical design creates more leverage at each stage.
Design docs become more valuable when treated as artifacts the model and humans both read, not one-time requirements.
Giving agents a measurable definition of success lets them propose improvements over time.
Rich visual mockups surface problems earlier than text-only specs.
HTML and Markdown serve different purposes: one for humans, one for token-efficient model input.
Moving verification earlier in the pipeline reduces the need for expensive code review later.
Standardizing what gets checked (like flagging deviations from the plan) makes AI systems easier to trust.

SUMMARY
Dex and Vaibhav unpack how their workflows evolved past the standard research-plan-implement loop. By splitting design into separate product and technical phases, they show how AI coding agents can arrive at better outcomes with fewer surprises during implementation.

Using a live redesign of the HumanLayer task page as the working example, they walk through mockups, a product spec, a technical design doc, and program-level architecture, while discussing what makes an artifact actually useful to both humans and models. The conversation also covers the tradeoffs between HTML and Markdown, why CLAUDE.md files tend to rot, and how richer, more visual representations can improve collaboration between PMs, engineers, and the model itself.

By the end, the theme is consistent: the goal isn't to catch mistakes later; it's to make enough good decisions early that implementation becomes predictable and review becomes easy.

LINKS
Website: https://www.boundaryml.com
Docs: https://docs.boundaryml.com
GitHub: https://github.com/BoundaryML/baml

TIMESTAMPS
00:00 Cold Open: Using AI to Build Better Product Specs
00:26 How HumanLayer's AI Workflow Has Changed
01:32 Intros & Catching Up
05:26 From Research-Plan-Implement to Product + Technical Design
07:09 Why Product Design and Technical Design Need to Be Separate
08:53 Less Clarity, More Leverage: The Three Zoom Levels
09:57 How HumanLayer's Team Already Splits This Work
12:38 Why Product Thinking Is a Different Skill Than Engineering
13:48 Applying Product Thinking to the HumanLayer Task Page
15:43 Giving AI a Way to Measure Success
17:10 Prompting for Creativity & Better Specs
20:08 Why Seeing a Mockup First Creates Buy-In
22:30 Choosing the Right Artifacts for AI Systems
24:19 HTML vs. Markdown for Human-Facing Docs
29:13 Rich Formats vs. Token Efficiency
32:35 Making This Work for Less Technical Teams
34:54 How HumanLayer Invents and Evolves New Workflows
37:00 Catching Deviations & Moving Verification Earlier
42:16 The Problem With Giant CLAUDE.md Files (and a Better Fix)
48:34 Building Workflows That Scale With AI

TOPICS COVERED
Product specifications, AI coding agents, HumanLayer, BAML, PRDs, technical design documents, program design, product design, agent workflows, Markdown, HTML, mockups, Claude Code, CLAUDE.md, software engineering

HASHTAGS
#ClaudeCode #AIEngineering #CodingAgents #AIThatWorks #SoftwareEngineering #ProductDesign #DeveloperTools #HumanLayer #BAML
