AI Security After Codex and Claude Code — Zico Kolter & Matt Fredrikson, Gray Swan
AI agents are becoming powerful enough to write code, browse the web, access private data, and act on our behalf, but the security model for this new world is still being invented. In this episode, Gray Swan cofounders Zico Kolter and Matt Fredrikson join swyx to explain why AI security is not just “cybersecurity with AI,” why agents introduce a new class of vulnerabilities, and why the next major AI incident may be a gray swan: unlikely, but clearly visible before it happens.

We go deep on prompt injection, automated red teaming, model robustness, agent identity, computer-use agents, enterprise guardrails, and the emerging AI insurance/compliance stack. Zico and Matt also explain why frontier models are not automatically safer as they scale, why specialized red-teaming models can now beat humans at breaking AI systems, and why the future of AI security may depend on AI systems attacking, defending, and interpreting other AI systems.

We discuss:
• Why AI systems need a different security mindset from traditional software
• How prompt injection creates a new exploit class for agents like Codex and Claude Code
• Gray Swan Arena and the rise of community red teaming
• Shade: AI that can outperform humans at breaking models
• Why LLMs are an alien form of intelligence that fail differently from humans
• Human vs browser-agent robustness and why humans ranked fourth
• Why eval awareness and capability elicitation matter
• Cygnal: Gray Swan’s guardrail model for policy enforcement
• Why bigger models do not automatically become more robust
• The lethal trifecta: untrusted data, private data, and exfiltration
• Why “just prompt it better” is not enough for enterprise AI security
• OpenClaw, computer-use agents, and the agent security nightmare
• Agent-native identity, permissions, and enterprise deployment
• Why AI security may become part of insurance and compliance
• Why the first major AI prompt-injection breach may be inevitable

Gray Swan
• Website: https://www.grayswan.ai/

Zico Kolter
• X: https://x.com/zicokolter
• Website: https://zicokolter.com/
• LinkedIn: / zico-kolter-560382a4

Matt Fredrikson
• Website: https://www.mattfredrikson.com/
• LinkedIn: / matt-fredrikson-7596349

Timestamps
00:00:00 Hook
00:01:07 Introduction
00:03:38 Why AI Security Is Different
00:07:45 Testing Claude, Codex, and Prompt Injection
00:08:54 Gray Swan Arena and Automated Red Teaming
00:12:21 AI That Breaks Models Better Than Humans
00:15:07 LLMs as Alien Intelligence
00:20:07 Humans vs AI Agents
00:25:42 Red Teaming, Jailbreaks, and Capability Elicitation
00:27:18 Cygnal: Guardrails for AI Agents
00:35:11 The Lethal Trifecta
00:40:38 Can AI Automate AI Research?
00:46:54 OpenClaw and the Computer-Use Security Problem
00:51:51 Agent Identity, Permissions, and Enterprise AI
00:55:31 The Future of AI Security
01:01:37 AI Insurance and Compliance
01:05:39 The Gray Swan Event Everyone Sees Coming
01:07:11 Closing Thoughts


