Buck Shlegeris - Opening session

Buck Shlegeris (Redwood Research) opens ControlConf 2026 with a working definition of AI control: mitigating risk from goal-directed misalignment given that it has already arisen. His test for any safety technique is whether it would still work if the model were actively trying to subvert it. He surveys what has changed in a year: defer-to-resample, a technique Redwood proposed speculatively, now ships in Claude Code's auto mode; OpenAI reports chain-of-thought monitors covering roughly 99.9% of internal agentic traffic; sabotage evaluations like SHADE-Arena have made it into system cards; and external red teaming of company control systems has begun in earnest. He then maps where control actually bites in the AI Futures trajectory, arguing that it matters most between superhuman coder and ASI, and that its core target is early scheming risk, in which a model establishes a persistent rogue deployment, sabotages its successor's alignment, and the situation compounds from there. The harder section covers his open questions for the field, where he is explicit that he is uncertain. He raises three: how much to prioritize myopic interventions like better sandboxing versus techniques that only matter later, especially given his guess that chain-of-thought monitoring will be useless by the point of automated AI R&D; whether the field is overinvesting in pushing out the safety-usefulness Pareto frontier when the binding constraint may be the safety budget, that is, the share of company resources leadership will spend on safety at all; and whether the evaluability of control methods to outside reviewers should weigh as heavily as their actual effectiveness, given the pull toward safety theater as stakes grow. He closes by saying he thinks the field, himself included, is largely running on momentum from ideas that seemed good two years ago, and that reconsidering priorities now may matter more than any single research bet.
Note: The opinions shared in this event are those of the speaker(s) and may not represent the views of FAR.AI or their affiliated organizations.

Aditya Singh - Model incrimination

Andrew Ilyas - How worried should we be about data poisoning?

Buck Shlegeris - Can we use permissions management to mitigate our threats? How much novel security

Jason Meller: Rails, Security, and the AI Advantage

Begin Proof — Noam Brown

I Think They Are Lying To You

Inside the Mind of Anthropic CEO Dario Amodei | The Circuit | Extended Interview

Yann LeCun's $1B Bet Against LLMs [Part 1]

The AI Skills Nobody is Teaching (And Everyone Needs) | AI Expert Ethan Mollick

Software engineering at the tipping point

Stop Prompting Claude. Use Karpathy's Method Instead.

Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan

Alex Mallen - Control for non-schemers

Sundar Pichai on A.I. Backlash, the Future of Work and Google’s Next Era

Inside Anthropic, the $965 Billion AI Juggernaut | The Circuit

Terence Tao: Nobody Understands Why AI Actually Works

Now more than ever: building reliable software in the age of agents | Ron Minsky | Bug Bash 2026

Bing-Jyue Chen - Efficient Zero-Knowledge Proofs for AI Inference

I Made Opus 4.8 and Fable 5 Build the Same App (RAW RESULTS)

/handoff is my new favourite skill
