What Matters Right Now In Mechanistic Interpretability?

This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp - as of Oct 2025, what had changed? What are the most important things to work on? What are common mistakes? How do I recommend approaching the field?

Notes: https://docs.google.com/document/d/1d...

00:00:00 What Matters Now in Mech Interp
00:02:16 The Rise of Reasoning Models
00:04:49 Q&A: Architecture Changes
00:08:58 Research Priorities
00:10:13 AI Psychology & Misalignment
00:12:05 Debugging Model Failures
00:14:53 Understanding vs. Control
00:24:07 Case Study: Golden Gate Claude
00:30:00 A Pragmatic Research Philosophy
00:35:03 Pragmatism, Baselines, and Simplicity
00:46:12 Diagnosis vs. Solution
00:53:55 Do We Need Precision?

What Happened With Sparse Autoencoders?

How Will Mech Interp Help Make AGI Safe?

Mechanistic Interpretability explained | Chris Olah and Lex Fridman

Interpretability: Understanding how AI models think

The Bitter Lesson for Biology — Adam Green on Virtual Cells and Scaling Laws

Mechanistic Interpretability and How LLMs Understand

The Story of Mech Interp

Mechanistic Interpretability for NLP: One-stop Guide for Everything you Need to Know

How Reasoning Models Break Mechanistic Interpretability Techniques

How AI Learned to Teach Itself [JEPA]

Reinventing Entropy | Compression is Intelligence Part 1

The French Do Not Care About Work

The Dark Matter of AI [Mechanistic Interpretability]

Scaling interpretability

Science of Misalignment

Margin Call - "Sell it all. Today." 👆🤘👆

Why you should care about AI interpretability - Mark Bissell, Goodfire AI

Yann LeCun's $1B Bet Against LLMs [Part 1]

If Prime Numbers Become Increasingly Rare, Then Why Do They Keep Showing Up In Pairs?

Ilya Sutskever – We're moving from the age of scaling to the age of research
