What Matters Right Now In Mechanistic Interpretability?

This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp - as of Oct 2025, what had changed? What are the most important things to work on? What are common mistakes? How do I recommend approaching the field?

Notes: https://docs.google.com/document/d/1d...

00:00:00 What Matters Now in Mech Interp
00:02:16 The Rise of Reasoning Models
00:04:49 Q&A: Architecture Changes
00:08:58 Research Priorities
00:10:13 AI Psychology & Misalignment
00:12:05 Debugging Model Failures
00:14:53 Understanding vs. Control
00:24:07 Case Study: Golden Gate Claude
00:30:00 A Pragmatic Research Philosophy
00:35:03 Pragmatism, Baselines, and Simplicity
00:46:12 Diagnosis vs. Solution
00:53:55 Do We Need Precision?