An Introduction to Mechanistic Interpretability – Neel Nanda | IASEAI 2025

How can we reverse engineer what a neural network is doing? In this IASEAI ’25 session, An Introduction to Mechanistic Interpretability, Neel Nanda (Senior Research Scientist at Google DeepMind, formerly at Anthropic) provides an accessible overview of mechanistic interpretability, the study of how to understand the inner workings of neural networks. Nanda explores the progress so far, the limits of current approaches, and the field’s potential for improving AGI safety. He also examines how better interpretability tools could help evaluate the safety of current frontier systems and ensure transparency in future AI development.

About IASEAI: https://www.iaseai.org
Neel Nanda: https://www.neelnanda.io/

#NeelNanda #AISafety #Interpretability #MechanisticInterpretability #IASEAI

Training Sand to Think: Artificial General Intelligence & Future of Physics

Hacking LLMs: An Introduction to Mechanistic Interpretability — Jenny Vega

Neel Nanda – Mechanistic Interpretability: A Whirlwind Tour

FULL DISCUSSION: Google's Demis Hassabis, Anthropic's Dario Amodei Debate the World After AGI | AI1G

Mechanistic Interpretability explained | Chris Olah and Lex Fridman

What Is Understanding? – Geoffrey Hinton | IASEAI 2025

Yann LeCun's $1B Bet Against LLMs [Part 1]

Why you should care about AI interpretability - Mark Bissell, Goodfire AI

The Dark Matter of AI [Mechanistic Interpretability]

Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan

AI monitoring and control | IASEAI '26

Chris Olah - Looking Inside Neural Networks with Mechanistic Interpretability

Inside Anthropic, the $965 Billion AI Juggernaut | The Circuit

Professor Geoffrey Hinton, “Godfather of AI”, live Q&A

Demis Hassabis: We're Three Quarters of the Way to AGI

Causal Mechanistic Interpretability (Stanford lecture 1) - Atticus Geiger

Demis Hassabis: Why AGI is Bigger than the Industrial Revolution & Where Are The Bottlenecks in AI

Andrej Karpathy: Software Is Changing (Again)

Teenager Disproves 4 Decades Old Belief in Computing

Interpretability: Understanding how AI models think
