How difficult is AI alignment? | Anthropic Research Salon

At an Anthropic Research Salon event in San Francisco, four of our researchers (Alex Tamkin, Jan Leike, Amanda Askell, and Josh Batson) discussed alignment science, interpretability, and the future of AI research.

Further reading:
Anthropic’s research: https://anthropic.com/research
Claude’s character: https://www.anthropic.com/news/claude...
Evaluating feature steering: https://www.anthropic.com/research/ev...

0:00 Introduction
0:30 An overview of alignment
4:48 Challenges of scaling
8:08 Role of interpretability
12:02 How models can help
14:31 Signs of whether alignment is easy or hard
18:28 Q&A: Multi-agent deliberation
20:38 Q&A: Model alignment epiphenomenon
23:43 Q&A: What solving alignment could look like
