Misha Laskin, In-context Reinforcement Learning, 22.February.2023

Apollo Research: Q & A on 'Frontier Models are Capable of In-Context Scheming', Alex & Marius Q&A.

Reflection AI’s Misha Laskin on the AlphaGo Moment for LLMs | Training Data

Overview of Version 15: Useful AI and New Core Functionality

Interpretability via Symbolic Distillation

The FASTEST introduction to Reinforcement Learning on the internet

Causal AI for real-world public health decisions

Jacob Andreas | What Learning Algorithm is In-Context Learning?

Reinforcement Learning Series: Overview of Methods

Inside the Mind of Anthropic CEO Dario Amodei | The Circuit | Extended Interview

Sergey Levine, Data-Driven RL in Robotics, Language, and Beyond Share, 15.Feb.2023

Turing Award Winner: Disagreeing with Google, Postgres, Future Problems | Mike Stonebraker

Peter Stone - Practical Reinforcement Learning: Lessons from 30 Years of Research - RLC 2024

UNBOXING THE FUTURE OF HEALTHCARE (SOTA OF AI) DAY 2

Transformers As Statisticians: Provable In-Context Learning With In-Context Algorithm Selection

[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

RL for Agents Workshop - Deep Dive on Training Agents with RL and Open Source

What Is In-Context Learning in Deep Learning?

Training Sand to Think: Artificial General Intelligence & Future of Physics

Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan

RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning
