Misha Laskin, In-context Reinforcement Learning, 22.February.2023

Apollo Research: Q & A on 'Frontier Models are Capable of In-Context Scheming', Alex & Marius Q&A.

Reflection AI’s Misha Laskin on the AlphaGo Moment for LLMs | Training Data

Overview of Version 15: Useful AI and New Core Functionality

Interpretability via Symbolic Distillation

The FASTEST introduction to Reinforcement Learning on the internet

Causal AI for real-world public health decisions

Jacob Andreas | What Learning Algorithm is In-Context Learning?

Reinforcement Learning Series: Overview of Methods

Inside the Mind of Anthropic CEO Dario Amodei | The Circuit | Extended Interview

Sergey Levine, Data-Driven RL in Robotics, Language, and Beyond Share, 15.Feb.2023

Turing Award Winner: Disagreeing with Google, Postgres, Future Problems | Mike Stonebraker

Peter Stone - Practical Reinforcement Learning: Lessons from 30 Years of Research - RLC 2024

UNBOXING THE FUTURE OF HEALTHCARE (SOTA OF AI) DAY 2

Transformers As Statisticians: Provable In-Context Learning With In-Context Algorithm Selection

[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

RL for Agents Workshop - Deep Dive on Training Agents with RL and Open Source

What Is In-Context Learning in Deep Learning?

Training Sand to Think: Artificial General Intelligence & Future of Physics

Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan

RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning