Can Reinforcement Learning Lead to AGI? - Daniel Han, Unsloth

With the release of reasoning models such as the o-series, DeepSeek R1, Gemini, and others, reinforcement learning has come back in full force. The question is whether RL can be scaled indefinitely to reach "AGI", and whether RL algorithms actually learn new knowledge or simply accentuate knowledge already present in pretrained models. We will try to address these questions and offer predictions on where RL is heading.
