ML Foundations (prerequisites) for Post-Training | RLHF Book Course, Lecture 0
In this video I cover the math, LLM training fundamentals, and probability concepts that come up again and again in post-training content (and in this book): the role of mid-training, the definitions of KL divergence, entropy, and cross-entropy, getting LM probabilities from a sequence, and more.

Thanks to everyone who nudged me to make this video. The slides were a fun experiment with GLM-5.2 (more on that model here: https://www.interconnects.ai/p/glm-52...).

Extra learning resources: https://rlhfbook.com/course#extra-res...

"Lecture 0" was added after the course was well underway :)

Welcome to The RLHF Book & Post-Training Course with Nathan Lambert. Ask questions and I'll answer them in the next roundup video!

Slides for this lecture are here: https://rlhfbook.com/teach/course/lec...

Chapters:
00:00 Introduction & Course Prerequisites
01:37 Language Models Overview
02:47 The LM Head
04:29 Softmax & Log-Probabilities
06:13 Anatomy of an LM Training Example
06:37 Computing LLM Probabilities (+Phoebe the Dog)
09:52 Three Common Masks in Post-Training
11:03 A Small Decoding Review
12:14 Training an LM: Cross-Entropy
13:23 Optimization & Fine-Tuning
13:55 Pretraining to Midtraining to SFT Pipeline
15:25 Probability Essentials: KL Divergence & Entropy
19:36 Sigmoid & Pairwise Likelihood
20:29 Reinforcement Learning Framing (MDP)
22:28 Transitioning Tools into Post-Training
23:12 Recommended Resources & Wrap-Up

All resources will be available at https://rlhfbook.com/
Order a copy of the book (physical recommended) on Manning.com: https://hubs.la/Q03Tc3dc0
Order a copy on Amazon: https://amzn.to/4cwCDJQ
Specific course resources (recording links, slides in PDF and native form, etc.) are at https://rlhfbook.com/course
Code is at https://rlhfbook.com/code

Get more information on Nathan at http://natolambert.com/ and stay up to date with his work on Interconnects: https://www.interconnects.ai/

Course YouTube playlist: Welcome to The RLHF Book & Post-Training C...
Join the book's Discord Community: / discord

Nathan is on…
X: / natolambert
LinkedIn: / natolambert
GitHub: https://github.com/natolambert
BlueSky: https://bsky.app/profile/natolambert....
Threads: https://www.threads.com/@natolambert
Substack: https://substack.com/@natolambert

Slides are built with Colloquium: https://github.com/natolambert/colloq...

Thank you to my many collaborators who helped me learn this information I get to share with the world!
