ML Foundations (prerequisites) for Post-Training | RLHF Book Course, Lecture 0

In this video I cover the math, LLM training fundamentals, and probability concepts that come up again and again in post-training content (and in this book): the role of mid-training, definitions of KL divergence, entropy, and cross-entropy, getting LM probabilities from a sequence, and more.

Thanks to everyone who nudged me to make this video. The slides were a fun experiment with GLM-5.2 (more on that model here: https://www.interconnects.ai/p/glm-52...)

Extra learning resources: https://rlhfbook.com/course#extra-res...

"Lecture 0" was added after the course was well underway :)

Welcome to The RLHF Book & Post-Training Course with Nathan Lambert. Ask questions and I'll answer them in the next roundup video!

Slides for this lecture are here: https://rlhfbook.com/teach/course/lec...

Chapters:
00:00 Introduction & Course Prerequisites
01:37 Language Models Overview
02:47 The LM Head
04:29 Softmax & Log-Probabilities
06:13 Anatomy of an LM Training Example
06:37 Computing LLM Probabilities (+Phoebe the Dog)
09:52 Three Common Masks in Post-Training
11:03 A Small Decoding Review
12:14 Training an LM: Cross-Entropy
13:23 Optimization & Fine-Tuning
13:55 Pretraining to Midtraining to SFT Pipeline
15:25 Probability Essentials: KL Divergence & Entropy
19:36 Sigmoid & Pairwise Likelihood
20:29 Reinforcement Learning Framing (MDP)
22:28 Transitioning Tools into Post-Training
23:12 Recommended Resources & Wrap-Up

All resources will be available at https://rlhfbook.com/
Order a copy of the book (physical recommended) on Manning.com: https://hubs.la/Q03Tc3dc0
Order a copy on Amazon: https://amzn.to/4cwCDJQ
Course-specific resources (recording links, slides in PDF and native form, etc.): https://rlhfbook.com/course
Code: https://rlhfbook.com/code

Get more information on Nathan at http://natolambert.com/ and stay up to date with his work on Interconnects: https://www.interconnects.ai/

Course YouTube playlist: Welcome to The RLHF Book & Post-Training C...
Join the book's Discord Community: / discord

Nathan is on…
X: / natolambert
LinkedIn: / natolambert
GitHub: https://github.com/natolambert
BlueSky: https://bsky.app/profile/natolambert....
Threads: https://www.threads.com/@natolambert
Substack: https://substack.com/@natolambert

Slides are built with Colloquium: https://github.com/natolambert/colloq...

Thank you to my many collaborators who helped me learn this information I get to share with the world!