Advanced LLM Post-Training: SFT, DPO, Reinforcement Learning w/ Maxime Labonne (Liquid AI)

In this exclusive guest lecture for the Youth AI Initiative, we hosted Maxime Labonne (Head of Post-Training at Liquid AI & Author of the LLM Engineer's Handbook) for a masterclass on the modern Large Language Model (LLM) training pipeline. Maxime went far beyond the basics, breaking down the exact techniques used by top labs to turn base models into powerful, aligned products. He covered the full stack, from dataset creation to cutting-edge techniques like GRPO.

🚀 What You Will Learn:

📌 Supervised Fine-Tuning (SFT)
How to structure instruction data and teach models to follow specific commands.

📌 Preference Alignment (DPO)
A deep dive into Direct Preference Optimization and how it aligns models.

📌 Reinforcement Learning (GRPO) & Reasoning Models
How new “reasoning models” (like DeepSeek-R1) are trained with Group Relative Policy Optimization (GRPO) to “think,” plan, and verify their chain of thought before answering.

📌 Efficient Training Techniques
A comparison of LoRA, QLoRA, and Full Fine-Tuning, including how to train models on limited hardware.

📌 Dataset Curation
What makes a dataset truly “good”: accuracy, diversity, complexity, filtering, and balancing.

🌟 About the Youth AI Initiative
The Youth AI Initiative is a free, 6-week intensive AI incubator for the brightest high school students. We bridge the gap between academic theory and real-world application through an expert-led curriculum and guest insights from leaders at Microsoft, Hugging Face, Liquid AI, and more.

🌐 Learn More: https://youthaiinitiative.com/

📣 Connect With Us:
LinkedIn: / youth-ai-initiative
Instagram: / youth_ai_initiative
Twitter: https://x.com/YouthAIInit

🙏 Special Thanks
A huge thank you to our main sponsor Tam Finans and our community supporter Global Turks AI.

📍 Timestamps:
Introduction 00:00
What is Post-Training 00:53
Supervised Fine-Tuning 05:04
Preference Alignment (DPO) 14:00
Reinforcement Learning (GRPO) 18:10
Conclusion 24:30
Q&A 26:12
