How does DeepSeek actually work? | Full technical review
In this video, we dive into the technical innovations behind DeepSeek-R1: scaling with compute (Reasoning-Oriented Reinforcement Learning, Chain-of-Thought, GRPO, Distillation). While knowledge about AI is helpful, general software engineers should still get great value out of it.

More LLM tech deep dives: Discover How LLMs Work by Dissecting Llama
Blogpost version of this video: https://juliaturc.substack.com/p/deep...

00:00 Intro
00:57 Scaling using compute instead of data
02:18 Overview of LLM training
03:20 Training DeepSeek
04:09 Reasoning-Oriented Reinforcement Learning
06:45 DeepSeek-R1-Zero
08:00 Back to training DeepSeek
10:29 Chain-of-Thought
12:19 GRPO
13:46 Distillation
14:45 Outro

Correction: The cited $6M cost was incurred by DeepSeek-V3, not DeepSeek-R1. The cost for the latter is unknown (Source: https://www.reuters.com/technology/ar...)

