A (free) 200 Page LLM Training Playbook, epic OCR models, SAM 3? | AI & ML Monthly

🤗 New ZTM Hugging Face Bootcamp course link - https://dbourke.link/ZTMHuggingFace

Welcome to machine learning & AI monthly for October 2025. This is the video version of the newsletter I write every month, which covers the latest and greatest (but not always the latest) in the world of AI and ML.

Read the issues online:
AI/ML Monthly October 2025 (this video) - https://zerotomastery.io/blog/ai-and-...
AI/ML Monthly September 2025 - https://zerotomastery.io/blog/ai-and-...

My links:
Download Nutrify (my startup) - https://nutrify.app
Download KeepTrack (my other startup) - https://keeptrack.app
Personal website - https://www.mrdbourke.com
My ML blog - https://learnml.io
Read my novel Charlie Walks - https://www.charliewalks.com

Courses I teach:
Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
Learn Hugging Face - https://dbourke.link/ZTMHuggingFace
Learn TensorFlow - https://dbourke.link/ZTMTFcourse
Learn PyTorch - https://dbourke.link/ZTMPyTorch

Timestamps:
0:00 - Intro
0:30 - My Work
0:42 - New course: Hugging Face Bootcamp - https://dbourke.link/ZTMHuggingFace
3:15 - Note: Skipping November 2025 (getting married)
3:45 - From the Internet
4:02 - Smol Training Playbook (200+ page LLM guide) - https://huggingface.co/spaces/Hugging...
7:11 - Hugging Face Hub v1.0 - https://huggingface.co/blog/huggingfa...
8:41 - Streaming datasets upgrade - https://huggingface.co/blog/streaming...
11:54 - Exo: DGX Spark + Mac Studio for LLM latency - https://blog.exolabs.net/nvidia-dgx-s...
20:27 - Visualizing how VLMs work - https://huggingface.co/blog/not-lain/...
24:18 - NanoVLM - https://github.com/yuvalkirstain/NanoVLM
24:47 - Case study: Basketball player tracking - https://blog.roboflow.com/identify-ba...
31:29 - When LoRA matches full fine-tuning - https://thinkingmachines.ai/blog/lora/
35:30 - Apple Foundation Model Adapters - https://developer.apple.com/apple-int...
41:00 - Daniel's Open-source AI of the month
41:05 - RF-DETR Seg - https://blog.roboflow.com/rf-detr-seg...
42:43 - ModernVBERT (250M vision-language retriever) - https://huggingface.co/blog/paultltc/...
44:45 - Rex-Omni (10 CV tasks in one model) - https://github.com/IDEA-Research/Rex-...
51:04 - A paradigm shift for OCR
56:52 - olmOCR-2 - https://huggingface.co/allenai/olmOCR...
57:57 - olmOCR-Bench - https://huggingface.co/datasets/allen...
1:02:19 - LightOnOCR-1B - https://huggingface.co/blog/lightonai...
1:02:39 - PaddleOCR-VL-0.9B - https://huggingface.co/PaddlePaddle/P...
1:02:57 - Chandra (9B OCR model) - https://huggingface.co/datalab-to/cha...
1:03:16 - Nanonets-OCR2-3B - https://huggingface.co/nanonets/Nanon...
1:03:34 - DeepSeek-OCR - https://huggingface.co/deepseek-ai/De...
1:03:44 - DeepSeek-OCR video breakdown - DeepSeek OCR - More than OCR
1:04:39 - Open-source VLMs
1:04:49 - Apriel-1.5-15B-Thinker - https://huggingface.co/ServiceNow-AI/...
1:05:31 - Qwen3-VL family - https://huggingface.co/collections/Qw...
1:06:21 - LLaVA-OneVision-1.5 - https://huggingface.co/collections/lm...
1:07:35 - Bee-8B-SFT - https://huggingface.co/Open-Bee/Bee-8...
1:07:40 - Sa2VA updates - https://huggingface.co/collections/By...
1:08:32 - SANSA (SAM2 few-shot segmentation) - https://github.com/ClaudiaCuttano/SAN...
1:09:35 - Small LLMs are getting better
1:09:45 - Granite 4.0 & Nano - https://huggingface.co/collections/ib...
1:12:30 - MobileLLM-Pro - https://huggingface.co/collections/fa...
1:13:03 - A couple of cool things
1:13:04 - Open Code - https://github.com/sst/opencode
1:13:30 - handy.computer (speech-to-text) - http://handy.computer
1:14:04 - Emu-3.5 (image generation) - https://github.com/baaivision/Emu3.5
1:14:40 - OpenAI safeguard models - https://huggingface.co/openai/gpt-oss...
1:15:28 - RICE-ViT - https://huggingface.co/DeepGlint-AI/r...
1:16:34 - Research
1:16:35 - SAM 3 paper - https://openreview.net/forum?id=r35cl...
1:18:54 - Talks
1:18:55 - Andrej Karpathy on Dwarkesh podcast - Andrej Karpathy: “We’re summoning ghosts,...
1:19:20 - Outro
