A recipe for 50x faster local LLM inference | AI & ML Monthly
Welcome to machine learning & AI monthly for June 2025. This is the video version of the newsletter I write every month which covers the latest and greatest (but not always the latest) in the world of AI and ML.

Thumbnail paper link: https://huggingface.co/papers/2506.14111

Read the issues online:
AI/ML Monthly June 2025 (this video) - https://zerotomastery.io/blog/ai-and-...
AI/ML Monthly May 2025 - https://zerotomastery.io/blog/ai-and-...
AI/ML Monthly April 2025 - https://zerotomastery.io/blog/ai-and-...

My links:
Download Nutrify (my startup) - https://nutrify.app
Download KeepTrack (my other startup) - https://keeptrack.app
Learn Hugging Face - https://dbourke.link/ZTM-HF-Text-Clas...
Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
Learn TensorFlow - https://dbourke.link/ZTMTFcourse
Learn PyTorch - https://dbourke.link/ZTMPyTorch
My ML blog - https://learnml.io
Read my novel Charlie Walks - https://www.charliewalks.com
Personal website - https://www.mrdbourke.com

Timestamps:
00:00 - Intro
00:25 - ZTM Object Detection with Hugging Face Transformers Project: https://www.learnhuggingface.com/note...
01:28 - KeepTrack is now an app: https://keeptrack.app
02:15 - The case for more ambition in AI research by Jack Morris: https://blog.jxmo.io/p/the-case-for-m...
03:56 - Save money on AI audio transcriptions by speeding up the audio: https://george.mand.is/2025/06/openai...
06:16 - Answer.AI release ReadBench to test how well VLMs can read: https://www.answer.ai/posts/2025-06-0...
09:06 - Flux.1 Kontext Release: https://bfl.ai/announcements/flux-1-k...
11:22 - Gemma 3n models designed to run on local devices released in full: https://huggingface.co/blog/gemma3n
18:05 - NuExtract 2.0 for structured data extraction: https://huggingface.co/collections/nu...
19:17 - 50x faster LLM inference recipe from Essential AI: https://huggingface.co/papers/2506.14111
23:32 - Qwen3 embedding and reranker models: https://huggingface.co/collections/Qw...
24:22 - BioCLIP 2: https://huggingface.co/imageomics/bio...
29:34 - GLiNER-X series for any entity detection: https://huggingface.co/collections/kn...
26:28 - V-JEPA 2: https://github.com/facebookresearch/v...
30:58 - OCR edges towards its ChatGPT moment (Nanonets-OCR-s): https://nanonets.com/research/nanonet...
34:12 - torchvista – visualizing PyTorch model flows: https://github.com/sachinhosmani/torc...
35:22 - Ovis-U1-3B combines multimodal understanding, image generation and editing: https://huggingface.co/AIDC-AI/Ovis-U...
38:29 - Baidu release the Ernie 4.5 foundation models: https://huggingface.co/collections/ba...
39:46 - Google Colab updates (Hugging Face integration & more): / launch-hugging-face-models-in-colab-for-fa...
42:13 - Apple updates its on-device and server foundation models: https://machinelearning.apple.com/res...
49:16 - Anthropic guide on building a multi-agent research system: https://www.anthropic.com/engineering...
49:30 - Google Gemini 2.5 Pro and Flash releases: https://developers.googleblog.com/en/...
52:36 - Andrej Karpathy on Software 3.0, agents & more: • Andrej Karpathy: Software Is Changing (Again)
55:42 - Pivot to AI YouTube channel: / @pivottoai
56:04 - Nate B Jones YouTube channel: / @natebjones


