Do Not Install oMLX Before Watching This

oMLX is a new MLX server for Mac that claims it can run AI models too big for your RAM by using your SSD as overflow memory. If that actually worked, it would be a real upgrade for anyone running local AI on a Mac mini, MacBook, or Mac Studio — you could finally run a 33 GB or even a 64 GB model on a 48 GB machine without dropping ten grand on an M3 Ultra workstation. The biggest constraint on a Mac is that you can’t upgrade the RAM after you buy it. So a tool that lets you punch above your weight class is exactly what most of us want. I put oMLX through three real tests on my 48 GB M4 Mac mini, against the two tools I actually use day to day: Ollama and LM Studio. Same 31-billion parameter model loaded into all three. Same hardware. Same workload. Test A was a baseline — how much RAM does each tool reserve before you even start talking to it? Test B was the conversation test — does it stay fast across a real 5-prompt build, or does it slow down as the context grows? Test C was the headline claim — can oMLX actually run a 33 GB model that doesn’t fit in normal RAM? The conversation test was the same exact 5-prompt sequence across all three tools: (1) Architecture of a Python web scraper. (2) Write the actual code. (3) Add CAPTCHA handling. (4) Add an analytics module. (5) Refactor everything into a proper Python package with a config file and a CLI. Each prompt builds on the last, which forces the model to keep every line we’ve already written in its context. The conversation gets harder as it gets longer. That’s exactly the workload where oMLX’s SSD cache trick is supposed to pay off. 📘 FREE BESTSELLER CHATGPT PROFITS: https://jonlinks.com/gift 🔔 SUBSCRIBE TO MY CHANNEL: https://jonlinks.com/subscribe MY PROGRAMS 🐲 Coloring Book AI: https://jonlinks.com/color 🔨Top Prompts: https://jonlinks.com/101 FOLLOW ME: 💻 BLOG: https://ServeNoMaster.com 🎙️PODCAST: https://ArtificialIntelligencePod.com 📷 INSTAGRAM:   / servenomaster   💙 FACEBOOK:   / servenomaster   🐦 TWITTER:   / servenojonathan   🏢 LINKEDIN:   / servenomaster   📷 PINTEREST:   / servenomaster   TOOLS: The Software and Tools I Use Daily: https://servenomaster.com/tools In this video: 00:00 The promise oMLX is built on 01:30 How I test AI tools (real workloads, not demos) 02:30 Step 0 — the install nightmare 05:30 Test A — RAM usage on a 31B model 07:30 Test B — 5-prompt conversation, Ollama vs oMLX speed 10:00 Test C — the bigger model that crashed my Mac 12:00 The verdict + what I'm running instead 13:30 Final thoughts ——————————————————————— Affiliate links mean I earn a commission from qualifying purchases. This helps support my channel at no additional cost to you.