MacBook Pro M5 + Qwen 3.6 35B Test | LOCAL AI Vibe Code oMLX with VS Code
Can a fully local AI model running on a MacBook Pro M5 (32GB) actually vibe code a production-level app, with no cloud, no API keys, and nothing leaving your machine? I put it to the test.

In this video I run Qwen 3.6 35B A3B (4-bit quantized, ~19GB, the new MoE coding model from Alibaba) through an MLX local server, wire it into VS Code with Continue.dev as the chat/agent, and try to build a real app: a platform where AI vibe-coding developers can showcase work and land clients. I do it purely as a "newbie" vibe coder who doesn't touch the code.

I also get into why I ditched Ollama and LM Studio for MLX, the context-window crashes that kept killing the model, real tokens/sec numbers, the M5's Low Power Mode situation, and the honest cost math: a ~$2,400 MacBook for local LLMs vs $20/month cloud tools like Claude Code (Opus 4.8) and GitHub Copilot. Does local win in 2026? Stick around for the final verdict.

🛠️ Gear & tools used:
• MacBook Pro M5 (32GB unified memory, 1TB SSD)
• Model: Qwen 3.6 35B A3B, 4-bit quantized (MLX)
• MLX local server (oMLX)
• VS Code + Continue.dev (local agent)
• Compared against: GitHub Copilot, Claude Code (Opus 4.8 / Sonnet 4.6), LM Studio, Ollama

Chapters:
0:00 Intro – can a local LLM build a full app?
0:35 The rig: MacBook Pro M5 (32GB / 1TB SSD)
0:46 Why I dropped Ollama & LM Studio for MLX
2:34 The model: Qwen 3.6 35B A3B (4-bit, 19GB)
3:09 Planning the app – a platform for vibe coders
4:33 Memory use & tokens/sec while it thinks
5:40 Why this is my last time with GitHub Copilot
6:39 Going 100% local: Continue.dev + MLX server
8:36 First errors & the context-window problem
10:29 It starts building the app
12:43 The honest math: M5 (~$2,400) vs $20/mo cloud
16:00 Real token speeds with the MLX dashboard on
22:50 M5's Low Power Mode + speed after closing the dashboard
25:05 The app finally runs (and keeps crashing)
27:14 Final verdict: is local vibe coding worth it?
29:02 What I'm testing next + subscribe

New channel, zero filter: I do these experiments live and tell you straight what works. Got a model you want me to test (Gemma 4, Qwen Coder 30B, GPT-OSS 20B)? Drop it in the comments and subscribe so you don't miss it.

#localllm #vibecoding #macbookprom5 #qwen3 #aicoding
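
The cost comparison and the model size quoted in the description can be checked with some quick arithmetic. This is only a rough sketch: the function names are made up for illustration, the inputs are the description's own figures (~$2,400, $20/month, 35B parameters, 4-bit), and it deliberately ignores electricity, resale value, price changes, and any difference in output quality between local and cloud models.

```python
# Back-of-the-envelope numbers only. Inputs come from the video description;
# the helper names are hypothetical and nothing here is measured.

def break_even_months(hardware_cost: float, monthly_fee: float) -> float:
    """Months of subscription fees that add up to the hardware price."""
    return hardware_cost / monthly_fee


def raw_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Size of the weights alone in GB (1 GB = 1e9 bytes), with no metadata."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    months = break_even_months(2400, 20)
    print(f"Break-even: {months:.0f} months (~{months / 12:.0f} years)")
    print(f"Raw 4-bit weights: {raw_weight_gb(35, 4):.1f} GB")
```

With these inputs the break-even point is 120 months (about 10 years) against a single $20/month plan, and the raw 4-bit weights come to 17.5 GB. The ~19GB quoted for the download is larger, presumably because of quantization metadata and layers kept at higher precision, though the description does not say.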


