Run Qwen 3.5/3.6 35B on 8GB VRAM | LM Studio + Opencode Setup (40 tk /s)
UPDATE: This will also work with the new Qwen 3.6 models! Waffle ends at 1:35, 1.5x speed recommended. Run large models on cheap GPUs. This model is on-par with Claude Haiku 4.5, and can run on cheap consumer hardware! Setup guide for LM Studio + Opencode. You can also use ollama. https://qwen.ai/blog?id=qwen3.5
![[RANKING] 20 Local AI Models — 8GB VRAM Tier List (Qwen3.5, Gemma 4, DeepSeek)](https://i.ytimg.com/vi/TBHl3h9-CRY/hqdefault.jpg?sqp=-oaymwEjCNACELwBSFryq4qpAxUIARUAAAAAGAElAADIQj0AgKJDeAE=&rs=AOn4CLAUgBOMzzqlw9Xfskp9PhIgUmrBmw)
▶︎
[RANKING] 20 Local AI Models — 8GB VRAM Tier List (Qwen3.5, Gemma 4, DeepSeek)

▶︎
Running a 35B AI Model on 6GB VRAM, FAST (llama.cpp Guide)

▶︎
The Best Local Agentic Coding Workflow (Complete Guide)

▶︎
I ran Qwen 3.6 35B on 8GB of VRAM at almost 20 t/s (COMPLETE TUTORIAL using llama.cpp)

▶︎
Claude Code + LM Studio + Qwen3.5 35B A3B

▶︎
How much difference does MTP make? Qwen 3.6 27B tested - 16GB Local LLM setup

▶︎
Can Qwen Dethrone Opus 4.7, GPT 5.5 and Gemini 3.1?

▶︎
Your local LLM is 10x slower than it should be

▶︎
GLM-5.2 vs Opus 4.8: Have Open Source Models Caught Up?

▶︎
China’s Secret | The Most Unbelievable Megaprojects in China | 4K Travel Documentary

▶︎
The Local AI Hardware Mistake Everyone Makes

▶︎
BitNet b1.58 How 1.58-Bit Ternary Weights Run LLMs on CPUs Without GPUs

▶︎
How DeepSeek V4 fits on a laptop and what does it mean to us?

▶︎
Ollama vs LM Studio vs llama.cpp: Which Should You Use?

▶︎
Testing Qwen3.6 35B A3B with OpenCode on the M5 Pro

▶︎
Gemma 4 26B vs 8GB VRAM! Abliterated LM Studio Guide 🤯

▶︎
DFlash on GTX 1060: Can Dense AI Models Cheat VRAM Like MoE?

▶︎
AMD MI50 32GB for Local AI: Qwen 3.6 & Gemma 4 on llama.cpp / vLLM (vs R9700)

▶︎
Best Local Coding AI for 8GB VRAM (2026 Benchmark)

▶︎
