Qwen 3.6 27B vs 35B-A3B: 16GB VRAM Local Test

Qwen 3.6 27B & 35B-A3B Full Review & Tests - Agentic Coding & Local AI Vision Today we dive into Qwen 3.6 mid-size models! In this video, I do a breakdown and extensive testing of the Qwen 3.6 27B (Dense) and 35B-A3B (Sparse MoE) models running entirely locally. We explore their new hybrid architecture combining Gated DeltaNet and Gated Attention, and put them through the wringer with agentic coding, creative writing, vision tasks, and tool calling. These models are not here to play games!! The 27B is even competing with Claude 4.5 Opus on SWE-bench and LiveCodeBench! What you’ll learn in this tutorial: ✅ The architectural breakdown of Qwen 3.6's MoE and Dense models (Gated DeltaNet + Gated Attention). ✅ Generating a complete, responsive dark-theme HTML/CSS/JS website from a single prompt. ✅ Coding a fully functional 3D browser car game from scratch using Three.js. ✅Testing advanced vision capabilities, including complex image-to-code UI recreation and object counting. ✅ Evaluating strict system prompt adherence and creative writing skills. ✅ Setting up and testing Web Search tool calling with Tavily inside Open WebUI. ✅ Dense PDF document reading and extracting specific quotes from heavy technical papers. And so much more!!! Tools & Models Used: Open WebUI: The ultimate frontend for running local models with tool calling capabilities. llama.cpp: For running large GGUF models efficiently locally. Tavily: Search engine API used for the web search tool integration. Unsloth GGUFs: For high-quality, optimized quantized model files. PC Specs: Gpu: Nvidia RTX 5060 Ti 16 GB : https://amzn.to/4rU7xRy Ram: 64gb 4x16gb Kingston Fury : https://amzn.to/473HoaG Model Used : Qwen3.6-27B-UD-Q4_K_XL Qwen3.6-35B-A3B-UD-Q4_K_XL (Paired with the mmproj vision file for both models for hte vision test's) Pro Tip: The Qwen 3.6 27B Dense model is an absolute powerhouse for heavy coding and production tasks it handles complex logic and pixel-perfect UI generation incredibly well! When setting up tool calling, ensure your Tavily API keys are properly configured in the Open WebUI admin panel for seamless web searching. If you found this breakdown helpful, don’t forget to Like, Subscribe, and Hit the Notification Bell for more deep dives into AI-powered coding and local LLMs! ig : / kintugk x : https://x.com/gk_kintu Contact: [email protected] Videos Mentioned : GPT 2.0 Image review : • ChatGPT Images 2.0 - Full Review and Test Hermes Agent Tutorial VIDEO : • Hermes Agent : Full Review and Test Paddle Ocr VIDEO : • PaddleOCR-VL-1.5 vs GLM-OCR: Local Test Gemma 4 VIDEO 31B vs 26BA-A4B: • Gemma 4 31B vs 26B-A4B: 16GB VRAM Local Test Timestamps: 0:00 - Intro & Model Overview 0:56 - Benchmarks (SWE-bench & vs Claude 4.5 Opus) 1:41 - Hybrid Architecture Explained (MoE vs Dense) 4:02 - Local Setup (llama.cpp & Open WebUI) 5:43 - Test 1: HTML Website Generation 8:12 - Test 2: 3D Browser Car Game (Three.js) 10:37 - Test 3: Creative Writing (Modern Fiction) 14:09 - Test 4: System Prompt Adherence (Baking AI) 15:23 - Test 5: English to Norwegian Translation 16:46 - Test 6: Dense PDF Document Reading (PaddleOCR paper) 18:37 - Test 7: Vision Tests (People, Glasses & Emoji Counting) 20:48 - Test 8: Web Search Tool Calling (Tavily) 21:50 - Test 9: Image to Code (Admin Dashboard UI) 24:36 - Final Thoughts & Outro #qwen3 #LocalAI #LLM #CodingAI #OpenWebUI #LlamaCPP #MachineLearning #AIWorkflow #AgenticAI

Running a 35B AI Model on 6GB VRAM, FAST (llama.cpp Guide)

Running a 35B AI Model on 6GB VRAM, FAST (llama.cpp Guide)

Gemma 4 vs Qwen 3.6: Which Is the Better LOCAL Coding Agent?

Gemma 4 vs Qwen 3.6: Which Is the Better LOCAL Coding Agent?

What Is MCP? Model Context Protocol Explained Simply

What Is MCP? Model Context Protocol Explained Simply

Gemma 4 12B: 16GB VRAM Local Test

Gemma 4 12B: 16GB VRAM Local Test

Qwen3.6 27B vs 35B Unsloth on RTX 3090s | Head-to-Head

Qwen3.6 27B vs 35B Unsloth on RTX 3090s | Head-to-Head

Over 3x Faster AI. MTP Explained, Deployed & Benchmarked on Gemma 4 & Qwen 3.6.

Over 3x Faster AI. MTP Explained, Deployed & Benchmarked on Gemma 4 & Qwen 3.6.

Why Developers Are Switching to Qwen 3.7 Over GPT & Claude

Why Developers Are Switching to Qwen 3.7 Over GPT & Claude

Hermes Agent : Full Review and Test

Hermes Agent : Full Review and Test

Qwen3.6 35B-A3B Full Test – Is THIS the Best LOCAL Model Yet?

Qwen3.6 35B-A3B Full Test – Is THIS the Best LOCAL Model Yet?

Qwen3.6-35B-A3B vs Gemma4-26B: Quantized Local Showdown on Ollama

Qwen3.6-35B-A3B vs Gemma4-26B: Quantized Local Showdown on Ollama

Ollama vs LM Studio vs llama.cpp: Which Should You Use?

Ollama vs LM Studio vs llama.cpp: Which Should You Use?

Qwen3.6 vs Gemma 4: Which Actually Remembers Your Code?

Qwen3.6 vs Gemma 4: Which Actually Remembers Your Code?

GPT 5.5 Codex Review: Worth the Price Hike?

GPT 5.5 Codex Review: Worth the Price Hike?

AI Race: Why China Will Win!

AI Race: Why China Will Win!

Qwen 3.6:27B vs. Gemma 4:31B - RTX 5090 benchmark. Which is better?

Qwen 3.6:27B vs. Gemma 4:31B - RTX 5090 benchmark. Which is better?

Qwen 3.6 27B is a MONSTER, but can it run locally? I tested it on an RTX 5090 and RTX 5060 Ti and...

Qwen 3.6 27B is a MONSTER, but can it run locally? I tested it on an RTX 5090 and RTX 5060 Ti and...

Build Powerful Local Coding Agent on Budget GPU with Llama.cpp and Pi

Build Powerful Local Coding Agent on Budget GPU with Llama.cpp and Pi

Qwen3.6 27B Is INSANE – Is This a LOCAL Claude Opus Competitor?

Qwen3.6 27B Is INSANE – Is This a LOCAL Claude Opus Competitor?

Best Local Coding AI for 8GB VRAM (2026 Benchmark)

Best Local Coding AI for 8GB VRAM (2026 Benchmark)

here's REALLY WHY Fable 5 got banned

here's REALLY WHY Fable 5 got banned