Gemma 4 12B QAT vs non-QAT - 16GB VRAM Local LLM setup

In this video I test the QAT version of Google's Gemma 4 12B model and compare the quality of the QAT build from Unsloth (which is q4) against the regular q4 GGUF from Unsloth. The model runs on a local AI PC I built with 16GB VRAM and 32GB DDR4 RAM.

I run the model through four tests:
1. Adherence
2. Agency
3. Coding
4. Memory

If you're interested in local LLMs, AI and homelabs from the perspective of a software engineer with many years of professional experience working with LLMs in production, feel free to subscribe!

Models:
• QAT: https://huggingface.co/unsloth/gemma-...
• non-QAT: https://huggingface.co/unsloth/gemma-...

GitHub: https://github.com/lukesdevlab/youtube
Patreon: / lukesdevlab

#localllm #localai #homelab #llamacpp #gemma4 #quantization #qat

Chapters:
0:00 Coming up
0:08 Intro
0:55 Models
1:16 Tests
1:39 System Specs
1:50 Adherence - q4
2:53 Adherence - QAT
3:35 Agency
5:56 Coding - q4
7:55 Coding - QAT
10:55 Memory
12:40 Conclusion
