Gemma 4 12B: the PERFECT model to run locally?

When Google released Gemma 4 12B, the first thing that crossed my mind was: do we finally have a serious model to run on a card with 16GB of VRAM? So I got it running on my RTX 5060 Ti via llama.cpp and tested each of the model's capabilities one by one: vision, audio, and code. Each test teaches something different about what it means for a model to be "smart" rather than simply "big".

There's a scene in this video that might change the way you think about small models. I won't give it away, but it says a lot about how much we underestimate what fits in 16GB. If you follow local LLMs, this video is worth watching to the end.
_______
⚡ Hardware
⏵ ZimaCube: https://shop.zimaspace.com/products/z...
⏵ ZimaBoard 2: https://shop.zimaspace.com/products/z...
⚠️ Use coupon AIProgBr15 to receive a $15.00 discount
⏵ Dock to connect a graphics card to the ZimaBoard 2: https://shop.zimaspace.com/products/g...
⏵ ZimaBlade: https://shop.zimaspace.com/products/z...

⚡ Coding Tools
⏵ GLM - https://z.ai/subscribe?ic=1CUTHGZM5X
⏵ AI Coding Plan - https://www.alibabacloud.com/en/campa...
⏵ Alibaba Cloud - https://www.alibabacloud.com/en/campa...

⚡ TTS
⏵ ElevenLabs - Text-to-Speech: https://try.elevenlabs.io/aiprogbr-tts
⏵ ElevenLabs - Voice Clone: https://try.elevenlabs.io/aiprogbr-vo...
_______
🔹 Want to learn how to build apps with me, from scratch to publication? Check out my course: https://programadorbr.com/flutter?src...
_______
0:00 Intro
0:24 Documentation
3:08 Vision Test
6:33 Audio Test
9:09 Flappy Bird
10:32 Paint Brush
13:47 Conclusion
_______
👀 If you're curious, take a look at some of my published apps:
⏵ Teleprompter: https://apps.apple.com/app/apple-stor...
⏵ Pomodoro: https://apps.apple.com/app/apple-stor...
⏵ Infinite Tic-Tac-Toe iOS: https://apps.apple.com/app/apple-stor...
⏵ Reverse Play iOS: https://apps.apple.com/ca/app/reverse...
_______
✴️ Also follow my other channels:
⏵ Channel about the Apple universe: / @iprogbr
⏵ Programming channel: / @programadorbr
⏵ Instagram profile: / progbr
⏵ Substack profile: https://progbr.substack.com/

Some of the links in this description may be affiliate links, meaning they may generate a commission for the channel at no extra cost to you.
