5. Comparing Quantizations of the Same Model - Ollama Course

Welcome back to the Ollama course! In this lesson, we dive into the fascinating world of AI model quantization. Using variations of the llama3.1 model, we explore how different quantization levels affect performance and output quality. Through this video, you'll gain a deeper understanding of how to choose the right quantization for how you use AI models, ensuring you get the best performance and results for your specific needs. Don't forget to subscribe for more lessons in this free Ollama course! Thanks for watching!

All Source Code: https://github.com/technovangelist/vi...

00:00 - Start with an example
00:24 - Introduction
00:56 - Lots of claims on the Discord
01:26 - Intro to the app
01:57 - Where to find the code
02:20 - Grab a few quantizations
02:57 - You should regularly pull the models again
03:30 - Back to the Black Hole answers
04:39 - The classic logic problem
05:35 - How about function calling
08:31 - How about for prompts with more reasoning
09:01 - Are those questions stupid?
09:30 - Which quant to use?
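For readers who want to try this kind of comparison themselves, here is a minimal sketch. It is not the app shown in the video. It assumes a local Ollama server at the default address and uses the `/api/generate` endpoint with streaming turned off. The model tags in `MODELS` and the sample prompt are assumptions for illustration; substitute whichever quantizations you have actually pulled.

```python
import json
import urllib.request

# Assumed tags for illustration; run `ollama list` to see what you have pulled.
MODELS = [
    "llama3.1:8b-instruct-q2_K",
    "llama3.1:8b-instruct-q4_0",
    "llama3.1:8b-instruct-q8_0",
]
OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint


def build_request(model: str, prompt: str) -> dict:
    """Build the body for a single non-streaming generate call."""
    return {"model": model, "prompt": prompt, "stream": False}


def tokens_per_second(result: dict) -> float:
    """Generation speed from the response; eval_duration is in nanoseconds."""
    duration_ns = result.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return result.get("eval_count", 0) / (duration_ns / 1e9)


def generate(model: str, prompt: str) -> dict:
    """Send one prompt to one model and return the parsed JSON response."""
    body = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def compare(prompt: str, models=MODELS) -> None:
    """Print each model's answer to the same prompt, with its speed."""
    for model in models:
        result = generate(model, prompt)
        print(f"--- {model} ({tokens_per_second(result):.1f} tokens/s)")
        print(result["response"].strip())


if __name__ == "__main__":
    # Hypothetical prompt, loosely following the black hole example in the video.
    compare("Explain what a black hole is in two sentences.")
```

Running the same prompt against each tag side by side makes it easier to judge whether a smaller quantization is good enough for your own use, which is the question the lesson is built around.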
