Local AI on a budget GPU: Qwen 3.6 35B and 27B tested

Is a capable local AI assistant finally within reach of a desktop PC, even on a budget? This video tests the new Qwen 3.6 models (35B and 27B) on a modern budget GPU (RTX 5060 Ti 16GB) and an older card (GTX 1060 6GB), simulating the real-world demands of agentic harnesses. We walk through the benchmark results, discuss hardware requirements, and compare local running costs against API usage to answer whether you can run cutting-edge AI without breaking the bank.

00:00 Intro
00:56 What harnesses want(ed)
04:05 How should we benchmark?
05:37 What kind of hardware do you need?
08:34 The GTX 1060 6GB still chugs
11:02 Perhaps check your case can fit your GPU first...
11:19 The other hardware
12:30 Results - Ollama
13:13 Results - Llama.cpp
15:17 Cost comparison - local vs API
17:13 How to run both 35B and 27B?
18:42 Conclusion - are we there yet?
20:30 Bonuses
21:14 Outro

🚀 Ready to implement AI for your business? Join our community for in-depth discussions, Q&A, and support from like-minded peers: https://aegis.social/skool