Train Your Own Reasoning Model (DeepSeek Clone) Fast & With Only 7 GB of VRAM
Hello everyone, I hope you're doing well! In this video, I show you how to fine-tune LLMs locally for reasoning tasks using the reinforcement learning algorithm GRPO. You can run the fine-tuning on a GPU with at least 7 GB of VRAM using Unsloth, a Python library for fast fine-tuning.

Material used:
GitHub repo: https://github.com/Hmzbo/Fine-tune-LL...
Hugging Face post: https://huggingface.co/learn/cookbook...
Unsloth notebooks: https://docs.unsloth.ai/get-started/u...

Let's connect:
LinkedIn: https://bit.ly/3roXgQ2
GitHub: https://bit.ly/3CrfRRP
Kaggle: https://bit.ly/3C1mqZD
Twitter: https://bit.ly/3UR06e3

Music:
Song: Memories
Artist: Owl Nest
Music by: CreatorMix.com
Video: Free Lofi Music For YouTube Videos No Copy...

If you have any questions, suggestions, or remarks, feel free to leave them in a comment below! Until next time, stay safe!

#mlwh

00:00 Intro
01:02 Explaining GRPO
08:03 Environment Setup guidelines
10:20 Data, Model & Reward functions
17:57 Training
21:24 Training results
23:47 Testing
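For readers who want a feel for the idea before watching, here is a minimal sketch of the two ingredients named in the chapters: a reward function and the group-relative scoring that gives GRPO (Group Relative Policy Optimization) its name. This is not the code from the video or from Unsloth. The `<reasoning>`/`<answer>` tag format, the function names, and the use of the population standard deviation are all assumptions made for illustration.

```python
import re
from statistics import mean, pstdev

# Hypothetical output format; the prompt template used in the video may differ.
FORMAT_RE = re.compile(
    r"<reasoning>.+?</reasoning>\s*<answer>.+?</answer>", re.DOTALL
)


def format_reward(completion: str) -> float:
    """Toy reward: 1.0 if the completion follows the assumed tag format, else 0.0."""
    return 1.0 if FORMAT_RE.search(completion) else 0.0


def group_relative_advantages(rewards, eps=1e-4):
    """Score each completion relative to the others sampled for the same prompt.

    Each reward is centered on the group mean and scaled by the group's
    standard deviation (population std here, an assumption). `eps` avoids
    division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One "group": several completions sampled for a single prompt.
completions = [
    "<reasoning>2 + 2 is 4</reasoning><answer>4</answer>",
    "The answer is 4.",
    "<reasoning>adding</reasoning> <answer>4</answer>",
    "4",
]
rewards = [format_reward(c) for c in completions]
advantages = group_relative_advantages(rewards)

for completion, advantage in zip(completions, advantages):
    print(f"{advantage:+.3f}  {completion}")
```

Completions that follow the format end up with a positive advantage and the others with a negative one, so training pushes the model toward the better-than-average samples in each group without needing a separate value model.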


