Deep Dive Into The Toolformer

This week we cover the paper "Toolformer: Language Models Can Teach Themselves to Use Tools" from Meta and Universitat Pompeu Fabra. The paper shows how you can train an LLM to use tools such as a calculator and a calendar through API calls.

--

Get Oxen 🐂 https://oxen.ai/

Oxen.ai makes versioning your datasets as easy as versioning your code! Even with millions of unstructured images, we quickly handle any type of data so you can build cutting-edge AI.

--

Toolformer 📜 https://arxiv.org/abs/2302.04761

The Datasets 🔢
https://www.oxen.ai/Laurence/mlqa
https://www.oxen.ai/Laurence/lama
https://www.oxen.ai/Laurence/ASDiv
https://www.oxen.ai/Laurence/SVAMP
https://www.oxen.ai/Laurence/web_ques...
https://www.oxen.ai/Laurence/MAWPS
https://www.oxen.ai/Laurence/templama
https://www.oxen.ai/datasets/OxenAI-P...

Filtering Functions ✂️ https://github.com/lucidrains/toolfor...

Toolformer Notes 📜 https://www.oxen.ai/blog/toolformer-l...

Join Arxiv Dives 🤿 https://oxen.ai/community

Discord 🗿 / discord

--

Chapters
0:00 Intro to the Toolformer
6:40 Toolformer Architecture
7:43 Approach
9:39 Creating the Training Data
12:24 Generate API Call Data
13:36 Together AI Demo
15:35 Arxiv Paper Examples
18:00 Execute API Calls
19:53 Filtering API Calls and Math
31:15 Experiments
32:12 Results
34:14 Scaling Laws
35:22 Questions
