Timo Schick | Toolformer: Language Models Can Teach Themselves to Use Tools

New Technologies in Mathematics Seminar

Speaker: Timo Schick, Meta AI

Title: Toolformer: Language Models Can Teach Themselves to Use Tools

Abstract: Language models exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this talk, we show how these limitations can be overcome by letting language models teach themselves to use external tools via simple APIs. We discuss Toolformer, a model trained to independently decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. Through this, it achieves substantially improved zero-shot performance across a variety of downstream tasks without sacrificing its core language modeling abilities.
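To make the idea of an inline API call concrete, here is a minimal sketch of the execution side only: given text in which a model has already written a tool call, run the tool and splice the result back in. The bracket syntax `[Tool(args)]`, the `->` result marker, and the names `TOOLS`, `calculator`, and `execute_calls` are assumptions chosen for illustration. This is not the talk's implementation, and it does not cover how the model learns when to emit a call.

```python
import ast
import operator
import re

# Hypothetical inline call syntax: [ToolName(argument)]
CALL_PATTERN = re.compile(r"\[(\w+)\((.*?)\)\]")

# Arithmetic operators the toy calculator is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval_node(node):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    raise ValueError("unsupported expression")


def calculator(expression: str) -> str:
    """Toy calculator tool: returns the result rounded to two decimals."""
    tree = ast.parse(expression, mode="eval")
    return f"{_eval_node(tree.body):.2f}"


# Registry of available tools (assumed; a real system would have more).
TOOLS = {"Calculator": calculator}


def execute_calls(text: str) -> str:
    """Replace each recognized [Tool(args)] with [Tool(args) -> result]."""

    def run(match):
        name, arg = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        if tool is None:
            return match.group(0)  # unknown tool: leave the text untouched
        try:
            result = tool(arg)
        except (ValueError, SyntaxError, ZeroDivisionError):
            return match.group(0)  # failed call: leave the text untouched
        return f"[{name}({arg}) -> {result}]"

    return CALL_PATTERN.sub(run, text)


print(execute_calls("That is [Calculator(400 / 1400)] of the total."))
```

Unknown tools and failed calls are left unchanged in this sketch, so a bad call never corrupts the surrounding text. A real system would also need to decide which calls are worth keeping.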

Jimmy Ba | How to steer foundation models?

Keynote: After the AI Hype – What’s Real, and What’s Next - Richard Campbell - 2026

Everything I Learned Training Frontier Small Models — Maxime Labonne, Liquid AI

Large Language Models (LLMs) - Everything You NEED To Know

"AI Slop, SGD, and Multi-Index Models" – Ohad Shamir, Colloquium

AlphaFold - The Most Useful Thing AI Has Ever Done

Training Sand to Think: Artificial General Intelligence & Future of Physics

Yann LeCun's $1B Bet Against LLMs [Part 1]

Andrea Montanari | Self-induced regularization from linear regression to neural networks

The Strange Math That Predicts (Almost) Anything

[1hr Talk] Intro to Large Language Models

AI Agents for Beginners – Part 1 (Free Labs)

Dan Freed | First Proof Introduction

Visualizing transformers and attention | Talk for TNG Big Tech Day '24

Inside Anthropic, the $965 Billion AI Juggernaut | The Circuit

AI can't cross this line and we don't know why.

OWASP's Top 10 Ways to Attack LLMs: AI Vulnerabilities Exposed

Andrej Karpathy: From Vibe Coding to Agentic Engineering w/ Stephanie Zhan

Toolformer

Andrej Karpathy: Software Is Changing (Again)
