But What Are Transformers?
The transformer is arguably the most influential neural network architecture of the last decade, powering the current boom in generative AI. In this video, we review the basic ideas of the original encoder-decoder transformer architecture and look at the reasoning behind its main design decisions. Enjoy! Slides download: https://www.dropbox.com/scl/fi/x7zkyd...