Transformers - Part 1 - Self-attention: an introduction
In this video, we briefly introduce transformers and give the intuition behind self-attention. The video is part of a series on the transformer architecture (https://arxiv.org/abs/1706.03762). You can find the complete series and a longer motivation in "A series of videos on the transformer". Slides are available here: https://drive.google.com/file/d/1uCAw...
