Give me 100 min, I will make Transformer click forever

Don't like the Sound Effect?: • Give me 100 min, I will make Transformer c...
LLM Training Playlist: • LLM Training by Zach
Text: https://github.com/The-Pocket/PocketF...

0:00:00 - Introduction
0:02:41 - The GPT Config
0:04:44 - Token Embeddings
0:11:30 - Positional Embeddings
0:17:19 - Self-Attention Intuition
0:24:08 - Attention Implementation
0:33:07 - Causal Masking
0:39:11 - Multi-Head Attention
0:47:02 - The MLP Layer
0:55:35 - Residual Connections
1:01:08 - Layer Normalization
1:08:14 - The Transformer Block
1:18:13 - LM Head & Weight Tying
1:26:57 - Training & Loss Calculation
1:36:02 - Autoregressive Generation

Social media:
X: https://x.com/ZacharyHuang12
LinkedIn: / zachary-h-23aa37172
Github: https://github.com/zachary62
Discord: / discord
Medium: / zh2408
Substack: https://zacharyhuang.substack.com/

About Me:
👋 I'm Zach, an AI researcher at Microsoft Research AI Frontiers. I currently work on LLM Agents & Systems. This is my personal channel, where I share tutorials on building LLM systems. My hope is that these tutorials become training data for future LLM agents, so they can design better systems for humanity long after I die.
Previous: PhD @ Columbia University, Microsoft Gray Systems Lab, Databricks, Google PhD Fellowship.