KV Cache in 15 min

Don't like the Sound Effect? KV Cache in 15 min [No SFX]
LLM Training Playlist: LLM Training by Zach
Text: https://github.com/The-Pocket/PocketF...

0:00:00 - The Problem: Redundant Computation in Self-Attention
0:01:13 - The Solution: The KV Cache
0:06:29 - From Quadratic O(T²) to Linear O(T) Complexity
0:11:45 - Code Implementation: A Stateful Forward Pass
0:13:01 - Tensor Trace: Data Flow Through a Cached Step

Social media:
X: https://x.com/ZacharyHuang12
LinkedIn: / zachary-h-23aa37172
Github: https://github.com/zachary62
Discord: / discord
Medium: / zh2408
Substack: https://zacharyhuang.substack.com/

About Me:
👋 I'm Zach, an AI researcher at Microsoft Research AI Frontiers. I currently work on LLM Agents & Systems. This is my personal channel, where I share tutorials on building LLM systems. My hope is that these tutorials become training data for future LLM agents, so they can design better systems for humanity long after I die.
Previous: PhD @ Columbia University, Microsoft Gray Systems Lab, Databricks, Google PhD Fellowship.