Diffusion Transformers (DiT) Explained: Replacing U-Nets with Transformers
Transformers revolutionized NLP and computer vision, but can they replace U-Nets in diffusion models? In this video, we break down the DiT (Diffusion Transformer) paper by William Peebles and Saining Xie, covering:

- How diffusion models work
- Why latent diffusion matters
- Patchifying latent representations
- Conditioning methods:
  - In-context tokens
  - Cross-attention
  - adaLN / adaLN-Zero
- Why adaLN-Zero works so well
- Scaling laws in diffusion transformers
- Why GFlops matter more than parameter count
- State-of-the-art ImageNet results

We also compare DiT against traditional U-Net diffusion architectures and explain why Transformers scale so effectively for image generation.

Slides based on: “Scalable Diffusion Models with Transformers”
