Lec 08. Architectures: Transformers

MIT 6.7960 Deep Learning, Fall 2024
Instructor: Phillip Isola
View the complete course: https://ocw.mit.edu/courses/6-7960-de...
YouTube Playlist: MIT 6.7960 Deep Learning, Fall 2024

This video introduces transformers, focusing on three key ideas: tokens, attention, and positional codes. It also explores how transformers relate to MLPs, GNNs, and CNNs as variations on common principles.

License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
Support OCW at http://ow.ly/a1If50zVRlQ
