Stanford CS224N: NLP with Deep Learning | Winter 2020 | BERT and Other Pre-trained Language Models
For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/3waBO2R

Jacob Devlin, Google AI Language
https://research.google/people/106320/

Professor Christopher Manning
Thomas M. Siebel Professor in Machine Learning, Professor of Linguistics and of Computer Science
Director, Stanford Artificial Intelligence Laboratory (SAIL)
