CMU Advanced NLP Spring 2025 (16): Parallelism and Scaling

This lecture (by Sean Welleck) for CMU CS 11-711, Advanced NLP, covers the basics of training on one GPU; parallelization on multiple GPUs (e.g., data, tensor, pipeline parallel); and combining and comparing strategies. Content (including figures) is based on The Ultra-Scale Playbook: https://huggingface.co/spaces/nanotro...

CMU Advanced NLP Spring 2025 (17): Long-Context Models

Lecture 48: The Ultra Scale Playbook

Yann LeCun's $1B Bet Against LLMs [Part 1]

Designing Data-intensive Applications with Martin Kleppmann

OWASP's Top 10 Ways to Attack LLMs: AI Vulnerabilities Exposed

Emergent Complexity

CMU Advanced NLP Spring 2025 (2): Neural Text Representation and Classification

AlphaFold - The Most Useful Thing AI Has Ever Done

Visualizing transformers and attention | Talk for TNG Big Tech Day '24

Demis Hassabis: We're Three Quarters of the Way to AGI

CMU Advanced NLP Spring 2025 (11): Reinforcement Learning

Why I Left Quantum Computing Research

1: Introduction to Neural Networks and Deep Learning; Training Deep NNs

Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

How AI Cracked the Protein Folding Code and Won a Nobel Prize

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

CMU Advanced NLP Fall 2025 (22): Test-Time Scaling Strategies

CMU Advanced NLP Spring 2025 (9): Fine-tuning

CMU Advanced NLP Fall 2024 (6): Instruction Tuning

CMU Advanced NLP Spring 2025 (20): Advanced Post-Training
