CMU Advanced NLP 2024 (5): Transformers

This lecture (by Graham Neubig) for CMU CS 11-711, Advanced NLP (Spring 2024) covers:
- Transformer Architecture
- Multi-Head Attention
- Positional Encodings
- Layer Normalization
- Optimizers and Training
- LLaMa Architecture

Class Site: https://phontron.com/class/anlp2024/