Optimizing Models: Finetuning, Distillation, LoRA, and QLoRA [Lecture]
This is a single lecture from a course. If you like the material and want more context (e.g., the lectures that came before), check out the whole course: https://users.umiacs.umd.edu/~jbg/tea... (Including homeworks and reading.) Why I call models Muppet Models: • What general term should you use for model... Music: / review-and-rest
