"Efficient Finetuning of Large Language Models via Large-Width Analysis" - Soufiane Hayou

Abstract: Finetuning Large Language Models (LLMs) improves their performance on downstream tasks, which is desirable when the model is used for a specific task. Parameter-efficient finetuning methods such as LoRA (Low-Rank Adaptation) are popular because they allow finetuning large models at relatively low cost. When using LoRA, two hyperparameters critically shape learning: learning rates and initialization. In this talk, I’ll present several results on the role of initialization and learning rate in LoRA finetuning and distill these insights into practical defaults.

Bio: Soufiane Hayou is currently an assistant professor at Johns Hopkins in the Department of Applied Mathematics and Statistics, with a secondary appointment in the Department of Computer Science. He is also a member of the Data Science and AI Institute. Previously, he was a research fellow at the Simons Institute, UC Berkeley, and a visiting assistant professor of mathematics at the National University of Singapore. He obtained his PhD in statistics and machine learning from the University of Oxford in 2021, having graduated from Ecole Polytechnique in 2018 before joining Oxford. His research focuses on the theory and practice of learning at scale: theoretical analysis of large-scale neural networks, with the goal of obtaining principled methods for training and finetuning.
