Stanford CS330: Multi-Task and Meta-Learning, 2019 | Lecture 7 - Kate Rakelly (UC Berkeley)

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai

Kate Rakelly (UC Berkeley)
Guest Lecture in Stanford CS330
http://cs330.stanford.edu/

0:00 Introduction
0:17 Lecture outline
1:07 Recap: meta-reinforcement learning
3:55 What's different in RL?
5:33 PG meta-RL algorithms: recurrent Implement the policy as a recurrent network, train
7:41 PG meta-RL algorithms: gradients
9:57 How these algorithms learn to explore
15:27 What's the problem?
22:45 Meta-RL desiderata
28:43 Model belief over latent task variables POMDP for unobserved state
33:49 Posterior sampling in action
35:07 Meta-RL with task-belief states
38:18 Encoder design
43:45 Integrating task-belief with SAC
46:23 Separate task-inference and RL data
52:16 Limits of posterior sampling
55:06 Summary
