Stanford CS330: Multi-Task and Meta-Learning, 2019 | Lecture 7 - Kate Rakelly (UC Berkeley)

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai

Kate Rakelly (UC Berkeley)
Guest Lecture in Stanford CS330
http://cs330.stanford.edu/

0:00 Introduction
0:17 Lecture outline
1:07 Recap: meta-reinforcement learning
3:55 What's different in RL?
5:33 PG meta-RL algorithms: recurrent Implement the policy as a recurrent network, train
7:41 PG meta-RL algorithms: gradients
9:57 How these algorithms learn to explore
15:27 What's the problem?
22:45 Meta-RL desiderata
28:43 Model belief over latent task variables POMDP for unobserved state
33:49 Posterior sampling in action
35:07 Meta-RL with task-belief states
38:18 Encoder design
43:45 Integrating task-belief with SAC
46:23 Separate task-inference and RL data
52:16 Limits of posterior sampling
55:06 Summary