L4 TRPO and PPO (Foundations of Deep RL Series)

Lecture 4 of a 6-lecture series on the Foundations of Deep RL
Topic: Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO)
Instructor: Pieter Abbeel
Slides: https://www.dropbox.com/s/bodgpysmm6l...