L4 TRPO and PPO (Foundations of Deep RL Series)
Lecture 4 of a 6-lecture series on the Foundations of Deep RL
Topic: Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO)
Instructor: Pieter Abbeel
Slides: https://www.dropbox.com/s/bodgpysmm6l...
