Fantastic KL Divergence and How to (Actually) Compute It

Kullback–Leibler (KL) divergence measures how one probability distribution differs from another. But where does its definition come from? In this video, we give an overview of KL divergence and discuss how to develop a practical method for estimating it.

00:00 Introduction
00:52 Surprise (Self-information)
01:55 Entropy
03:24 Cross-entropy
03:42 KL divergence
04:33 Asymmetry in KL divergence
06:34 Computation challenge of KL divergence
07:13 Monte Carlo estimation
09:11 Biased estimator
10:23 Unbiased and low-variance estimator

Reference: The low-variance Monte Carlo estimator discussed in the second half of the video is from John Schulman's blog post. If you want to learn more, definitely check it out!
http://joschu.net/blog/kl-approx.html

Video made with Manim: https://www.manim.community/
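For anyone who wants to try the estimators from the last three chapters, here is a minimal Python sketch. It is not code from the video. It assumes the estimators take the forms usually attributed to the referenced blog post (the names k1, k2, k3 follow that post), and the two unit-variance Gaussians q = N(0, 1) and p = N(0.1, 1) are an illustrative choice. Samples are drawn from q, and r = p(x) / q(x).

```python
import math
import random
import statistics


def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def estimate_kl(num_samples=50_000, seed=0):
    """Estimate KL(q || p) from samples of q with three per-sample estimators.

    Assumed setup (illustrative only): q = N(0, 1), p = N(0.1, 1).
    Returns the analytic KL and, per estimator, (sample mean, sample std).
    """
    rng = random.Random(seed)
    q_mean, q_std = 0.0, 1.0
    p_mean, p_std = 0.1, 1.0

    k1, k2, k3 = [], [], []
    for _ in range(num_samples):
        x = rng.gauss(q_mean, q_std)  # x ~ q
        # log r = log p(x) - log q(x)
        log_r = log_normal_pdf(x, p_mean, p_std) - log_normal_pdf(x, q_mean, q_std)
        k1.append(-log_r)                      # naive: unbiased, high variance
        k2.append(0.5 * log_r * log_r)         # biased, low variance
        k3.append(math.expm1(log_r) - log_r)   # (r - 1) - log r: unbiased, always >= 0

    # Closed-form KL between two univariate Gaussians (0.005 for this setup).
    true_kl = (
        math.log(p_std / q_std)
        + (q_std ** 2 + (q_mean - p_mean) ** 2) / (2.0 * p_std ** 2)
        - 0.5
    )
    stats = {
        name: (statistics.fmean(values), statistics.stdev(values))
        for name, values in (("k1", k1), ("k2", k2), ("k3", k3))
    }
    return true_kl, stats


true_kl, stats = estimate_kl()
print(f"analytic KL: {true_kl:.6f}")
for name, (mean, std) in stats.items():
    print(f"{name}: mean={mean:.6f}  per-sample std={std:.6f}")
```

In this setup the per-sample standard deviation of k3 should come out far smaller than that of k1 while both means stay close to the analytic value, which is the trade-off the final chapter is about.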