Wasserstein Distance & Optimal Transport — Fully Explained
Please consider supporting us on Patreon if you enjoy our content: / thesyntheticmind

What's the best way to measure the distance between two probability distributions? This video covers optimal transport theory from the ground up. We start with the intuition of moving piles of sand, then formalize it with Monge's transport maps and Kantorovich's relaxation. Finally, we arrive at the Wasserstein distance, a true geometric metric on the space of distributions.

Topics covered:
• The sand-moving intuition
• Monge's deterministic transport maps
• Why Monge's problem sometimes has no solution
• Kantorovich's probabilistic transport plans
• The linear programming structure
• The Wasserstein distance and its properties

No prerequisites beyond basic calculus and probability.

A complete visual guide to optimal transport and the Wasserstein distance. From moving sand piles to rigorous mathematics: learn how to measure the "effort" needed to transform one distribution into another. We cover Monge's original formulation, Kantorovich's elegant relaxation, and why the Wasserstein distance has become essential in machine learning and beyond.

Timestamps in comments.
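
For readers who want something concrete before watching, here is a minimal sketch of the sand-moving idea in one dimension. It is not taken from the video. The function name `monotone_transport_plan` and its parameters are hypothetical. It assumes both distributions are discrete, live on the real line with supports sorted in ascending order, carry the same total mass, and that the cost of moving a unit of mass from x to y is |x - y|^p with p >= 1. Under those assumptions, filling the leftmost unmet demand from the leftmost remaining supply gives an optimal Kantorovich plan, so no linear programming solver is needed.

```python
def monotone_transport_plan(xs, a, ys, b, p=1):
    """Sketch: optimal transport between two 1D discrete distributions.

    xs, ys : support points, assumed sorted in ascending order
    a, b   : non-negative masses at those points, assumed to have equal totals
    p      : exponent of the cost |x - y|**p, assumed p >= 1

    Returns (plan, distance), where plan is a list of (x, y, mass) moves
    and distance is the p-th root of the total transport cost.
    """
    tol = 1e-12            # masses below this are treated as zero
    plan = []
    total_cost = 0.0
    i = j = 0
    supply, demand = a[0], b[0]

    while i < len(xs) and j < len(ys):
        # Move as much as possible from the current source to the current target.
        moved = min(supply, demand)
        if moved > tol:
            plan.append((xs[i], ys[j], moved))
            total_cost += moved * abs(xs[i] - ys[j]) ** p
        supply -= moved
        demand -= moved

        # Advance whichever side has been used up (possibly both).
        if supply <= tol:
            i += 1
            supply = a[i] if i < len(xs) else 0.0
        if demand <= tol:
            j += 1
            demand = b[j] if j < len(ys) else 0.0

    return plan, total_cost ** (1.0 / p)


# One pile at 0 must be split between -1 and +1. A deterministic map
# cannot split a single point, but a transport plan can.
plan, w1 = monotone_transport_plan([0.0], [1.0], [-1.0, 1.0], [0.5, 0.5])
print(plan)
print(w1)
```

The example at the bottom is the standard case behind "Why Monge's problem sometimes has no solution": the plan sends half of the mass at 0 to each target point. In higher dimensions the greedy sweep no longer applies, and the plan is usually found by solving the linear program directly.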
