The Curve That Broke Machine Learning

For decades, machine learning was taught through a single curve. Make a model too simple and it underfits. Make it too complex and it overfits. Somewhere in the middle lies the sweet spot: the bottom of the classical U-shaped bias-variance trade-off.

Then modern deep learning crossed the boundary. Large neural networks often have far more parameters than training examples. They can reach zero training error, perfectly interpolating the data, and yet still perform well on unseen examples. According to the old curve, that should have been catastrophic. Instead, researchers found a larger landscape: double descent.

This video follows the path from the classical U-curve to the interpolation threshold, through the chaotic peak where test error can spike, and into the second valley of the overparameterized regime. It also explains effective model complexity, why training time and data size can shift the peak, and how this framework connects to delayed generalization phenomena such as grokking.

The old map was not wrong. It was just the foothills.

00:00 The Classical U-Shaped Curve
00:45 Overfitting and the Zero-Error Boundary
01:22 Why Modern Neural Networks Should Fail
01:56 The Interpolation Threshold
02:31 Entering the Overparameterized Regime
03:15 Effective Model Complexity
03:57 Data, Training Time, and Grokking
04:38 The Second Valley of Generalization