Postgres is half as fast in Linux 7.0, and we always knew why

An AWS engineer discovered a 50% regression in Postgres throughput while testing the new Linux 7.0 kernel. The cause turns out to be a massive number of TLB misses and page faults, exacerbated by Postgres's process-based design. In this episode of the Backend Engineering Show, I dive deep into how the regression was discovered, its root cause, and the possible fixes and workarounds.

Intermediate and Advanced Backend Engineering Course Bundle
https://courses.husseinnasser.com/bundle

My Book, Root Cause: Stories and Lessons from Two Decades of Backend Engineering Bugs
https://amzn.to/4cKfZhe

0:00 Intro
2:30 The Discovery
6:30 Spinlocks
9:25 Preemption
13:00 Root Cause
17:00 How Postgres Processes exacerbated the problem
22:30 Is the fix easy?
25:50 Summary

Stay Awesome,
Hussein