Performance Ep.3: Finding API Limits with k6

In this episode, we establish a baseline for API performance using k6. The goal is simple: understand how many users the system can handle, how it behaves as load increases, and how to recognize the moment when performance starts to degrade.

This is not about maximizing numbers. It’s about learning how to reason about load, saturation, and limits before adding observability or optimizations.

Topics covered:
What k6 is and why it works well for developers
Different types of performance testing and when to use them
Closed (virtual user) vs open (request rate) load models
System saturation and nonlinear performance collapse
Applying resource limits to expose bottlenecks
Interpreting k6 results to understand system behavior

What you’ll learn:
How to write basic k6 tests in JavaScript
When to use load, spike, stress, and endurance testing
How virtual users relate to real system capacity
How to define SLA thresholds in k6
How to identify the first limiting factor under load

This episode creates the foundation for later ones, where we add Grafana, Prometheus, and deeper observability to explain why the system behaves the way it does.

Chapters:
0:00 – The question: how many users can the API handle?
0:16 – What k6 is and why use it
0:36 – Types of performance testing
1:16 – Closed model (virtual users)
2:12 – Open model (request rate)
2:58 – System saturation concepts
3:38 – Resource limiting
4:03 – First k6 test execution
5:40 – Increasing load
6:33 – Key takeaways

Resources:
GitHub repository: https://github.com/IggyCloud/eShop
Performance data: https://github.com/IggyCloud/resources
Discord: / discord

#K6 #PerformanceEngineering #LoadTesting #APITesting #Kubernetes
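
For readers who want something concrete before watching, here is a minimal sketch of how the closed and open load models and the SLA thresholds listed above might be expressed as a k6 options object. It is not the script used in the episode. The SLA numbers, durations, VU counts, and the helper names (closedModel, openModel, buildOptions) are all assumptions made for illustration. The executor names and threshold expressions follow k6's documented scenario and threshold syntax.

```javascript
// Hypothetical SLA targets, chosen only for illustration.
const SLA = { p95Ms: 500, maxErrorRate: 0.01 };

// Closed model: a pool of virtual users, each waiting for its response
// before sending the next request. Load is expressed as a VU count.
function closedModel(peakVUs) {
  return {
    executor: 'ramping-vus',
    startVUs: 0,
    stages: [
      { duration: '1m', target: peakVUs }, // ramp up
      { duration: '3m', target: peakVUs }, // hold at peak
      { duration: '1m', target: 0 },       // ramp down
    ],
  };
}

// Open model: new requests start at a fixed rate, whether or not earlier
// ones have finished. Load is expressed as requests per time unit.
function openModel(requestsPerSecond) {
  return {
    executor: 'constant-arrival-rate',
    rate: requestsPerSecond,
    timeUnit: '1s',
    duration: '3m',
    preAllocatedVUs: requestsPerSecond, // assumption: latency stays near 1 s
    maxVUs: requestsPerSecond * 4,      // assumption: headroom if it does not
  };
}

// Combine one scenario with SLA thresholds. k6 marks the run as failed
// when a threshold expression is not met.
function buildOptions(scenario) {
  return {
    scenarios: { baseline: scenario },
    thresholds: {
      http_req_duration: [`p(95)<${SLA.p95Ms}`],
      http_req_failed: [`rate<${SLA.maxErrorRate}`],
    },
  };
}

// In a real k6 script this object would be exported instead:
//   export const options = buildOptions(closedModel(50));
const options = buildOptions(closedModel(50));
console.log(JSON.stringify(options, null, 2));
```

Swapping closedModel(50) for openModel(100) changes the question being asked: the closed model shows how the system behaves with a given number of concurrent users, while the open model shows whether it can keep up with a given request rate as response times grow.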