Matrix Multiplication Deep Dive || Cache Blocking, SIMD & Parallelization - Aliaksei Sala - CppCon 2025
https://cppcon.org

Matrix Multiplication Deep Dive || Cache Blocking, SIMD & Parallelization - Aliaksei Sala - CppCon 2025

Matrix multiplication is a fundamental operation in scientific computing, game development, AI, and numerous high-performance applications. While its mathematical definition is simple, achieving optimal performance in C++ is far from trivial. In this talk, we will explore different optimization techniques for matrix multiplication, from naive implementations to highly tuned versions leveraging modern hardware features.

We will cover key performance-enhancing strategies such as loop unrolling, cache blocking, SIMD vectorization, parallelization using threads and more. Through benchmarking and profiling, we will measure the real impact of these optimizations.

By the end of this session, attendees will gain insights into two critical questions: How hard is it to implement an optimized matrix multiplication in C++? How effective is C++ for achieving peak performance in this task?

This talk is suitable for developers interested in performance optimization, computational efficiency, and modern C++ techniques for numerical computing.

Slides: https://github.com/CppCon/CppCon2025/...

Work at Hudson River Trading (HRT): https://tinyurl.com/safxfctf

Aliaksei Sala

I’m a Lead Software Engineer at EPAM Systems with over 10 years of experience in C++ and high-performance computing. My background spans embedded systems, Linux, and AI acceleration, and I’m currently working with Tenstorrent’s RISC-V–based compute platform. I enjoy digging into performance-critical code, from optimizing matrix multiplication to exploring modern C++ techniques that push hardware efficiency. I’m also active in the C++ community and excited to share my work on performance engineering at CppCon.

CppCon is the annual, week-long face-to-face gathering for the entire C++ community. The conference is organized by the C++ community for the community. You will enjoy inspirational talks and a friendly atmosphere designed to help attendees learn from each other, meet interesting people, and generally have a stimulating experience. Taking place this year in Aurora, Colorado, near the Denver airport, and including multiple diverse tracks, the conference will appeal to anyone from C++ novices to experts.

Annual CppCon Conference - https://www.cppcon.org
https://x.com/cppcon
https://mastodon.social/@CppCon

Videos Filmed & Edited by Bash Films: http://www.BashFilms.com
YouTube Channel Managed by Digital Medium Ltd: https://events.digital-medium.co.uk

#cpp #cplusplus #cppcon #cppprogramming #cplusplusprogramming #softwaredevelopment #softwareengineering #coding #code #computerscience #technology #technews #programming #programmer
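
The abstract above names cache blocking as one of the optimizations applied on top of a naive implementation. As a rough illustration only, and not the speaker's code, the sketch below contrasts a naive triple loop with a tiled version. The names `Matrix`, `matmul_naive`, `matmul_blocked`, the row-major flat layout, and the default tile size of 64 are all assumptions made for this example; a suitable tile size depends on the cache sizes of the target machine and would need to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: square n x n matrix, row-major, element (i, j) at i * n + j.
using Matrix = std::vector<double>;

// Naive version: for each output element, walk a row of A and a column of B.
// The column walk over B strides through memory, which is unfriendly to caches.
Matrix matmul_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Cache-blocked (tiled) version: process block x block tiles so that the
// working set of A, B, and C tiles is more likely to stay in cache.
// The tile size is an assumed tuning parameter, not a recommended value.
Matrix matmul_blocked(const Matrix& a, const Matrix& b, std::size_t n,
                      std::size_t block = 64) {
    block = std::max<std::size_t>(block, 1);  // guard against a zero tile size
    Matrix c(n * n, 0.0);
    for (std::size_t ii = 0; ii < n; ii += block) {
        const std::size_t i_end = std::min(ii + block, n);
        for (std::size_t kk = 0; kk < n; kk += block) {
            const std::size_t k_end = std::min(kk + block, n);
            for (std::size_t jj = 0; jj < n; jj += block) {
                const std::size_t j_end = std::min(jj + block, n);
                // Inner i-k-j order keeps accesses to B and C contiguous.
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double a_ik = a[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j) {
                            c[i * n + j] += a_ik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
    return c;
}
```

Both functions compute the same product; only the order of memory accesses differs. Whether the tiled version is actually faster, and by how much, depends on matrix size, tile size, compiler flags, and hardware, so it should be benchmarked rather than assumed.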


