Stencil computation pattern in GPU programming CUDA
In this video we discuss stencil computation, a commonly used parallel programming pattern. We walk through a GPU implementation in CUDA, identify its bottlenecks, and iteratively optimise it to arrive at the best implementation.

