NVIDIA CUDA Tutorial 8: Intro to Shared Memory
Wow, this has been a tricky tute. I originally tried to cover much more and added some coding at the end, but it was too long to be interesting. Then I chopped the coding out into a separate tute and concentrated on the theory side, and it was still way too long. Shared memory is a very intricate topic, and it's at the very core of what programming CUDA is all about. I eventually decided that there's no good in brushing over this stuff: shared memory deserves more attention. This tutorial is a little intro. It covers how to allocate shared memory, a little about what shared memory is, and an illustration of the dreaded race condition problem that comes about when resources are shared among parallel threads. Next tute we'll look in more detail at the organization of shared memory and how to get the most performance out of it. After that we will be in an excellent position to optimize the algorithm we looked at last tute. Sorry in advance if you're one of those folks that likes a bit of code in the tutes. We'll get back to coding, but this topic needs a foundation first. Cheers all!
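For readers who do want a small taste of code, here is a minimal sketch of the two ideas mentioned above: allocating shared memory and avoiding a race condition on it. It is not the code from the tutorial. The kernel name `reverseInBlock`, the helper `runReverseCheck`, and the block size `N` are all made up for illustration, and it assumes a single block of 256 threads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int N = 256;  // hypothetical block size, chosen for illustration

// Each thread stores one element into shared memory, then reads the
// element written by a *different* thread (the mirrored index).
__global__ void reverseInBlock(const int* in, int* out) {
    __shared__ int tile[N];   // statically allocated; one copy per block
    int t = threadIdx.x;
    tile[t] = in[t];
    // Barrier: every thread in the block must finish writing before any
    // thread reads. Without it, a thread could read a slot that its
    // neighbour has not written yet, which is a race condition.
    __syncthreads();
    out[t] = tile[N - 1 - t];
}

// Runs the kernel on 0..N-1 and returns the number of wrong outputs,
// or -1 if a CUDA call failed (for example, no GPU available).
int runReverseCheck() {
    int hIn[N], hOut[N];
    for (int i = 0; i < N; ++i) hIn[i] = i;

    int *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, N * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOut, N * sizeof(int)) != cudaSuccess) { cudaFree(dIn); return -1; }

    cudaMemcpy(dIn, hIn, N * sizeof(int), cudaMemcpyHostToDevice);
    reverseInBlock<<<1, N>>>(dIn, dOut);
    cudaError_t err = cudaMemcpy(hOut, dOut, N * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < N; ++i)
        if (hOut[i] != N - 1 - i) ++mismatches;
    return mismatches;
}

int main() {
    int bad = runReverseCheck();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```

If the `__syncthreads()` call is removed, the result becomes undefined: it may still look correct on some runs, which is exactly what makes race conditions so hard to spot.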

