PyTorch Symmetric Memory: A New Programming Paradigm for Distributed AI - Ke Wen & Chien-Chin Huang
Ke Wen & Chien-Chin Huang, Meta

Recent advances in models, led by DeepSeek, have highlighted the need for customized communication. In response, PyTorch introduces Symmetric Memory, a new distributed programming model that creates a global address space spanning the memory of multiple GPUs, which makes fine-grained, GPU-initiated remote access possible.

In this talk, we will demonstrate how developers can author their own communication kernels at the device level. We will also show how to interleave communication and computation within the same kernel using popular languages like Triton, achieving the finest-grained fusion, and discuss how these capabilities can integrate with the torch.compile ecosystem. We will provide concrete examples based on the all-to-all-v used in MoE models, fused communication + layer norm, and mask-aware communication driven by FlexAttention.
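For readers unfamiliar with the all-to-all-v pattern named in the abstract, here is a minimal single-process sketch of its data-movement semantics in plain Python. It is only an illustration: it does not use the Symmetric Memory API or any GPU, and the function name `all_to_all_v` and its arguments are hypothetical names chosen for this sketch. Each "rank" is modeled as a list, and `send_splits[r][d]` is assumed to be the number of elements rank `r` sends to rank `d`, as when MoE tokens are routed to experts in uneven amounts.

```python
from typing import List


def all_to_all_v(send_bufs: List[List[int]],
                 send_splits: List[List[int]]) -> List[List[int]]:
    """Simulate all-to-all-v across `world` ranks in one process.

    send_bufs[r]      -- rank r's flat send buffer
    send_splits[r][d] -- how many elements rank r sends to rank d
    Returns one receive buffer per rank, ordered by source rank.
    """
    world = len(send_bufs)
    recv_bufs: List[List[int]] = [[] for _ in range(world)]
    for src in range(world):
        # The split sizes must account for the whole send buffer.
        assert sum(send_splits[src]) == len(send_bufs[src])
        offset = 0
        for dst in range(world):
            count = send_splits[src][dst]
            # Variable-sized chunk: this is the "v" in all-to-all-v.
            recv_bufs[dst].extend(send_bufs[src][offset:offset + count])
            offset += count
    return recv_bufs


# Two ranks with uneven splits.
send_bufs = [[0, 1, 2], [10, 11, 12, 13]]
send_splits = [[1, 2], [3, 1]]
recv_bufs = all_to_all_v(send_bufs, send_splits)
print(recv_bufs)  # [[0, 10, 11, 12], [1, 2, 13]]
```

In a real multi-GPU setting the chunk sizes are typically known only at run time, which is one reason a device-initiated, fine-grained approach like the one the talk describes can be attractive for this collective.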
