Breaking the Memory Wall: How New Memory Architectures are Reshaping AI Inference
In this episode of Tech Threads: Weaving the Intelligent Future, Baya Systems’ Nandan Nayampally sits down with Charlie Cheng, founder and CEO of TC Lab, for an in-depth conversation on the memory wall and why it has become one of the defining bottlenecks in AI infrastructure. While memory constraints have existed for decades, AI inference is bringing the issue into sharper focus by turning memory bandwidth into a direct driver of user experience, system performance, and data center economics.

Charlie shares his perspective on the industry’s shift toward alternative AI architectures, from high-bandwidth memory and SRAM-based approaches to emerging 3D memory technologies and hybrid-bonded architectures that bring memory much closer to compute. He explains why inference workloads, especially token generation and KV cache access, can quickly become bandwidth-bound, and why solving that challenge requires rethinking the relationship between compute, memory, packaging, and on-chip data movement.

The discussion also explores what happens when memory bottlenecks are reduced or removed. As more bandwidth becomes available to AI accelerators, the pressure shifts to the rest of the system, including networks-on-chip, chiplet fabrics, and data movement architectures. For companies building next-generation AI chips, hyperscale infrastructure, autonomous systems, and edge inference platforms, this creates both a challenge and an opportunity: the need for more flexible, scalable, and software-defined approaches to moving data efficiently across increasingly complex systems.

Tune in for an expert look at why the future of AI performance depends as much on memory innovation and data movement as it does on compute, and how new architectures could help unlock faster, more efficient, and more scalable AI systems.
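As a rough illustration of why token generation and KV cache access become bandwidth-bound, the sketch below estimates an upper limit on single-stream decode speed. It is not taken from the episode. It rests on a common simplifying assumption: each generated token has to read all model weights and the full KV cache from memory once, and compute time is ignored. The function names and every number in the example (model size, layer count, head dimensions, context length, bandwidth) are hypothetical placeholders, not figures from the conversation.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Size of the KV cache for one sequence (hypothetical model shape).

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


def max_decode_tokens_per_second(weight_bytes, cache_bytes, bandwidth_bytes_per_s):
    """Bandwidth-only upper bound on tokens per second for one stream.

    Assumes every token streams all weights plus the KV cache from memory
    exactly once and that compute is never the limiting factor.
    """
    bytes_per_token = weight_bytes + cache_bytes
    return bandwidth_bytes_per_s / bytes_per_token


if __name__ == "__main__":
    # All values below are illustrative assumptions.
    weights = 7_000_000_000 * 2          # 7e9 parameters at 2 bytes each
    cache = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=4096)
    bandwidth = 1_000_000_000_000        # 1e12 bytes per second

    rate = max_decode_tokens_per_second(weights, cache, bandwidth)
    print(f"KV cache: {cache / 1e9:.2f} GB, bound: {rate:.1f} tokens/s")
```

Under this model the bound scales linearly with memory bandwidth and falls as the context (and therefore the KV cache) grows, which is the intuition behind moving memory closer to compute. Real systems batch requests, quantize weights, and overlap transfers, so measured throughput can differ substantially from this estimate.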
