Intelligence Per Watt with Emilio Andere
On this episode of Alexa’s Input (AI), I sit down with Emilio Andere, co-founder and CEO of Wafer, to talk about the future of AI infrastructure, inference optimization, and the economics driving the AI compute race.

We discuss:
- why “intelligence per watt” may become one of the defining metrics of the AI era
- the current GPU and accelerator landscape across NVIDIA, AMD, TPUs, and emerging hardware startups
- why software optimization is becoming just as important as hardware itself
- inference optimization strategies
- why AI infrastructure companies are racing up the stack
- what it’s actually like building an AI infrastructure startup today
- and more!

Emilio also shares lessons from founding Wafer, thoughts on the future of open-source AI infrastructure, and why he believes optimizing intelligence itself could become one of the most important engineering problems.

General Podcast Links
Watch: / @alexa_griffith
Read: https://alexasinput.substack.com/...
Listen: https://creators.spotify.com/pod/prof...
More: https://linktr.ee/alexagriffith

Learn more about the host at
Website: https://alexagriffith.com/
LinkedIn: /

Find out more about the guest at:
LinkedIn: / wafer
Website: https://www.wafer.ai/
Wafer AI / Y Combinator Article: https://www.ycombinator.com/companies...
Chapters
00:00 Exploring AI Conversations and Recent Podcasts
02:14 Intelligence per Watt: A New Metric for AI
07:35 The Manifesto: Efficiency in Civilization
12:40 Founding Wafer: The Journey Begins
18:08 The GPU Hardware Landscape and Market Dynamics
23:07 AMD's Growing Presence in the GPU Market
24:07 Emerging Competitors in the AI Hardware Space
26:04 Comparing TPUs and GPUs
27:21 Acquisition and Availability of TPUs
28:33 Navigating the GPU Marketplace
30:05 Understanding Neo Cloud Economics
33:30 The AI Bubble Debate
36:25 Optimizing AI Models for Performance
44:46 Bottlenecks in AI Model Performance
48:08 Future Directions in AI Hardware Optimization
54:39 Balancing Speed and Cost in AI Performance
56:54 Kernel Arena: Benchmarking AI Performance
01:03:45 Lessons from Founding: Sales and Emotional Resilience
01:07:38 The Future of AI: Trends and Predictions
01:13:03 Outro

Keywords
AI hardware, inference optimization, intelligence per watt, GPU market, AI infrastructure, Wafer, AI bubble, TPU, GPU bottleneck, AI efficiency, AI optimization, large language models, quantization, speculative decoding, benchmarking, model training, AI startups


