Intelligence Per Watt with Emilio Andere

On this episode of Alexa’s Input (AI), I sit down with Emilio Andere, co-founder and CEO of Wafer, to talk about the future of AI infrastructure, inference optimization, and the economics driving the AI compute race.

We discuss:
- why “intelligence per watt” may become one of the defining metrics of the AI era
- the current GPU and accelerator landscape across NVIDIA, AMD, TPUs, and emerging hardware startups
- why software optimization is becoming just as important as hardware itself
- inference optimization strategies
- why AI infrastructure companies are racing up the stack
- what it’s actually like building an AI infrastructure startup today
- and more!

Emilio also shares lessons from founding Wafer, thoughts on the future of open-source AI infrastructure, and why he believes optimizing intelligence itself could become one of the most important engineering problems.

General Podcast Links
Watch: / @alexa_griffith
Read: https://alexasinput.substack.com/...
Listen: https://creators.spotify.com/pod/prof...
More: https://linktr.ee/alexagriffith

Learn more about the host at
Website: https://alexagriffith.com/
LinkedIn: /

Find out more about the guest at:
LinkedIn: / wafer
Website: https://www.wafer.ai/
Wafer AI / Y Combinator Article: https://www.ycombinator.com/companies...

Chapters
00:00 Exploring AI Conversations and Recent Podcasts
02:14 Intelligence per Watt: A New Metric for AI
07:35 The Manifesto: Efficiency in Civilization
12:40 Founding Wafer: The Journey Begins
18:08 The GPU Hardware Landscape and Market Dynamics
23:07 AMD's Growing Presence in the GPU Market
24:07 Emerging Competitors in the AI Hardware Space
26:04 Comparing TPUs and GPUs
27:21 Acquisition and Availability of TPUs
28:33 Navigating the GPU Marketplace
30:05 Understanding Neo Cloud Economics
33:30 The AI Bubble Debate
36:25 Optimizing AI Models for Performance
44:46 Bottlenecks in AI Model Performance
48:08 Future Directions in AI Hardware Optimization
54:39 Balancing Speed and Cost in AI Performance
56:54 Kernel Arena: Benchmarking AI Performance
01:03:45 Lessons from Founding: Sales and Emotional Resilience
01:07:38 The Future of AI: Trends and Predictions
01:13:03 Outro

Keywords
AI hardware, inference optimization, intelligence per watt, GPU market, AI infrastructure, Wafer, AI bubble, TPU, GPU bottleneck, AI efficiency, AI optimization, large language models, quantization, speculative decoding, benchmarking, model training, AI startups