Interview with NVIDIA Dynamo Architect Kyle Kranen
In this episode, Nader and Carter interview NVIDIA Dynamo architect Kyle Kranen to learn what Dynamo is and how it can increase throughput for models like DeepSeek-R1 by up to 30x! You have 3 levers when running inference on AI models: quality, cost, and speed. For example, reasoning models like DeepSeek-R1 use test-time scaling, where asking the model to think improves quality but reduces speed and increases cost. We dive into how NVIDIA Dynamo lets you tune all 3 levers through techniques like disaggregation, KV offloading, and KV routing.

Read: https://developer.nvidia.com/blog/int...
Follow Kyle ➡️ / kyle-kranen
Follow Carter ➡️ / carter-abdallah-958666140
Follow Nader ➡️ / naderlikeladder
