Edwin Chen: Why Frontier Labs Are Diverging, RL Environments & Developing Model Taste

Ask Jacob a Question: https://docs.google.com/forms/d/1vHBY...

Edwin Chen is the founder and CEO of Surge AI, the data infrastructure company behind nearly every major frontier model. Surge works with OpenAI, Anthropic, Meta, and Google, providing the high-quality data and evaluation infrastructure that powers their models. Edwin reveals why optimizing for popular benchmarks like LMArena is "basically optimizing for clickbait," how one frontier lab's models regressed for 6-12 months without anyone knowing, and why the industry's approach to measurement is fundamentally broken. Jacob and Edwin discuss what actually makes elite AI evaluators, why "there's never going to be a one size fits all solution" for AI models, and how frontier labs are taking surprisingly divergent paths to AGI.

0:00 Intro
0:56 The Pitfalls of Optimizing for LMArena
4:34 Issues with Data Quality and Measurement
9:44 The Importance of Human Evaluations
13:40 The Rise of RL Environments
17:21 Challenges and Lessons in Model Training
19:59 Silicon Valley's Pivot Culture
23:06 Technology-Driven Approach
24:18 Quality Beyond Credentials
27:51 Impact of Scale Acquisition
28:35 Hiring for Research Culture
30:48 Divergence in AI Training Paradigms
34:16 Future of AI Models
39:32 Multimodal AI and Quality
43:44 Quickfire

With your co-hosts:
@jacobeffron: Partner at Redpoint, Former PM Flatiron Health
@patrickachase: Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia: Former COO GitHub, Founder Bitnami (acq’d by VMware)
@jordan_segall: Partner at Redpoint