Edwin Chen: Why Frontier Labs Are Diverging, RL Environments & Developing Model Taste

Ask Jacob a Question: Question Form: https://docs.google.com/forms/d/1vHBY...

Edwin Chen is the founder and CEO of Surge AI, the data infrastructure company behind nearly every major frontier model. Surge works with OpenAI, Anthropic, Meta, and Google, providing the high-quality data and evaluation infrastructure that powers their models.

Edwin reveals why optimizing for popular benchmarks like LMArena is "basically optimizing for clickbait," how one frontier lab's models regressed for 6-12 months without anyone knowing, and why the industry's approach to measurement is fundamentally broken. Jacob and Edwin discuss what actually makes elite AI evaluators, why "there's never going to be a one size fits all solution" for AI models, and how frontier labs are taking surprisingly divergent paths to AGI.

0:00 Intro
0:56 The Pitfalls of Optimizing for LMArena
4:34 Issues with Data Quality and Measurement
9:44 The Importance of Human Evaluations
13:40 The Rise of RL Environments
17:21 Challenges and Lessons in Model Training
19:59 Silicon Valley's Pivot Culture
23:06 Technology-Driven Approach
24:18 Quality Beyond Credentials
27:51 Impact of Scale Acquisition
28:35 Hiring for Research Culture
30:48 Divergence in AI Training Paradigms
34:16 Future of AI Models
39:32 Multimodal AI and Quality
43:44 Quickfire

With your co-hosts:
@jacobeffron Partner at Redpoint, Former PM Flatiron Health
@patrickachase Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia Former COO GitHub, Founder Bitnami (acquired by VMware)
@jordan_segall Partner at Redpoint

Gemini Co-Lead on World Models, RL's Next Domains & Continual Learning

How to sell RL envs and data to AI labs: Interview with Sean Cai

Building Surge AI to $1 Billion with Edwin Chen

Tri Dao: The End of Nvidia's Dominance, Why Inference Costs Fell & The Next 10X in Speed

"It's cognitive uploading" | How Google NotebookLM's Steven Johnson uses AI as a second brain

RL Environments at Scale – Will Brown, Prime Intellect

AI Research Legend’s Honest Assessment of Where We Are

Yann LeCun on What Comes After LLMs

Inside YC's AI Playbook

Building the GitHub for RL Environments: Prime Intellect's Will Brown & Johannes Hagemann

The Collaboration that Built Modern AI: Geoff Hinton & Jeff Dean in Conversation with Jordan Jacobs

AI Talent Wars, xAI’s $200B Valuation, & Google’s Comeback

John Schulman on dead ends, scaling RL, and building research institutions

Ex-OpenAI Researcher On Why He Left, His Honest AGI Timeline, & The Limits of Scaling RL

Why the AI Boom Is Just Getting Started

Inference, Diffusion, World Models, and More | YC Paper Club

How SpaceX Humiliated Wall Street

No Priors Ep. 124 | With SurgeAI Founder and CEO Edwin Chen

The Startup Powering The Data Behind AGI

The AI Frontier: from Gemini 3 Deep Think distilling to Flash — Jeff Dean
