Principles of Evals: The Future of GenAI Evaluation (E.43)
LLMs are optimized to sound convincing, not to know when they’re wrong. In this episode, Deanna Emery breaks down why hallucinations are fundamentally tied to how language models work, why confidence is often disconnected from correctness, and how better evaluation strategies can make AI systems more reliable in production. We also get into uncertainty, semantic reasoning, and what humans still do better than models.

00:00 - Why LLMs hallucinate confidently
09:00 - The limits of current eval systems
18:00 - Why uncertainty matters in AI
27:00 - Semantic reasoning vs memorization
38:00 - What humans still do better than models

The biggest risk in AI isn’t wrong answers. It’s wrong answers delivered with confidence.
