How AI Engineers Improve Agentic Products

Anyone can be a math and science person with Brilliant! Visit https://brilliant.org/AdamLucek/ to start learning and save 20% off an annual premium subscription.

Resources:
Content Discussed - https://lucek.ai/blogs/llm-evaluations
Evaluizer - https://github.com/ALucek/evaluizer
LLM Evals FAQ - https://hamel.dev/blog/posts/evals-faq/
Short Musings on AI Engineering and "Failed AI Projects" - https://www.sh-reya.com/blog/ai-engin...
Product Evals in Three Simple Steps - https://eugeneyan.com/writing/product...
An LLM-as-Judge Won't Save The Product—Fixing Your Process Will - https://eugeneyan.com/writing/eval-pr...
A Field Guide to Rapidly Improving AI Products - https://hamel.dev/blog/posts/field-gu...
Who Validates the Validator - https://arxiv.org/pdf/2404.12272

Chapters:
00:00 - Why do we need to improve?
05:20 - Brilliant!
07:13 - Context Continued
09:10 - What Are LLM Evals?
12:11 - Human Feedback
13:48 - Creating the Initial Feedback Set
16:15 - Annotation Part 1
19:56 - Performing Error Analysis
26:14 - LLM-As-A-Judge
27:44 - LLM Judge Pitfalls
29:46 - LLM Judge Alignment
33:11 - Function Evaluations
36:10 - Observability Platforms
39:09 - The Benefits
40:21 - Benefit: Algorithmic Optimization
42:37 - Benefit: Reinforcement Learning
44:28 - Future Checklist
47:10 - Is it Worth It?

This video is sponsored by Brilliant

#ai #coding #datascience
