AI Evaluation in the wild: from RAG pipelines to agents, By Raul Singh

A talk by Raul Singh, AI R&D Engineer at Datapizza.

Evaluation is one of the most critical and most underrated challenges in building GenAI systems. The talk explores how to approach it and which metrics to use. It starts from AI pipelines such as RAG, where each component has to be evaluated separately and golden datasets make the difference. It then moves on to highly autonomous agentic systems, where static evaluation is not enough and entire simulated environments are needed.

Watch the talk at Py4AI 2026, the international conference where Python and Artificial Intelligence enthusiasts meet and discover the latest innovations in the field.

June 13, 2026 | Pavia, Italy
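To make the idea of evaluating one RAG component in isolation against a golden dataset more concrete, here is a minimal sketch in Python. It is not taken from the talk: the golden dataset, the document IDs, the `toy_retriever` function and the choice of recall@k as the metric are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

# Hypothetical golden dataset: each query maps to the IDs of the documents
# a human reviewer marked as relevant. Queries and IDs are invented.
GOLDEN: Dict[str, Set[str]] = {
    "refund policy": {"doc_refunds"},
    "reset password": {"doc_auth", "doc_account"},
}


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def evaluate_retriever(
    retrieve: Callable[[str], List[str]],
    golden: Dict[str, Set[str]],
    k: int = 3,
) -> float:
    """Mean recall@k of the retriever alone, with no generation step involved."""
    scores = [recall_at_k(retrieve(q), rel, k) for q, rel in golden.items()]
    return sum(scores) / len(scores)


def toy_retriever(query: str) -> List[str]:
    """Stand-in for a real retriever: returns a fixed ranking per query."""
    index = {
        "refund policy": ["doc_refunds", "doc_shipping", "doc_faq"],
        "reset password": ["doc_auth", "doc_faq", "doc_billing"],
    }
    return index.get(query, [])


mean_recall = evaluate_retriever(toy_retriever, GOLDEN, k=3)
print(f"mean recall@3 = {mean_recall:.2f}")
```

Scoring the retriever on its own like this separates retrieval failures from generation failures. The generator would need its own metric and its own golden answers, which this sketch does not cover.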
