Mesurer l’immesurable : Comment évaluer les systèmes à base d’IA générative ? (Measuring the Immeasurable: How to Evaluate Generative AI Systems)
Presentation by: Erin Pacquetet (SCIAM)

📕 Summary: Generative AI is changing application development and opening up a range of uses: assistants, content generation, augmented search, and support for complex tasks. One major challenge remains: accurately evaluating products built on models that are both creative and unpredictable.

This session explores that paradox: leveraging LLMs while keeping control over how their outputs are evaluated. We will see how to adjust criteria and methods to assess technical accuracy, consistency, and business relevance. The program covers the limitations of traditional metrics, automated evaluation via "LLM-as-a-judge" (and its biases), the importance of human evaluation, and continuous monitoring to detect drift and side effects.

We will analyze the case of a RAG chatbot, where linguistic creativity and the requirement for truthfulness clash. Evaluation has to balance factuality and fluency, checking the accuracy of answers even though the questions users ask cannot be controlled. This real-world case study will serve as our guide to implementing a comprehensive and reproducible evaluation pipeline.

The session provides reference points and tools for evaluating generative systems methodically and turning them into a strategic asset.

Recorded in April 2026 in Paris, Palais des Congrès, Porte Maillot.

🔥 To stay up-to-date with Devoxx France news, follow us on:
LinkedIn: / devoxx-france
Bluesky: https://bsky.app/profile/devoxx.fr
Visit our website: https://www.devoxx.fr/
