The 4 Things Every QA Engineer Must Test in AI Agents (Live Promptfoo Workshop)
AI workshop 1
Traditional QA is breaking. If you input X, you expect Y, but AI agents don't work that way. In this live, hands-on workshop I show senior QA engineers how to actually test AI agents and chatbots: the 4 things that matter (accuracy, safety, cost, latency), how to build real tests with Promptfoo + the Gemini API, how to catch jailbreaks and hallucinations, and why "90% correct" is the new pass criterion. Open your laptop and build it right alongside me. By the end you'll have a working AI chatbot test suite and a testing strategy that won't blow up your token bill.

⚙️ WHAT YOU'LL LEARN
• Why the classic X → Y → Z model fails for AI products
• The 4 pillars of testing any AI agent: accuracy, safety, cost, latency
• How AI tokens & cost actually work (and why 1,000 random tests is a mistake)
• Building a Promptfoo config from scratch with a free Gemini API key
• LLM-rubric assertions: politeness, groundedness, hallucination checks
• Jailbreak / prompt-injection safety tests (the airline chatbot lawsuit)
• Running evals, reading the report, and debugging real failures live
• How to prioritise test cases by risk & token cost (reduce the blast radius)
• A first look at Midscene (vision-based UI testing) + a Playwright AI agent

🔗 TOOLS & LINKS
• Promptfoo: https://www.promptfoo.dev
• Google AI Studio (free Gemini API key): https://aistudio.google.com/app/apikey
• Midscene: https://midscenejs.com
• Config / repo: https://github.com/JagadeeshJayachand...

🧪 WHO THIS IS FOR
QA engineers, SDETs, and test automation folks moving from Selenium/Playwright into testing AI agents, LLM apps, RAG systems and chatbots.

👉 The full Midscene + Playwright agent build is happening LIVE in the next session. Hit SUBSCRIBE so you don't miss it, and register here: https://subscribe-forms.beehiiv.com/a...
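To make the "90% correct" and token-cost points a bit more concrete, here is a rough Python sketch. It is not code from the workshop and does not use Promptfoo: `EvalResult`, `pass_rate`, `suite_passes` and `estimated_cost_usd` are hypothetical names, and the per-1,000-token prices are made-up placeholders, not real Gemini pricing.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """One run of one test case against the chatbot (hypothetical shape)."""
    passed: bool
    prompt_tokens: int
    completion_tokens: int


def pass_rate(results):
    """Fraction of runs that passed (0.0 when there are no runs)."""
    if not results:
        return 0.0
    return sum(r.passed for r in results) / len(results)


def suite_passes(results, threshold=0.9):
    """Treat the suite as green when at least `threshold` of runs passed."""
    return pass_rate(results) >= threshold


def estimated_cost_usd(results, usd_per_1k_prompt, usd_per_1k_completion):
    """Token cost of a batch of runs, given assumed per-1,000-token prices."""
    prompt = sum(r.prompt_tokens for r in results)
    completion = sum(r.completion_tokens for r in results)
    return (prompt / 1000) * usd_per_1k_prompt + (completion / 1000) * usd_per_1k_completion


# Ten runs of the same scenario: nine pass, one fails.
runs = [EvalResult(True, 200, 100) for _ in range(9)] + [EvalResult(False, 200, 100)]

print(f"pass rate: {pass_rate(runs):.0%}, suite passes: {suite_passes(runs)}")
# Placeholder prices, for illustration only.
print(f"estimated cost: ${estimated_cost_usd(runs, 0.001, 0.002):.4f}")
```

The same arithmetic shows why 1,000 random tests can get expensive: cost grows linearly with the number of runs, so it usually pays to spend tokens on the highest-risk scenarios first.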
⏱️ CHAPTERS
0:00 Intro: I ran a live AI workshop for QA engineers
0:35 The 3 AI tools we'll use (Promptfoo, Midscene, Playwright agent)
2:12 Why traditional QA testing breaks for AI (X → Y → Z)
5:00 The 4 things you must test: accuracy, safety, cost, latency
6:00 Safety & the airline chatbot refund lawsuit
7:53 How AI tokens & cost actually work
9:55 Latency: response speed
10:20 Recap of the 4 pillars + let's open Promptfoo
11:10 Install Promptfoo + prerequisites (Node)
17:22 Get a free Gemini API key
19:37 Set environment variables (Mac & Windows)
28:00 Build the Promptfoo config (YAML, customer-support agent)
34:11 Test scenarios: accuracy, jailbreaks & hallucinations (LLM rubric)
38:48 Run the eval (+ live debugging)
45:37 Read the report: what passed & what failed
51:12 Why AI tests are about probability (the 90% rule)
54:07 Test strategy: prioritise by risk & token cost
59:21 Promptfoo for red teaming, pen testing & RAG
1:01:07 Midscene: vision-based UI testing (live attempt)
1:07:49 Wrap-up & next session

#AItesting #QAengineer #Promptfoo #SoftwareTesting #AItools #LLM #TestAutomation #AIagents #Gemini #Chatbot
