How to Stop Your AI from Making Things Up (RAG)

Every LLM hallucinates. ChatGPT, Claude, Gemini: it doesn't matter how big the model is. Ask it about your private docs or your company's policies and it can confidently invent an answer that sounds correct and is completely wrong. The fix isn't a smarter model. The fix is RAG, a system that retrieves real context and hands it to your LLM before it generates a single word. This session is the full breakdown.

This is Week 7 of TAI's 12-week AI Engineering cohort. Our instructor, Dr Akshika (Aki) Wijesundara, walks through the full RAG mental model: chunking strategies, retrieval, re-ranking, the live build, and the eval metrics nobody talks about. By the end you'll know exactly what every part of a RAG pipeline does and which decisions actually move the needle.

🎓 JOIN THE NEXT COHORT → https://theaiinternship.com/
• 12 weeks, live mentor-led sessions
• Build real projects: RAG, agents, fine-tuning, MCP, capstone
• Small batches, direct access to instructors
• Career support + portfolio reviews

━━━━━━━━━━━━━━━━━━━━━

What you'll understand in 47 minutes:
✓ Why every LLM hallucinates (and why fine-tuning doesn't fix it)
✓ The full RAG pipeline: chunk → embed → store → retrieve → re-rank → generate
✓ Five chunking strategies and when to pick each (fixed / document / semantic / recursive / agentic)
✓ Pre-chunking vs post-chunking, and why it matters
✓ Live demo: building a working RAG pipeline from scratch
✓ The eval metrics most tutorials skip (hit@K, NDCG, BLEU, exact match)
✓ How to safeguard your LLM from bad queries
✓ Tracking + observability with Langfuse

🕐 CHAPTERS
0:00 Welcome: what is RAG?
2:33 RAG = Retrieval Augmented Generation
3:03 Why LLMs hallucinate without context
4:42 The RAG workflow, high-level
8:51 The full RAG pipeline
9:11 Chunking: your first big decision
13:44 Three chunking strategies (fixed / document / semantic)
17:06 Pre-chunking vs post-chunking
18:21 Recursive + hierarchical chunking
19:52 LLM-based + agentic chunking
22:23 Parent + child chunking
24:14 Retrieval: the second pillar
26:03 Live demo: building a RAG pipeline
28:46 Generating embeddings
29:55 Creating the vector database
33:48 Retrieval step deep-dive
35:51 The RAG pipeline object
37:59 Why RAG is hard: it's probabilistic
38:40 RAG evaluation: hit@K, NDCG, BLEU
43:09 Safeguarding your LLM from bad queries
44:19 This week's homework
45:36 Q&A: Langfuse + tracking
46:53 Wrap

🛠️ TOOLS MENTIONED
• Weaviate: https://weaviate.io
• Pinecone: https://pinecone.io
• Langfuse (tracking + eval): https://langfuse.com
• LangChain RAG: https://python.langchain.com/docs/tut...
• sentence-transformers: https://www.sbert.net
• OpenAI Embeddings: https://platform.openai.com/docs/guid...

━━━━━━━━━━━━━━━━━━━━━

📚 THE FULL CURRICULUM
W01: Environment Setup & Your First OpenAI Call
W02: Build a ChatGPT Clone (Two Ways)
W03: REST APIs, JWT & FastAPI (Unhackable Backend)
W04: Vector Embeddings & Semantic Search
W05: Fine-Tuning ChatGPT (Turn It Into Your Company Intern)
W08: LangChain & LangGraph
W09: Build an MCP Server
W11–W12: Capstone project + showcase

→ Apply: https://theaiinternship.com/

About TAI: The AI Internship
We train engineers to ship real AI products. 12-week mentor-led cohorts, real codebases, real deployment, real career outcomes.

#RAG #AIHallucination #LLM #RetrievalAugmentedGeneration #AIEngineering #VectorDatabase #ChatGPT #TheAIInternship #BuildWithAI #PromptEngineering
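
📎 QUICK REFERENCE: hit@K and NDCG
For anyone who wants a concrete picture of the two retrieval metrics named in the chapters, here is a minimal Python sketch. It is not the code from the session. The function names, the document ids, and the relevance labels are all hypothetical, and it assumes the common log2-discount form of NDCG with hand-labelled relevance scores.

```python
import math


def hit_at_k(retrieved_ids, relevant_ids, k):
    """Return 1.0 if any of the top-k retrieved ids is relevant, else 0.0."""
    top_k = retrieved_ids[:k]
    return 1.0 if any(doc_id in relevant_ids for doc_id in top_k) else 0.0


def ndcg_at_k(retrieved_ids, relevance, k):
    """NDCG@k with a log2 discount.

    relevance maps doc id -> graded relevance label (hypothetical,
    hand-labelled); ids missing from the map count as 0.
    """
    gains = [relevance.get(doc_id, 0) for doc_id in retrieved_ids[:k]]
    dcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))
    # Ideal ordering: the best possible labels, sorted high to low.
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: the retriever returned d3, d1, d7 for one query,
# and a human marked d1 and d2 as the relevant chunks.
retrieved = ["d3", "d1", "d7"]
relevance = {"d1": 1, "d2": 1}

hit_1 = hit_at_k(retrieved, set(relevance), k=1)
hit_3 = hit_at_k(retrieved, set(relevance), k=3)
ndcg_3 = ndcg_at_k(retrieved, relevance, k=3)
```

hit@K only asks whether something relevant showed up in the top K, while NDCG also rewards putting relevant chunks higher in the ranking. In practice you would average both over a set of labelled queries rather than judge a single one.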