A Semantic Cache using LangChain

A common concern for developers building AI applications is how quickly answers from LLMs reach their end users, and how often those answers can be reused. Caches are the usual way to address this, and Redis is one of the best options for implementing one at the speed and scale developers need. However, reusing exact matches from the cache is only part of the story. There is also a need for a cache that can reuse answers based on the semantic meaning of questions, so that different users asking for the same thing in different words can share the same response. This is why Redis created a semantic cache: a cache that applies vector search to previously stored answers.

In this video, Ricardo Ferreira, Developer Advocate at Redis, shows how to implement a semantic cache using LangChain and how to integrate it with an LLM powered by OpenAI to reuse answers stored in Redis.

00:00 What is the use case?
01:45 Setting up Redis
03:50 Redis as standard cache
10:00 Redis as semantic cache
17:00 Deleting the data

🧑🏻‍💻 GitHub repository:
▪️ LangChain apps with Redis: https://github.com/redis-developer/la...

💡 Creating an OpenAI API key:
▪️ https://platform.openai.com/docs/quic...
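The difference between an exact-match cache and a semantic cache can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the code from the video and not the LangChain or Redis API: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a Redis vector index, and the names `SemanticCache`, `answer_with_cache`, and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """In-memory stand-in for a vector index holding (question vector, answer) pairs."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity that counts as a hit
        self.entries = []

    def lookup(self, question):
        """Return the answer of the most similar stored question, or None on a miss."""
        query = embed(question)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def store(self, question, answer):
        self.entries.append((embed(question), answer))


def answer_with_cache(cache, question, llm):
    """Serve from the cache when a similar question exists; otherwise call the LLM."""
    cached = cache.lookup(question)
    if cached is not None:
        return cached
    answer = llm(question)
    cache.store(question, answer)
    return answer
```

With this sketch, a question such as "what is the capital city of France" would reuse the answer stored for "What is the capital of France?", whereas an exact-match cache keyed on the prompt string would miss. The threshold is the main tuning knob: too low and unrelated questions share answers, too high and the cache behaves like an exact-match one.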
