Kevin Liang – Performance Tuning Apache Solr for Dense Vectors #bbuzz
More: https://2025.berlinbuzzwords.de/sessi...
Speaker: Kevin Liang

While powerful, dense vector search is not a plug-and-play feature that scales out of the box, particularly when it comes to extracting maximum performance from limited compute resources. Come learn how we tuned dense vector indexes for our 100M+ document dataset and drastically sped up our queries.

With the recent boom in AI, many organizations are building semantic search stacks from scratch, powered by Apache Solr and dense vectors. What many quickly learn is just how heavy the compute requirements of vector search are compared with lexical search. If not well tuned, vector search query latency can quickly skyrocket, even with an otherwise reasonably sized dataset. We experienced this pain firsthand when we started vectorizing a 100M+ document dataset. While one can certainly approach this problem head-on by throwing hardware resources at it, this is neither a cheap nor a fully effective solution.

This talk will cover a brief introduction to how Apache Solr/Lucene builds dense vector indexes, the journey of how we optimized our dense vector setup, and the pitfalls and best practices we learned along the way. Whether you're a company building out full RAG pipelines or an enthusiast playing around with a novel alternative to standard lexical search, you're going to want to squeeze the most performance out of your limited compute resources. Let us help you hit the ground running.

### Follow us on Social Media and join the Community!

Mastodon: https://floss.social/@BerlinBuzzwords
LinkedIn: berlin-buzzwords
Website: https://berlinbuzzwords.de

Berlin Buzzwords is an event by Plain Schwarz – https://plainschwarz.com


