From Streaming to Search: How Exa Uses Lance and Apache Spark for High-Throughput AI Workloads
AI-native applications require data systems that combine large-scale distributed processing with fast multimodal access. This talk explores how Exa uses Lance and Spark Structured Streaming to power search and AI workloads.

We introduce Lance, an open lakehouse format optimized for vectors, multimodal data, and fast random access, and show how its Spark connector enables scalable ETL, streaming ingestion, analytics, and native vector and full-text search.

We then present Exa’s streaming architecture for processing large volumes of crawled web data. Using Spark Structured Streaming with Delta and writing enriched outputs into Lance, the pipeline performs local and global deduplication, generates embeddings, and sustains ~10k rows per second into Lance tables that power downstream vector search databases.

We conclude with lessons learned integrating Lance and Spark to unify analytics, training, and semantic retrieval within an open architecture.

Talk by: Jack Ye, Software Engineer, LanceDB; Jan van der Vegt, ML Engineer, Exa AI

Connect with us: Website: https://databricks.com | X: databricks | LinkedIn: databricks | Instagram: databricksinc | Facebook: databricksinc
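The description mentions local and global deduplication without defining them. As a rough illustration only, the sketch below assumes "local" means removing duplicates within a single micro-batch and "global" means removing rows whose key was already seen in earlier batches. It uses plain Python rather than Spark or Lance, and the names content_key, dedup_local, dedup_global, and process_batch, as well as the choice of a SHA-256 hash over normalized text as the key, are hypothetical and not taken from the talk.

```python
import hashlib


def content_key(row: dict) -> str:
    """Hypothetical dedup key: SHA-256 of lowercased, whitespace-normalized text."""
    normalized = " ".join(row["text"].lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedup_local(batch: list) -> list:
    """Keep the first row per key within one micro-batch (assumed meaning of 'local')."""
    seen_in_batch = set()
    kept = []
    for row in batch:
        key = content_key(row)
        if key not in seen_in_batch:
            seen_in_batch.add(key)
            kept.append(row)
    return kept


def dedup_global(batch: list, seen_keys: set) -> list:
    """Drop rows whose key appeared in any earlier batch (assumed meaning of 'global').

    seen_keys stands in for durable state; a real pipeline would keep it in
    a table or state store rather than an in-memory set.
    """
    kept = []
    for row in batch:
        key = content_key(row)
        if key not in seen_keys:
            seen_keys.add(key)
            kept.append(row)
    return kept


def process_batch(batch: list, seen_keys: set) -> list:
    """Apply the local pass first, then the global pass."""
    return dedup_global(dedup_local(batch), seen_keys)
```

Running the local pass first keeps the global lookup small, since each key is checked against the shared state at most once per batch. In a streaming job this per-batch function would typically be invoked once per micro-batch, with embedding generation and the write to the output table happening after it.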
