Unlocking Near Real Time Data Replication with CDC, Apache Spark™ Streaming, and Delta Lake
Tune into DoorDash's journey from a flaky ETL system with 24-hour data delays to a standardized CDC streaming pattern across more than 150 databases, producing near real-time data in a scalable, configurable, and reliable manner. Along the way, learn how we use Delta Lake to build a self-serve, read-optimized data lake with data latencies of 15, while reducing operational overhead. We also cover how certain tradeoffs, such as conceding to a non-real-time system, allow for multiple optimizations while still permitting OLTP query use cases, and the benefits this provides.

Talk by: Ivan Peng and Phani Nalluri

Here's more to explore:
Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV
The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

