Unlocking Near Real Time Data Replication with CDC, Apache Spark™ Streaming, and Delta Lake

Follow DoorDash's journey from a flaky ETL system with 24-hour data delays to a standardized CDC streaming pattern across more than 150 databases, producing near real-time data in a scalable, configurable, and reliable manner. Along the way, learn how we use Delta Lake to build a self-serve, read-optimized data lake with data latencies of 15, while reducing operational overhead. We also cover how certain tradeoffs, such as conceding to a non-real-time system, allow for multiple optimizations while still permitting OLTP query use cases, and the benefits this provides.

Talk by: Ivan Peng and Phani Nalluri

Here's more to explore:
Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV
The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI
