Achieving Lakehouse Models with Spark 3.0

It’s very easy to be distracted by the latest and greatest approaches in technology, but sometimes there’s a reason old approaches stand the test of time. Star schemas and Kimball modelling are among those things that aren’t going anywhere, but as we move towards the “Data Lakehouse” paradigm, how appropriate is this modelling technique, and how can we harness the Delta Engine and Spark 3.0 to maximise its performance? This session looks through the historical problems of attempting to build star schemas in a lake and steps through a series of technical examples using features such as the Delta file format, Dynamic Partition Pruning and Adaptive Query Execution to tackle these problems.

Accelerating Data Ingestion with Databricks Autoloader

Building a Lakehouse Architecture with Azure Databricks with Christopher Chalcraft

What is Spark? (Visual Explanation)

|Keynote| Data Modeling in the Era of the Lakehouse

New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, and Koalas

The Best Data Warehouse is a Lakehouse

Delta Lake Streaming: Under the Hood

Data Warehouse vs Data Lake vs Data Lakehouse | ETL, OLAP vs OLTP

Parquet File Format - Explained to a 5 Year Old!

Introducing MLflow for End-to-End Machine Learning on Databricks

Common mistakes in big data models

Behind the Hype: Should you ever build a Data Vault in a Lakehouse?

Databricks, Delta Lake and You

What's Wrong with the Medallion Architecture?

Getting Started with Databricks SQL

Delta Lake 2.0 Overview

Delta Live Tables: Modern Software Engineering and Management for ETL

The Parquet Format and Performance Optimization Opportunities Boudewijn Braams (Databricks)

Migrating to Databricks Masterclass: Modernization Must-Haves

Data Mesh, Data Fabric, Data Lakehouse - SQLBits 2022
