Advancing Spark - Understanding Low Shuffle Merge
Back in Databricks Runtime 9.0 we saw the introduction of a preview "Low Shuffle Merge" feature, but it seemed to go fairly unnoticed. In DBR 10.4, it's now enabled by default and a fully GA part of the platform... but what does it actually do? In this video, Simon walks through the theory of low shuffle merge and what you should expect to see happen to both your runtime executions and the data layout before and after the change. Make no mistake, it's a real speed boost to many common patterns, so use it if you can! For more info on Low Shuffle Merge, see the docs over at: https://docs.microsoft.com/en-us/azur... And as always, get in touch with Advancing Analytics if you need help on your Lakehouse journey.
