From Query Plan to Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab

The SQL tab in the Spark UI provides a wealth of information for analysing your Spark queries, from the query plan to all the associated statistics. However, many new Spark practitioners are overwhelmed by the information presented and have trouble using it to their benefit. In this talk we give a gentle introduction to reading the SQL tab. We first go over the common Spark operations, such as scans, projects, filters, aggregations and joins, and how they relate to the Spark code you write. In the second part of the talk we show how to read the associated statistics to pinpoint performance bottlenecks. After attending this session you will have a better grasp of query plans and the SQL tab, and will be able to use this knowledge to improve the performance of your Spark queries.
