A Deep Dive into the Catalyst Optimizer (Herman van Hovell)
Catalyst is becoming one of the most important components in Apache Spark, as it underpins all the major new APIs in Spark 2.0, from DataFrames and Datasets to streaming. At its core, Catalyst is a general library for manipulating trees. Based on this library, we have built a modular compiler frontend for Spark, including a query analyzer, an optimizer, and an execution planner. In this talk, I will introduce the core concepts of Catalyst by working through a few examples. I will also show how new and upcoming features are implemented using Catalyst. The audience will walk away with a deeper understanding of how Spark analyzes, optimizes, and plans a user’s query.
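
To make the "library for manipulating trees" idea concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not Catalyst's actual API: Catalyst itself is written in Scala, and the names `Expr`, `Literal`, `Attribute`, `Add`, `transform_up`, and `fold_constants` are hypothetical, chosen only for illustration. The sketch shows a single rewrite rule (constant folding) applied bottom-up to an expression tree, which is roughly the shape of work an optimizer rule performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Base class for nodes of a toy, immutable expression tree."""

    def children(self):
        return ()

    def with_children(self, new_children):
        # Leaf nodes have no children, so there is nothing to rebuild.
        return self

    def transform_up(self, rule):
        # Rewrite the children first, then offer the rebuilt node to the rule.
        new_children = tuple(c.transform_up(rule) for c in self.children())
        node = self.with_children(new_children)
        replaced = rule(node)
        return node if replaced is None else replaced


@dataclass(frozen=True)
class Literal(Expr):
    value: int


@dataclass(frozen=True)
class Attribute(Expr):
    name: str


@dataclass(frozen=True)
class Add(Expr):
    left: Expr
    right: Expr

    def children(self):
        return (self.left, self.right)

    def with_children(self, new_children):
        return Add(*new_children)


def fold_constants(node):
    # Rule: Add(Literal, Literal) -> Literal.
    # Returning None means "this rule does not apply to this node".
    if (
        isinstance(node, Add)
        and isinstance(node.left, Literal)
        and isinstance(node.right, Literal)
    ):
        return Literal(node.left.value + node.right.value)
    return None


# Toy tree for the expression x + (1 + 2)
tree = Add(Attribute("x"), Add(Literal(1), Literal(2)))
optimized = tree.transform_up(fold_constants)
print(optimized)
```

In this sketch the inner `1 + 2` is folded into a single literal while the reference to `x` is left untouched, so the result is equivalent to `x + 3`. Keeping the nodes immutable and returning new trees from each rule is a design choice of the toy that keeps rules easy to compose and reason about.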

