Repartition vs Coalesce in Apache Spark | Rock the JVM

Written version: https://blog.rockthejvm.com/repartiti...

This video is for the Spark programmer who knows the basics and is ready to dive a little deeper. We'll talk about a fundamental distinction in Spark between the two ways of redistributing data across partitions: repartition and coalesce. We'll see how they're similar and how they're different, run a small performance test, and then explain the results.

This is one of the many techniques we talk about in the Spark Optimization series at Rock the JVM.

Follow Rock the JVM on:
LinkedIn: / rockthejvm
Twitter: / rockthejvm
Blog: https://rockthejvm.com/blog

Home: https://rockthejvm.com
