Datalake Rock Paper Scissors: Iceberg + Flink or Iceberg + Spark? | Current 2023
Bloomberg uses Apache Kafka® and Apache Iceberg® as core elements in their real-time data pipelines and storage sinks. In this talk, Sitarama Chekuri and Ben de Vera share their lessons learned testing both Apache Flink® and Apache Spark® to ingest data from Kafka into their Iceberg datalake at near-real-time speeds. They compare and contrast the two technologies with regard to functionality, performance, fault-tolerance, scaling, and resource utilization.

CHAPTERS
00:00 - Intro
01:06 - Context on Bloomberg and speakers
03:14 - Motivation
05:41 - Technology overview
16:34 - Performance comparison
32:13 - Scale to multiple applications
35:10 - Summary

Speakers: Sitarama Chekuri and Ben de Vera

--

ABOUT CONFLUENT
Confluent is pioneering a fundamentally new category of data infrastructure focused on data in motion. Confluent’s cloud-native offering is the foundational platform for data in motion – designed to be the intelligent connective tissue enabling real-time data, from multiple sources, to constantly stream across the organization. With Confluent, organizations can meet the new business imperative of delivering rich, digital front-end customer experiences and transitioning to sophisticated, real-time, software-driven backend operations. To learn more, please visit www.confluent.io.

#current2023 #apachekafka #kafka #confluent
