AWS re:Invent 2023 - Netflix’s journey to an Apache Iceberg–only data lake (NFX306)

Netflix operates a data lake of approximately one exabyte. Despite this, a portion of data (about 300 petabytes) remained in the legacy Apache Hive table format. Motivated by the well-known benefits Apache Iceberg provides, such as time travel and schema evolution, Netflix fully phased out Hive and transitioned existing data to Iceberg. In this session, learn how Netflix managed this task at the appropriate scale with custom tooling and how they developed unique in-house features like secure Iceberg tables and the Iceberg REST catalog. Learn about Netflix’s journey from a Hive-based to an Iceberg-only data warehouse and how Netflix overcame the challenges that arose with the transition. Learn more about AWS re:Invent at https://go.aws/46iuzGv.

Flink and Iceberg: A Powerful Duo for Modern Data Lakes

Apache Iceberg: What It Is and Why Everyone’s Talking About It.

AWS re:Invent 2023 - Advanced data modeling with Amazon DynamoDB (DAT410)

AWS Tech & AI Essentials — A Free 5-Day Beginner Cohort - Day 4 of 5

Introduction to Data Mesh with Zhamak Dehghani

AWS re:Invent 2023 - Dive deep into Amazon DynamoDB (DAT330)

Apache Iceberg - A Table Format for Huge Analytic Datasets

AWS re:Invent 2023 - Meet digital sovereignty needs with AWS Dedicated Local Zones (WPS214)

An Extremely Technical Overview of How Apache Iceberg Planning Actually Works (Russell Spitzer)

AWS re:Invent 2022 - The evolution of chaos engineering at Netflix (NFX303)

Apache Iceberg Deep Dive | Part 1 | Crash Course

AWS re:Invent 2018: [NEW LAUNCH!] Introducing AWS Transit Gateway (NET331)

Learn Snowflake in 2 Hours| High Paying Skills | Step by Step For Beginners

AWS re:Invent 2024 - [NEW LAUNCH] Store tabular data at scale with Amazon S3 Tables (STG367-NEW)

AWS re:Invent 2023 - What’s new in AWS Lake Formation (ANT303)

AWS re:Invent 2023 - Data modeling core concepts for Amazon DynamoDB (DAT329)

7 Best Practices for Implementing Apache Iceberg

AWS re:Invent 2023 - Optimizing storage price and performance with Amazon S3 (STG211)

AWS re:Invent 2022 - Building and operating a data lake on Amazon S3 (STG302)

What is Apache Iceberg?