Data Lakehouse (modern data stack) | Apache Iceberg, S3 MinIO, Trino, Spark, PostgreSQL

🚀 In this video, you'll see how to build a real data lake from scratch and understand why a data engineer needs Iceberg, Trino, MinIO, Spark, and PostgreSQL. I'll demonstrate everything on a live project: we'll connect analytics, set up storage in S3, create a metastore, and learn how to write and read data using SQL and PySpark.

Links:
IT Mentoring/Consulting – https://korsak0v.notion.site/Data-Eng...
TG Channel – https://t.me/DataLikeQWERTY
Instagram – / i__korsakov
Habr – https://habr.com/ru/users/k0rsakov/pu...
Project GitHub – https://github.com/k0rsakov/pet_proje...
Apache Iceberg Data Engineer Infrastructure – https://habr.com/ru/articles/850674/

🔻 What awaits you:
• What a Data Lake is and why it is needed in 2025 (in simple terms)
• How a Data Lake differs from a classic DWH
• What tasks Trino + Iceberg + S3 + Spark + PostgreSQL solve
• What a modern data engineer's infrastructure looks like (and how to set it up quickly)
• How Trino reads data from different sources
• How to create tables via SQL and view them in S3
• How the metastore works on PostgreSQL and why it is needed
• How to fill a Data Lake with external data via Apache Spark
• Hands-on: queries, schemas, table creation, reading via Spark and Trino
• Tips and life hacks for working with a Data Lake

Timecodes:
00:00 – Start
00:23 – What is a Data Lake
02:17 – Infrastructure overview
04:51 – Setting up a connection to the Data Lake
05:51 – Setting up a connection to OLTP
08:29 – First write to Data Lake Iceberg via Trino
13:29 – Writing data to Data Lake Iceberg via Spark (PySpark)
16:43 – Reading data from Data Lake Iceberg via Trino
17:03 – Reading data from Data Lake Iceberg via Spark (PySpark)
17:22 – Summary

#DataLake #Trino #Iceberg #S3 #MinIO #Spark #PostgreSQL #DataEngineering #BigData #ETL #SQL

🔥 Don't forget to like, subscribe to the channel, and turn on the bell so you don't miss new videos!
