The Missing Piece in Many Data Pipelines
All my FREE resources: https://www.skool.com/moderndata/about
Consulting Services: https://go.kahandatasolutions.com

-----

All data teams (large & small) have at least one thing in common: source data. But not everyone handles it the same way in their pipelines.

Some reference raw source tables directly in many queries. Others create ad-hoc custom tables to address subtle formatting changes, but without any real overarching strategy or consistent naming behind them.

While data modeling (ex. Kimball, One Big Table, etc.) is a more popular topic, I believe an equally important area to consider is what you do BEFORE you start creating those core data models. For many, this "before" layer doesn't exist at all.

In previous videos I've talked about a 3-Layered Data Model. Today I want to focus solely on Layer 1, which addresses this concept. It's called a "Staging" layer. When done right, it can help you establish reliable pipelines from the very start.

Timestamps:
00:00 - Intro
00:52 - What is a Staging Layer?
03:23 - Reason #1: Modularity
05:03 - Reason #2: Consistency
07:21 - Reason #3: Clarity

Title & Tags:
The Missing Piece in Many Data Pipelines
#kahandatasolutions #dataengineering #datamodeling

