Managing Apache Airflow at Scale
Session presented by John Jackson at Airflow Summit 2022.

In this session we'll discuss the considerations and challenges of running Apache Airflow at scale. We'll start by defining what it means to run Airflow at scale. Then we'll dive deep into the limitations of the Airflow architecture, the Scheduler processes, and the configuration options. Next we'll cover scaling workloads with containers and with pools and priority, followed by scaling DAGs through dynamic DAGs and DAG factories, CI/CD, and DAG access control. Finally we'll get into managing multiple Airflow environments: how to split up workloads and how to provide central governance for Airflow environment creation and monitoring, with an example of distributing workloads across environments.


