Accelerating Data Ingestion with Databricks Autoloader
Tracking which incoming files have been processed has always required careful thought and design when implementing an ETL framework. The Autoloader feature of Databricks aims to simplify this, taking away the pain of file watching and queue management. However, there can be a lot of nuance and complexity in setting up Autoloader and managing the process of ingesting data with it. After implementing an automated data loading process in a major US CPMG, Simon has some lessons to share from the experience.

This session will run through the initial setup and configuration of Autoloader in a Microsoft Azure environment, looking at the components used and what is created behind the scenes. We'll then look at some of the limitations of the feature, before walking through the process of overcoming them. We will build out a practical example that tackles evolving schemas, applies transformations to your stream, extracts telemetry from the process and, finally, merges the incoming data into a Delta table.

After this session you will be better equipped to use Autoloader in a data ingestion platform, simplifying your production workloads and accelerating the time to realise value in your data!


