How unsupervised machine learning can scale data quality monitoring in Databricks

Technologies like Databricks Delta Lake and Databricks SQL enable enterprises to store and query their data. But existing rule-based and metric-based approaches to monitoring the quality of this data are tedious to set up and maintain, fail to catch unexpected issues, and generate false-positive alerts that lead to alert fatigue. In this talk, Jeremy will describe a set of fully unsupervised machine learning algorithms for monitoring data quality at scale in Databricks. He will cover how the algorithms work, their strengths and weaknesses, and how they are tested and calibrated. Participants will leave with an understanding of unsupervised data quality monitoring and how to begin using it to monitor data in Databricks. Connect with us: https://databricks.com
