Programmable data quality with Dataplex and generative AI

Codelab → https://goo.gle/4nInEzb
GitHub → https://goo.gle/3W6TKJc
Blog Post → https://goo.gle/46GkqFK

Manual data quality rule creation is often time-consuming and inconsistent. In this video, we demonstrate a practical, programmatic workflow on Google Cloud that automates data quality rule generation and deployment using Dataplex and generative AI, so you can reduce manual effort and build more reliable data pipelines. This session provides a robust pattern for building scalable data quality systems and managing your data governance policies as code.

Here’s what we’ll cover:
1️⃣ Flattening complex data: See how to use Materialized Views in BigQuery to prepare nested data for comprehensive Dataplex profiling.
2️⃣ Automated profiling: Learn to programmatically trigger and manage Dataplex data profile scans using the Python client.
3️⃣ AI-powered rule suggestion: Discover how to export profile results and use the Gemini CLI to suggest Dataplex-compliant data quality rules in YAML format.
4️⃣ Human-in-the-Loop validation: Understand why human review is critical for validating and refining AI-generated configurations before deployment.
5️⃣ Deploying quality scans: Walk through deploying the validated rules as automated Dataplex data quality scans.

🔔 Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech

Speaker: Hyunuk Lim
Products Mentioned: Dataplex Universal Catalog, BigQuery,
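
To make the human-in-the-loop step more concrete, here is a minimal sketch of a pre-review check that flags obviously malformed AI-suggested rules before a person looks at them. It is not taken from the video or the codelab. It assumes the suggested YAML has already been parsed into Python dictionaries, and the rule type and dimension names below are an assumed subset of the Dataplex rule specification that should be checked against the current documentation. The helper names `lint_rule` and `lint_rules` and the sample rules are hypothetical.

```python
# Assumed subset of Dataplex data quality rule types and dimensions.
# Verify these names against the current Dataplex documentation.
COLUMN_RULE_TYPES = {
    "nonNullExpectation", "rangeExpectation", "regexExpectation",
    "setExpectation", "uniquenessExpectation", "statisticRangeExpectation",
}
TABLE_RULE_TYPES = {
    "rowConditionExpectation", "tableConditionExpectation", "sqlAssertion",
}
RULE_TYPES = COLUMN_RULE_TYPES | TABLE_RULE_TYPES
DIMENSIONS = {
    "COMPLETENESS", "ACCURACY", "CONSISTENCY", "VALIDITY",
    "UNIQUENESS", "FRESHNESS", "VOLUME",
}


def lint_rule(rule: dict) -> list[str]:
    """Return a list of human-readable problems found in one suggested rule."""
    problems = []

    # Each rule should declare exactly one rule type.
    present = [key for key in rule if key in RULE_TYPES]
    if len(present) != 1:
        problems.append(f"expected exactly one rule type, found {len(present)}")

    if rule.get("dimension") not in DIMENSIONS:
        problems.append(f"unknown dimension: {rule.get('dimension')!r}")

    # Column-level rule types need a target column.
    if len(present) == 1 and present[0] in COLUMN_RULE_TYPES and not rule.get("column"):
        problems.append(f"missing column for {present[0]}")

    # A threshold, when given, is a passing ratio between 0 and 1.
    threshold = rule.get("threshold")
    if threshold is not None:
        is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
        if not (is_number and 0 <= threshold <= 1):
            problems.append(f"threshold must be between 0 and 1, got {threshold!r}")

    return problems


def lint_rules(rules: list[dict]) -> dict[int, list[str]]:
    """Map the index of each problematic rule to its list of problems."""
    report = {}
    for index, rule in enumerate(rules):
        problems = lint_rule(rule)
        if problems:
            report[index] = problems
    return report


# Hypothetical AI-suggested rules, as they might look after parsing the YAML.
suggested = [
    {"column": "order_id", "dimension": "COMPLETENESS",
     "nonNullExpectation": {}, "threshold": 1.0},
    {"column": "amount", "dimension": "VALIDITY",
     "rangeExpectation": {"minValue": "0"}, "threshold": 1.5},
    {"dimension": "PRECISION", "uniquenessExpectation": {}},
]

report = lint_rules(suggested)
for index, problems in report.items():
    print(f"rule {index}: {'; '.join(problems)}")
```

A check like this only catches structural mistakes. Whether a suggested range, regex, or threshold actually matches the business meaning of the data still needs a human reviewer.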