Best Practices for Unit Testing PySpark
This talk covers best practices for unit testing PySpark code. Unit tests help you reduce production bugs and make your codebase easier to refactor. You will learn how to create PySpark unit tests that run locally and in CI via GitHub Actions, and how to structure PySpark code so it’s easy to unit test. You’ll also see how to run integration tests on a cluster against staging datasets. Integration tests provide an additional level of safety.

Talk by: Matthew Powers, Staff Developer Advocate, Databricks

Here’s more to explore:
Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV
The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Connect with us:
Website: https://databricks.com


