18. How to Implement Upsert Using Copy Activity & MERGE in Azure Data Factory | Real-Time Project

Welcome to Session 18 of the Azure Data Factory Beginner Course ๐Ÿš€ -------------------------------------------------- ๐Ÿ“บ Azure Data Factory Beginner Playlist: Azure Data Factory โ€“ Beginner (From Scratch) -------------------------------------------------- In this session, you will learn how real-time enterprise projects implement advanced logic in Azure Data Factory using Copy Activity and Stored Procedure (MERGE). Many beginners directly copy data into final SQL tables, which works only for demos โ€” NOT for real projects. This session explains why production-grade pipelines always use a Stage Table and Stored Procedure-based MERGE logic instead of direct Copy. -------------------------------------------------- ๐Ÿ”น Topics Covered in This Video: -------------------------------------------------- Why direct Copy Activity to final table is not recommended Problems with direct copy in real projects What is a Stage Table and why it is required Clear responsibility of Copy Activity (data movement only) Why business logic should NOT be implemented in Copy Activity How Stored Procedure handles business logic How MERGE (UPSERT) works for Insert + Update scenarios Copy โ†’ Stage Table โ†’ Stored Procedure pipeline design Production-ready ADF pipeline pattern Real-time project explanation Interview-ready concepts Key takeaways -------------------------------------------------- Problems with Direct Copy to SQL Table: -------------------------------------------------- Duplicate records No update for existing data No history handling No business rule validation Hard to maintain pipelines Works for demo, fails in production -------------------------------------------------- Why Copy + Stored Procedure MERGE Is Powerful: -------------------------------------------------- Clean separation of concerns Safe insert and update handling Supports complex business logic Easy re-runs and troubleshooting Used in real enterprise projects -------------------------------------------------- Real-Time Use Cases: -------------------------------------------------- Loading data from REST / HTTP APIs Jira API data ingestion ServiceNow API data ingestion Salesforce API ingestion Daily incremental API loads Any API-based ingestion with updates -------------------------------------------------- ๐Ÿ”— Join Our Learning Community: -------------------------------------------------- WhatsApp Channel: https://whatsapp.com/channel/yourlink Telegram Channel: https://t.me/yourlink -------------------------------------------------- ๐ŸŽ“ Cloudpandith Training Institute โ€“ Azure Data Engineering -------------------------------------------------- Learn Azure Data Engineering from real-time industry experts with strong focus on practical implementation and live project scenarios. ๐ŸŒ Course Details & Syllabus: https://www.cloudpandith.com ๐Ÿงช Azure Data Engineer Demo: Every Saturday | 10:00 AM IST ๐Ÿ“ฒ WhatsApp for Course Details: https://wa.me/918904424822 ๐Ÿ“… Book 1-to-1 Discussion with Trainer: https://meetpro.club/mallaiahsomula ๐Ÿ“ About the Trainer: https://goo.gl/maps/9jGub6NfLH2jmVeGA ๐Ÿ“ž Contact: 8904424822 ------------------------------------ ๐Ÿ“š Free Practice Resources (Optional) ------------------------------------ MCQs are discussed in the video. If you want MCQs in document format (FREE), you may optionally join: WhatsApp Group: https://chat.whatsapp.com/Gow6Qxi77FN... Telegram Group: https://t.me/+PGnxzlaFpvo1MWJl Joining groups is optional. Everything required to understand Copy Activity + MERGE logic is already covered in the video. ๐Ÿ‘ Like | Share | Subscribe for more Azure Data Factory & Data Engineering sessions -------------------------------------------------- What You Will Learn Next (Session 19): -------------------------------------------------- Incremental Load using Watermark + MERGE in Azure Data Factory (Real-Time Pattern) CREATE TABLE [dbo].[orders]( [order_id] [bigint] NULL, [customer_id] [varchar](255) NULL, [customer_name] [varchar](255) NULL, [order_date] [varchar](255) NULL, [updated_date] [varchar](255) NULL, [amount] [float] NULL, [source_system] [varchar](255) NULL ) { "order_id": 5, "customer_id": "1005", "customer_name": "RAMA Karrthikeyan", "order_date": "2026-01-02T15:00:00Z", "updated_date": "2026-01-02T15:00:00Z", "amount": 450.75, "source_system": "orders_app" } create proc Orders_upsert_usp as begin MERGE INTO orders AS tgt USING stg_order AS src ON tgt.order_id = src.order_id WHEN MATCHED THEN UPDATE SET tgt.customer_id = src.customer_id, tgt.customer_name = src.customer_name, tgt.order_date = src.order_date, tgt.updated_date = src.updated_date, tgt.amount = src.amount, tgt.source_system = src.source_system WHEN NOT MATCHED BY TARGET THEN INSERT VALUES ( src.order_id, src.customer_id, src.customer_name, src.order_date, src.updated_date, src.amount, src.source_system ); end