-
Notifications
You must be signed in to change notification settings - Fork 4.8k
Open
Labels
area/connectorsConnector related issuesConnector related issuesautoteamcommunityneeds-triageteam/extensibilityteam/usetype/bugSomething isn't workingSomething isn't working
Description
Connector Name
destination-mssql
Connector Version
2.2.12
What step the error happened?
During the sync
Relevant information
The MSSQL bulk loader is creating duplicate records in destination tables even when using "Incremental | Append + Dedup" mode. The issue occurs specifically with bulk insert operations and results in multiple rows with identical _airbyte_raw_id values.
- Expected Behavior:
- Records with the same _airbyte_raw_id should be de-duplicated
- Only one record per _airbyte_raw_id should exist in the destination table
- The MERGE statement should properly handle de-duplication based on _airbyte_raw_id
Actual Behavior: - Multiple records with identical _airbyte_raw_id values are inserted
- The logs show "Deduplication completed for temp table" and "MERGE completed successfully" but duplicates still exist
- This only occurs with bulk insert mode; insert mode works correctly
Root Cause:
The issue is in MSSQLBulkLoader.kt in the handleDedup method (lines 48-49). The de-duplication logic uses the business primary key (Id) instead of _airbyte_raw_id as the deduplication key:
val primaryKey =
if (importType.primaryKey.isNotEmpty()) {
importType.primaryKey.flatten() // Uses business PK (e.g., "Id")
} else {
importType.cursor
}
This means de-duplication happens based on business keys + cursor, not on _airbyte_raw_id, which is Airbyte's internal unique identifier.

Relevant log output
Successful De-duplication Messages (Contradictory):
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoadHandler(bulkLoadAndUpsertForDedup):56 Starting bulk insert into table: Airbyte.Campaign from file: blob/Campaign/2025/09/04/1756657454230/0
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoadHandler(bulkLoadAndUpsertForDedup):67 Deduplication completed for temp table
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoadHandler(bulkLoadAndUpsertForDedup):78 MERGE completed successfully into table: Airbyte.Campaign
Stream Configuration (Showing Dedup Mode):
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoader(handleDedup):48 Deduplicating blob/Campaign/2025/09/04/1756657454230/0 into table for Descriptor(namespace=null, name=Campaign)
Bulk Load Process (Showing Bulk Insert Mode):
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoader(handleAppendOverwrite):83 Loading blob/Campaign/2025/09/04/1756657454230/0 into table for Descriptor(namespace=null, name=Campaign)
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoadHandler(bulkLoadForAppendOverwrite):56 Starting bulk insert into table: Airbyte.Campaign from file: blob/Campaign/2025/09/04/1756657454230/0
Stream Status Updates (Showing Incremental Mode):
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoader(handleDedup):48 Stream status TRACE received of status: COMPLETE for stream edw_crm:Campaign
2025-09-04T06:34:25.4306142Z INFO DefaultDispatcher-worker-2 i.a.i.d.m.v.MSSQLBulkLoader(handleDedup):48 Stream Status Update Received: edw_crm:Campaign - COMPLETE
Contribute
- Yes, I want to contribute
Metadata
Metadata
Assignees
Labels
area/connectorsConnector related issuesConnector related issuesautoteamcommunityneeds-triageteam/extensibilityteam/usetype/bugSomething isn't workingSomething isn't working