Migrations: Fix RetrustForeignKeyAndCheckConstraints failing when data violates a constraint #22488
Pull request overview
This PR updates the RetrustForeignKeyAndCheckConstraints upgrade migration (introduced in 17.3) to avoid upgrade crashes when re-trusting constraints fails due to existing data violating a FK/CK constraint, by executing the validation on a separate database connection instead of within the migration scope’s transaction.
Changes:
- Inject IUmbracoDatabaseFactory into the migration to create isolated database connections for ALTER TABLE ... WITH CHECK CHECK CONSTRAINT.
- Execute each constraint re-trust attempt using a separate IUmbracoDatabase instance (no explicit transaction), logging warnings for failures instead of crashing the upgrade.
PR Review
Target: Fixes
Critical
Important
Suggestions
Request Changes
The breaking constructor change must be addressed with the Obsolete + StaticServiceProvider pattern before merge. The |
Zeegaan left a comment
Looks good from a code perspective, thank you for the detailed testing steps 💪
…ata violates a constraint (#22488)
* Fix exception handling in RetrustForeignKeyAndCheckConstraints migration step.
* Addressed code review feedback.
Description
This fixes the RetrustForeignKeyAndCheckConstraints migration step, which has been shown to break the upgrade when existing data violates a foreign key or check constraint.
The migration was introduced in 17.3 to re-trust untrusted constraints (where is_not_trusted = 1) so the SQL Server query optimizer can use them for join elimination and cardinality estimation. When all constraints pass validation this works fine, but when a constraint fails validation (e.g. orphaned FK rows from a previous bug), the upgrade crashes.
Root cause
The original code executed ALTER TABLE ... WITH CHECK CHECK CONSTRAINT on the migration scope's database connection, which has an active .NET SqlTransaction. Although a SQL-level TRY...CATCH was in place, when validation fails the constraint violation error zombies the SqlTransaction at the TDS (Tabular Data Stream) protocol layer: the transaction state change propagates to the .NET SqlClient before T-SQL error handling can contain it. Once the transaction is zombied:
- context.Complete() (which writes the migration state via IKeyValueService) fails because it uses the same dead transaction.
- scope.Complete() is never reached.
- The upgrade fails with a "This SqlTransaction has completed" exception.
Fix
Each ALTER TABLE ... WITH CHECK CHECK CONSTRAINT now executes on a separate database connection created via IUmbracoDatabaseFactory. This connection operates in SQL Server's autocommit mode (no explicit SqlTransaction), which means:
- TRY...CATCH fully contains the error; no SqlException is thrown to .NET.
- context.Complete() and scope.Complete() succeed.
Testing
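For reference while testing, the re-trust statement described under Fix has roughly the following shape. This is only a sketch: the table and constraint names are hypothetical, and the exact T-SQL emitted by the migration is not shown in this description.

```sql
-- Sketch only: dbo.demoChild and FK_demoChild_demoParent are hypothetical names.
-- Intended to run in autocommit mode, i.e. with no surrounding explicit transaction.
BEGIN TRY
    -- WITH CHECK validates existing rows; on success the constraint becomes trusted.
    ALTER TABLE dbo.demoChild WITH CHECK CHECK CONSTRAINT FK_demoChild_demoParent;
END TRY
BEGIN CATCH
    -- A validation failure is a run-time error, so it is reported here
    -- and the constraint stays untrusted.
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

Running this by hand in a query window against a constraint with violating rows should return the error as a result set rather than aborting the batch.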
Setup (create a constraint violation in the database before upgrading):
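The setup script itself is not preserved here. As an illustration of how an untrusted constraint with violating data can arise, the following sketch uses hypothetical scratch tables (not part of the Umbraco schema). Whether the migration considers tables outside the Umbraco schema is not stated in this description, so treat this as a demonstration of the mechanism, ideally on a disposable copy of the database.

```sql
-- Hypothetical scratch tables for illustration only.
CREATE TABLE dbo.demoParent (id INT NOT NULL PRIMARY KEY);
CREATE TABLE dbo.demoChild (
    id INT NOT NULL PRIMARY KEY,
    parentId INT NOT NULL,
    CONSTRAINT FK_demoChild_demoParent
        FOREIGN KEY (parentId) REFERENCES dbo.demoParent (id)
);

-- Disable the FK so an orphaned row can be inserted.
ALTER TABLE dbo.demoChild NOCHECK CONSTRAINT FK_demoChild_demoParent;
INSERT INTO dbo.demoChild (id, parentId) VALUES (1, 999);

-- Re-enable without validating existing rows: the constraint is enforced
-- for new rows again but remains untrusted (is_not_trusted = 1).
ALTER TABLE dbo.demoChild CHECK CONSTRAINT FK_demoChild_demoParent;
```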
Test 1 (upgrade completes without crashing):
Revert the database to the migration step prior to the RetrustForeignKeyAndCheckConstraints step:
The log should then contain "Constraint re-trust complete: X succeeded, Y failed out of Z total." Expect at least one failure.
Test 2 (clean constraints are re-trusted):
Check which constraints are still untrusted after upgrade. It is expected that only the intentionally violated FK remains untrusted.
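The query used for this check is not preserved here. A minimal sketch, assuming only the standard SQL Server catalog views sys.foreign_keys and sys.check_constraints, could list the remaining untrusted constraints like this:

```sql
-- Lists every foreign key and check constraint still marked untrusted.
SELECT OBJECT_SCHEMA_NAME(parent_object_id) AS SchemaName,
       OBJECT_NAME(parent_object_id)        AS TableName,
       name                                 AS ConstraintName,
       'FK'                                 AS ConstraintType
FROM sys.foreign_keys
WHERE is_not_trusted = 1
UNION ALL
SELECT OBJECT_SCHEMA_NAME(parent_object_id),
       OBJECT_NAME(parent_object_id),
       name,
       'CK'
FROM sys.check_constraints
WHERE is_not_trusted = 1
ORDER BY TableName, ConstraintName;
```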
Remove the orphaned data:
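The cleanup statement is not preserved here. Assuming the orphaned rows are child rows whose parent no longer exists, a sketch with hypothetical table and column names might look like this:

```sql
-- Hypothetical names: dbo.demoChild.parentId references dbo.demoParent.id.
-- Deletes child rows that have no matching parent row.
DELETE c
FROM dbo.demoChild AS c
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.demoParent AS p
    WHERE p.id = c.parentId
);
```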
Revert the database again to the migration step prior to the RetrustForeignKeyAndCheckConstraints step:
Start up the application again and confirm the migration succeeds.
You should see an information log message of:
Verify that the constraint is now trusted (expect is_not_trusted = 0).
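One way to perform this final check is to look up the constraint by name. The name below is hypothetical; substitute the foreign key that was intentionally violated during setup.

```sql
-- FK_demoChild_demoParent is a hypothetical constraint name.
SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE name = N'FK_demoChild_demoParent';
```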