Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses SQL Server cross-row contention in Umbraco’s distributed locking by forcing row-level locks on the umbracoLock table, and adds integration tests intended to reproduce/validate the contention scenario described in #22113.
Changes:
- Add
ROWLOCKhint to SQL Server distributed locking SQL (NPoco + EF Core) to avoid page-lock contention onumbracoLock. - Add new SQL Server-only integration tests that simulate/force page locks vs row locks across
ContentTreeandDistributedJobslock IDs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/Umbraco.Tests.Integration/Umbraco.Infrastructure/Persistence/LocksTests.cs | Adds new long-running concurrency/locking tests to demonstrate page-lock contention and validate ROWLOCK behavior. |
| src/Umbraco.Cms.Persistence.SqlServer/Services/SqlServerDistributedLockingMechanism.cs | Updates SQL Server NPoco locking queries to include ROWLOCK. |
| src/Umbraco.Cms.Persistence.EFCore/Locking/SqlServerEFCoreDistributedLockingMechanism.cs | Updates SQL Server EF Core locking queries to include ROWLOCK. |
You can also share your feedback on Copilot code review. Take the survey.
tests/Umbraco.Tests.Integration/Umbraco.Infrastructure/Persistence/LocksTests.cs
Outdated
Show resolved
Hide resolved
tests/Umbraco.Tests.Integration/Umbraco.Infrastructure/Persistence/LocksTests.cs
Outdated
Show resolved
Hide resolved
tests/Umbraco.Tests.Integration/Umbraco.Infrastructure/Persistence/LocksTests.cs
Outdated
Show resolved
Hide resolved
88db541 to
4f53251
Compare
Migaroez
left a comment
There was a problem hiding this comment.
This seems like a sensible change and it for sure enforces the design of the data. Only the Contexts (each represented by a single entry in the table) that are relevant to an operation should be locked, the others should be left alone.
I doubt that this will make the locking issues we will most likely reduce the their occurance.
Description
Follows investigation into: #22113
This PR adds a
ROWLOCKtable hint to SQL Server distributed locking queries (both NPoco and EF Core) to prevent SQL Server from choosing page-level lock granularity on the smallumbracoLocktable.Analysis
The
umbracoLocktable has ~18 rows which all fit on a single 8KB SQL Server data page. The existing locking SQL usesWITH (REPEATABLEREAD)but noROWLOCKhint, leaving lock granularity up to SQL Server's query optimizer.From what I understand, it's possible for SQL Server to choose a page or table lock here, which could lead problems similar to that described in the reported issue.
For example, when a long-running content operation (e.g.,
EmptyRecycleBin) holds a lock onContentTree(-333), SQL Server may use a page-level lock that also covers theDistributedJobsrow (-347). Other servers pollingTryTakeRunnableAsyncevery 5 seconds then fail to acquireWriteLock(-347), producingDistributedWriteLockTimeoutException("Failed to acquire write lock for id: -347").Adding
ROWLOCKensures SQL Server locks only the specific row being accessed, so locks on different IDs never contend with each other regardless of table size or optimizer decisions.Testing
I've used some local integration tests to verify that if a page lock is explicitly requested, then we do see the contention between the row locks. But as this isn't deterministic it's not possible to demonstrate conclusively that this problem occurs and is resolved by this update.
My suggestion is that I can't see any harm in doing this. We definitely want row level locking on this table, as that's the point of being able to lock certain aggregate roots and not others. So being explicit could be useful here in some where lock escalation to a page or table level could otherwise occur.