HDDS-13905. Bootstrap lock acquired in background services can lead to deadlock#9273
HDDS-13905. Bootstrap lock acquired in background services can lead to deadlock#9273jojochuang merged 4 commits intoapache:masterfrom
Conversation
…o deadlock Change-Id: I0e8a540848cbaddc1e476ccb537dca18134251d1
|
@adoroszlai this is the fix for HDDS-13889 |
Great, please run |
Please wait for clean CI run in fork before opening PR. |
Change-Id: I09e75187190855c430d4d5848d313f6f3ba87a66
Ran it 10x10 times all of them have passed https://github.com/swamirishi/ozone/actions/runs/19223007061 |
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a deadlock issue where bootstrap lock acquisition was happening after snapshots were opened, causing background services to deadlock with the bootstrap flow. The fix ensures that all background services acquire the bootstrap lock before opening any snapshots, preventing the deadlock condition.
Key Changes
- Replaced Semaphore-based locking with ReadWriteLock in
BootstrapStateHandler.Lockto allow concurrent read access for background services while bootstrap takes an exclusive write lock - Introduced
DeletingServiceTaskQueueinAbstractKeyDeletingServicethat wraps all background tasks to automatically acquire read locks before execution - Updated
SstFilteringServiceto acquire bootstrap lock before opening snapshots instead of after
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
BootstrapStateHandler.java |
Core locking mechanism changed from Semaphore to ReadWriteLock with separate read/write lock acquisition methods |
AbstractKeyDeletingService.java |
Added DeletingServiceTaskQueue wrapper class that automatically acquires read lock before task execution |
KeyDeletingService.java |
Updated to use DeletingServiceTaskQueue and removed bootstrap lock acquisition from individual operations |
DirectoryDeletingService.java |
Updated to use DeletingServiceTaskQueue and removed manual bootstrap lock handling |
SnapshotDeletingService.java |
Updated to use DeletingServiceTaskQueue and removed manual bootstrap lock handling |
SstFilteringService.java |
Moved bootstrap lock acquisition to before snapshot opening instead of during SST file operations |
RocksDBCheckpointDiffer.java |
Updated lock acquisition calls to use new acquireReadLock() API |
DBCheckpointServlet.java |
Updated Lock implementation to return UncheckedAutoCloseable instead of self-reference |
OMDBCheckpointServlet.java |
Updated Lock to aggregate multiple locks and return composite closeable |
OMDBCheckpointServletInodeBasedXfer.java |
Updated to use new acquireWriteLock() API |
| Test files | Updated test code to use new lock API and removed flaky test annotation from previously deadlocking test |
Comments suppressed due to low confidence (3)
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:1266
- This catch block is unreachable because
acquireReadLock()doesn't actually throwInterruptedException(the underlyinglock.lock()method is not interruptible). Either switch tolockInterruptibly()in the base Lock implementation, or remove this catch block.
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:1105
- This catch block is unreachable because
acquireReadLock()doesn't actually throwInterruptedException(the underlyinglock.lock()method is not interruptible). Either switch tolockInterruptibly()in the base Lock implementation, or remove this catch block.
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMDBCheckpointServlet.java:664
- Lock has the same name as its supertype org.apache.hadoop.ozone.lock.BootstrapStateHandler$Lock.
static class Lock extends BootstrapStateHandler.Lock {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class Lock implements AutoCloseable { | ||
| private final Semaphore semaphore = new Semaphore(1); | ||
| /** Bootstrap state handler lock implementation. Should be always acquired before opening any snapshot to avoid | ||
| * deadlocks*/ |
There was a problem hiding this comment.
Missing space after period in the comment. Should be "Should be always acquired before opening any snapshot to avoid deadlocks."
| * deadlocks*/ | |
| * deadlocks */ |
| * is wrapped such that its execution acquires a read lock via | ||
| * {@code getBootstrapStateLock().acquireReadLock()} before performing any | ||
| * operations. The lock is automatically released upon task completion or | ||
| * exception, ensuring safe concurrent execution of tasks within the service when running along with bootstrap flow. |
There was a problem hiding this comment.
[nitpick] The Javadoc comment should end with "ensures safe concurrent execution of tasks within the service when running alongside bootstrap flow." (changed "along with" to "alongside" for better grammar).
| * exception, ensuring safe concurrent execution of tasks within the service when running along with bootstrap flow. | |
| * exception, ensuring safe concurrent execution of tasks within the service when running alongside bootstrap flow. |
| public UncheckedAutoCloseable acquireWriteLock() throws InterruptedException { | ||
| return lock(false); | ||
| } | ||
|
|
||
| @Override | ||
| public void close() { | ||
| unlock(); | ||
| public UncheckedAutoCloseable acquireReadLock() throws InterruptedException { | ||
| return lock(true); |
There was a problem hiding this comment.
The methods acquireWriteLock() and acquireReadLock() declare throws InterruptedException, but they cannot actually throw this exception. The underlying lock.lock() call (line 35) is a blocking operation that doesn't throw InterruptedException.
If you need interruptible locking, use lock.lockInterruptibly() instead. Otherwise, remove the throws InterruptedException declaration from both methods.
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/DBCheckpointServlet.java
Outdated
Show resolved
Hide resolved
jojochuang
left a comment
There was a problem hiding this comment.
My understanding of the change, is that previously, the bootstrap lock is held when submitting internal requests (directory delete, key delete). That is moved to much later and bootstrap lock is held when the requests are being executed.
|
|
||
| // Submit Purge paths request to OM. Acquire bootstrap lock when processing deletes for snapshots. | ||
| try (BootstrapStateHandler.Lock lock = snapTableKey != null ? getBootstrapStateLock().lock() : null) { | ||
| try { |
| .setPurgeKeysRequest(requestBuilder.build()).setClientId(getClientId().toString()).build(); | ||
|
|
||
| try (Lock lock = snapTableKey != null ? getBootstrapStateLock().lock() : null) { | ||
| try { |
| .setClientId(clientId.toString()) | ||
| .build(); | ||
|
|
||
| try (BootstrapStateHandler.Lock lock = getBootstrapStateLock().lock()) { |
| .setSnapshotMoveTableKeysRequest(moveDeletedKeys) | ||
| .setClientId(clientId.toString()) | ||
| .build(); | ||
| try (BootstrapStateHandler.Lock lock = getBootstrapStateLock().lock()) { |
|
|
||
| try ( | ||
| UncheckedAutoCloseableSupplier<OmSnapshot> snapshotMetadataReader = | ||
| // Acquire bootstrap lock before opening any snapshot. |
There was a problem hiding this comment.
moved the bootstrap lock earlier
| lock.lock(); | ||
| return lock::unlock; |
There was a problem hiding this comment.
This line returns a method reference to unlock the acquired lock. Specifically, it creates an UncheckedAutoCloseable instance that, when closed, calls unlock() on the same lock, releasing it. This enables usage of the lock in a try-with-resources style, ensuring the lock will be released when no longer needed.
Change-Id: Ib5a872948e090cbf1f3cc51dd3a92f159429e835 # Conflicts: # hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/SstFilteringService.java
Change-Id: Icc3420a005fe68db59aeb9ede68a7f6cc4e22831
|
Merged. Thanks @swamirishi |
What changes were proposed in this pull request?
Currently bootstrap lock is acquired after the snapshot is already opened thus this can lead to a deadlock condition during bootstrap where the Bootstrap flow has already acquired a bootstrap lock and is waiting on snapshot cache lock to be acquired which cannot be acquired since the snapshots are still open. All background services acquire bootstrap lock while still having a snapshot open which is going to cause a deadlock here.
To fix this all background services should always acquire bootstrap lock before opening a snapshot. The only con to this is that the entire task of background service would be blocked when the bootstrap copy batch is running on the leader om which should be ok since bootstrap would be an infrequent operation.
However one possible improvement would be to just to create a hardlink(which we already do to ensure this file doesn't get deleted by rocksdb operations but we also write the file into the Tarball stream synchronously
ozone/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/Archiver.java
Lines 135 to 155 in 96390ac
The above can be done as part of a separate patch @sadanand48
https://issues.apache.org/jira/browse/HDDS-13906
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13905
How was this patch tested?
Existing unit tests which was stuck because of the deadlock and marked flaky because of 272544aa95c66d0743f9ac94b2f4d586b325b8a7