HDDS-10138. NPE for SstFilteringService in OMDBCheckpointServlet.Lock #6015
|
@GeorgeJahad @ChenSammi @symious Could you help to review this? |
|
We might also need to synchronize the |
|
If it's for null check, can we change to use |
|
@symious Updated. PTAL. |
| sstFilteringService.getBootstrapStateLock().lock(); | ||
| rocksDbCheckpointDiffer.getBootstrapStateLock().lock(); | ||
| snapshotDeletingService.getBootstrapStateLock().lock(); | ||
| if (keyDeletingService.isPresent()) { |
Can be changed to ifPresent.
Since lock throws InterruptedException (unlike unlock), it might not be possible to use ifPresent.
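To illustrate the point above: `Optional.ifPresent` takes a `Consumer`, whose `accept` method declares no checked exceptions, so a `lock()` that throws `InterruptedException` cannot be passed to it without wrapping. A minimal sketch, assuming a simplified, hypothetical `Lock` interface (a stand-in, not the actual `BootstrapStateHandler.Lock`):

```java
import java.util.Optional;

public class IfPresentCheckedException {
  /** Hypothetical, simplified stand-in for BootstrapStateHandler.Lock. */
  public interface Lock {
    void lock() throws InterruptedException;
    void unlock();
  }

  /** Toy lock that only counts acquisitions, for illustration. */
  public static class CountingLock implements Lock {
    public int held = 0;

    @Override
    public void lock() throws InterruptedException {
      if (Thread.interrupted()) {
        throw new InterruptedException();
      }
      held++;
    }

    @Override
    public void unlock() {
      held--;
    }
  }

  public static void lockIfPresent(Optional<Lock> lock) throws InterruptedException {
    // lock.ifPresent(Lock::lock);  // does not compile: unhandled InterruptedException
    if (lock.isPresent()) {
      lock.get().lock();
    }
  }

  public static void unlockIfPresent(Optional<Lock> lock) {
    // unlock() declares no checked exception, so it fits Consumer.
    lock.ifPresent(Lock::unlock);
  }
}
```

This is why the lock path keeps the `isPresent()`/`get()` form while the unlock path can use `ifPresent`.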
adoroszlai
left a comment
Thanks @ivandika3 for the patch. I'd like to suggest some changes to simplify the logic.
| private final Optional<BootstrapStateHandler> keyDeletingService; | ||
| private final Optional<BootstrapStateHandler> sstFilteringService; | ||
| private final Optional<BootstrapStateHandler> rocksDbCheckpointDiffer; | ||
| private final Optional<BootstrapStateHandler> snapshotDeletingService; |
| private final Optional<BootstrapStateHandler> keyDeletingService; | |
| private final Optional<BootstrapStateHandler> sstFilteringService; | |
| private final Optional<BootstrapStateHandler> rocksDbCheckpointDiffer; | |
| private final Optional<BootstrapStateHandler> snapshotDeletingService; | |
| private final List<BootstrapStateHandler.Lock> locks; |
Thanks for the suggestion. Updated.
| keyDeletingService = Optional.ofNullable(om.getKeyManager().getDeletingService()); | ||
| sstFilteringService = Optional.ofNullable(om.getKeyManager().getSnapshotSstFilteringService()); | ||
| rocksDbCheckpointDiffer = Optional.ofNullable(om.getMetadataManager().getStore() | ||
| .getRocksDBCheckpointDiffer()); | ||
| snapshotDeletingService = Optional.ofNullable(om.getKeyManager().getSnapshotDeletingService()); |
| keyDeletingService = Optional.ofNullable(om.getKeyManager().getDeletingService()); | |
| sstFilteringService = Optional.ofNullable(om.getKeyManager().getSnapshotSstFilteringService()); | |
| rocksDbCheckpointDiffer = Optional.ofNullable(om.getMetadataManager().getStore() | |
| .getRocksDBCheckpointDiffer()); | |
| snapshotDeletingService = Optional.ofNullable(om.getKeyManager().getSnapshotDeletingService()); | |
| locks = Stream.of( | |
| om.getKeyManager().getDeletingService(), | |
| om.getKeyManager().getSnapshotSstFilteringService(), | |
| om.getMetadataManager().getStore().getRocksDBCheckpointDiffer(), | |
| om.getKeyManager().getSnapshotDeletingService() | |
| ) | |
| .filter(Objects::nonNull) | |
| .map(BootstrapStateHandler::getBootstrapStateLock) | |
| .collect(Collectors.toList()); |
Updated. PTAL.
| if (keyDeletingService.isPresent()) { | ||
| keyDeletingService.get().getBootstrapStateLock().lock(); | ||
| } | ||
| if (sstFilteringService.isPresent()) { | ||
| sstFilteringService.get().getBootstrapStateLock().lock(); | ||
| } | ||
| if (rocksDbCheckpointDiffer.isPresent()) { | ||
| rocksDbCheckpointDiffer.get().getBootstrapStateLock().lock(); | ||
| } | ||
| if (snapshotDeletingService.isPresent()) { | ||
| snapshotDeletingService.get().getBootstrapStateLock().lock(); | ||
| } |
| if (keyDeletingService.isPresent()) { | |
| keyDeletingService.get().getBootstrapStateLock().lock(); | |
| } | |
| if (sstFilteringService.isPresent()) { | |
| sstFilteringService.get().getBootstrapStateLock().lock(); | |
| } | |
| if (rocksDbCheckpointDiffer.isPresent()) { | |
| rocksDbCheckpointDiffer.get().getBootstrapStateLock().lock(); | |
| } | |
| if (snapshotDeletingService.isPresent()) { | |
| snapshotDeletingService.get().getBootstrapStateLock().lock(); | |
| } | |
| for (BootstrapStateHandler.Lock lock : locks) { | |
| lock.lock(); | |
| } |
Updated. PTAL.
| snapshotDeletingService.ifPresent(deletingService -> deletingService.getBootstrapStateLock().unlock()); | ||
| rocksDbCheckpointDiffer.ifPresent( | ||
| rocksDBCheckpointDiffer -> rocksDBCheckpointDiffer.getBootstrapStateLock().unlock()); | ||
| sstFilteringService.ifPresent(filteringService -> filteringService.getBootstrapStateLock().unlock()); | ||
| keyDeletingService.ifPresent(deletingService -> deletingService.getBootstrapStateLock().unlock()); |
| snapshotDeletingService.ifPresent(deletingService -> deletingService.getBootstrapStateLock().unlock()); | |
| rocksDbCheckpointDiffer.ifPresent( | |
| rocksDBCheckpointDiffer -> rocksDBCheckpointDiffer.getBootstrapStateLock().unlock()); | |
| sstFilteringService.ifPresent(filteringService -> filteringService.getBootstrapStateLock().unlock()); | |
| keyDeletingService.ifPresent(deletingService -> deletingService.getBootstrapStateLock().unlock()); | |
| locks.forEach(BootstrapStateHandler.Lock::unlock); |
Note: this will release locks in the same order as they were acquired, instead of in reverse order. I think both are fine as long as the order is fixed.
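For comparison, a reverse-order release over the same kind of list could look like the sketch below. It assumes plain `java.util.concurrent.locks.ReentrantLock` instances and a hypothetical `OrderedLocks` helper, neither of which is what the patch uses. Deadlock avoidance comes from the fixed acquisition order; the release order does not affect it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper, not part of the patch. */
public class OrderedLocks {
  private final List<ReentrantLock> locks;

  public OrderedLocks(List<ReentrantLock> locks) {
    this.locks = new ArrayList<>(locks);
  }

  /** Acquires every lock in list order; rolls back if interrupted midway. */
  public void lockAll() throws InterruptedException {
    int acquired = 0;
    try {
      for (ReentrantLock lock : locks) {
        lock.lockInterruptibly();
        acquired++;
      }
    } catch (InterruptedException e) {
      // Release only the locks this call actually acquired.
      for (int i = acquired - 1; i >= 0; i--) {
        locks.get(i).unlock();
      }
      throw e;
    }
  }

  /** Releases every lock in reverse acquisition order. */
  public void unlockAll() {
    for (int i = locks.size() - 1; i >= 0; i--) {
      locks.get(i).unlock();
    }
  }
}
```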
Updated PTAL.
Note: this will release locks in the same order as they were acquired, instead of in reverse order. I think both are fine as long as the order is fixed.
I think it should be fine.
However, it seems the lock and unlock calls are not properly synchronized on the Lock object, which might cause deadlocks. Shall we synchronize on the lock object (synchronized(this)) inside the lock and unlock calls?
Do you mean the lock/unlock methods of OMDBCheckpointServlet.Lock?
Yes, I was wondering whether it's necessary to add a synchronized block inside lock and unlock. However, after thinking about it again, it should be fine as it is.
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Objects; | ||
| import java.util.Optional; |
If the suggestion to use a list is accepted, this import becomes unused.
| import java.util.Optional; |
Updated. PTAL.
adoroszlai
left a comment
Thanks @ivandika3 for updating the patch, LGTM.
symious
left a comment
LGTM.
hemantk-12
left a comment
Thanks @ivandika3 for the patch.
LGTM.
|
Thanks @ivandika3 for the patch, @hemantk-12, @symious for the review. |
|
Thank you for the reviews @adoroszlai @symious @hemantk-12 |
…apache#6015) (cherry picked from commit abc3e1f)
…Servlet.Lock (apache#6015) (apache#48) (cherry picked from commit abc3e1f)
What changes were proposed in this pull request?
Fix an NPE in OMDBCheckpointServlet caused by a possibly null SstFilteringService. The SstFilteringService is null when ozone.filesystem.snapshot.enabled=false or ozone.snapshot.filtering.service.interval=-1.
When an NPE is thrown inside OMDBCheckpointServlet.Lock, subsequent calls to OMDBCheckpointServlet.Lock.lock block indefinitely, because the call that triggered the NPE never released the lock it held on keyDeletingService.
The current solution is a simple null check. There might be a better way to prevent the NPE than a null check.
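The failure mode described above can be reproduced in isolation. The sketch below is illustrative only: `Handler` is a hypothetical stand-in for a service exposing a bootstrap lock, and modelling the lock as a single-permit `Semaphore` is an assumption made for the demo. Acquiring locks one by one over a list that contains a null handler throws an NPE after the first lock is already held, and nothing releases it.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.Semaphore;
import java.util.stream.Collectors;

public class NullHandlerDemo {
  /** Hypothetical stand-in for a service that exposes a bootstrap lock. */
  public static class Handler {
    // Assumption for this demo: the lock is a single-permit semaphore.
    public final Semaphore lock = new Semaphore(1);
  }

  /** Unsafe: an NPE on a null handler leaves the earlier locks held. */
  public static void lockAllUnchecked(List<Handler> handlers)
      throws InterruptedException {
    for (Handler handler : handlers) {
      handler.lock.acquire(); // throws NullPointerException if handler is null
    }
  }

  /** Safer: drop null handlers before any lock is taken. */
  public static List<Semaphore> nonNullLocks(List<Handler> handlers) {
    return handlers.stream()
        .filter(Objects::nonNull)
        .map(handler -> handler.lock)
        .collect(Collectors.toList());
  }
}
```

With a list such as `Arrays.asList(keyHandler, null, differHandler)`, `lockAllUnchecked` fails on the second element while the permit of `keyHandler.lock` stays taken, so the next `acquire()` on it would block. Filtering nulls up front, as in `nonNullLocks`, avoids reaching that state.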
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10138
How was this patch tested?
Manual test.
Clean CI: https://github.com/ivandika3/ozone/actions/runs/7550769000