Fix deadlock in ThreadPoolMergeScheduler when a failing merge closes the IndexWriter #134128
Closed
tlrx wants to merge 7 commits into elastic:main
…the indexWriter Relates ES-12664
Collaborator
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)
Collaborator
Hi @tlrx, I've created a changelog YAML for you.
Member
Author
Closed in favor of #134656
A merge that throws an exception causes the IndexWriter to be closed, which in turn aborts running merges and closes the ThreadPoolMergeScheduler, all in the same merge thread.
Before this change, ThreadPoolMergeScheduler#close used a CountDownLatch to wait for the signal that all merges have been aborted/completed. But the merge scheduler is closed from a merge thread that has not itself completed at the time it waits on the latch, so the latch can never reach zero and the thread deadlocks.
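To illustrate the self-wait, here is a minimal, self-contained sketch. It is not the Elasticsearch code: `LatchSelfWaitDemo`, `closeFromMergeThread` and `allMergesDone` are hypothetical names, and a 200 ms timeout stands in for the unbounded wait so that the example terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class LatchSelfWaitDemo {

    /** Returns true if the latch opened, false if the wait timed out. */
    static boolean closeFromMergeThread() throws Exception {
        // Hypothetical stand-in for a merge thread pool with one running merge.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch allMergesDone = new CountDownLatch(1);
        try {
            Future<Boolean> result = pool.submit(() -> {
                try {
                    // The failing merge triggers close() on this same thread,
                    // which waits for ALL merges, including the one running here.
                    // Assumption: the timeout replaces an unbounded await().
                    return allMergesDone.await(200, TimeUnit.MILLISECONDS);
                } finally {
                    // The merge only counts as done once the wait has returned.
                    allMergesDone.countDown();
                }
            });
            return result.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("latch opened: " + closeFromMergeThread());
    }
}
```

The wait times out because the only thread able to count the latch down is the one blocked on it. Without the timeout, the thread would block forever.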
The fix proposed in this change uses a mechanism similar to what ConcurrentMergeScheduler#sync does, i.e. it waits for all merge threads to be aborted/completed except the current one.

The proposed test passes whether ThreadPoolMergeScheduler is enabled or not. I'd like to add a similar test in serverless too, just to be sure it works everywhere.
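The "wait for every merge thread except the current one" idea can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR nor Lucene's implementation: `ExcludeSelfWaitDemo`, `runMerge`, `awaitOtherMerges` and the thread registry are all hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ExcludeSelfWaitDemo {
    // Hypothetical registry of threads currently running a merge (guarded by "this").
    private final Set<Thread> runningMergeThreads = new HashSet<>();

    void runMerge(Runnable merge) {
        Thread current = Thread.currentThread();
        synchronized (this) {
            runningMergeThreads.add(current);
        }
        try {
            merge.run();
        } finally {
            synchronized (this) {
                runningMergeThreads.remove(current);
                notifyAll(); // wake up any closer waiting for merges to finish
            }
        }
    }

    /** Waits until no merge thread other than the caller is running. */
    synchronized boolean awaitOtherMerges(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        Thread current = Thread.currentThread();
        while (hasOtherMerges(current)) {
            long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remaining <= 0) {
                return false;
            }
            wait(remaining);
        }
        return true;
    }

    private boolean hasOtherMerges(Thread current) {
        for (Thread t : runningMergeThreads) {
            if (t != current) {
                return true; // the current thread is deliberately skipped
            }
        }
        return false;
    }

    /** One healthy merge plus one failing merge that closes from its own thread. */
    static boolean demo() throws InterruptedException {
        ExcludeSelfWaitDemo scheduler = new ExcludeSelfWaitDemo();
        CountDownLatch otherMergeStarted = new CountDownLatch(1);
        AtomicBoolean closed = new AtomicBoolean(false);

        Thread healthy = new Thread(() -> scheduler.runMerge(() -> {
            otherMergeStarted.countDown();
            try {
                Thread.sleep(50); // simulated merge work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        Thread failing = new Thread(() -> scheduler.runMerge(() -> {
            try {
                otherMergeStarted.await();
                // Simulated close() invoked from inside the failing merge.
                closed.set(scheduler.awaitOtherMerges(5_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        healthy.start();
        failing.start();
        healthy.join();
        failing.join();
        return closed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("close completed: " + demo());
    }
}
```

Because the closing thread skips itself when checking for running merges, the wait ends as soon as the other merge finishes, even though the closing merge is still registered.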
Relates ES-12664