HDFS-16600. Fix deadlock of fine-grain lock for FsDatastImpl of DataNode.#4367
HDFS-16600. Fix deadlock of fine-grain lock for FsDatastImpl of DataNode.#4367Hexiaoqiao merged 1 commit intoapache:trunkfrom
Conversation
|
@Hexiaoqiao @ayushtkn This bug was introduced by HDFS-16534. Please help me review it, thanks |
ayushtkn
left a comment
There was a problem hiding this comment.
Well the reason of Deadlock seems valid, but won't we be taking the write lock now for other purpose as well, where we actually could have lived with read lock? We should take write lock where we actually require, like the case we are chasing now. for other cases we should stay with read lock only, may be try some refactoring and restrict the write lock for this particular case only
|
Got it. I will gain a deep understanding of the scope of this BP lock and try to solve this case with ReadLock. |
|
💔 -1 overall
This message was automatically generated. |
|
I learned from the implementation of moveBlockAcrossStorage and used BP ReadLock to fix this bug. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
Hexiaoqiao
left a comment
There was a problem hiding this comment.
@ZanderXu Thanks for great catch here. It makes sense to me. I am confused it will be safe to process following logic if obtain BLOCK_POOL lock first as expected, but why it leads deadlock.
cc @MingXiangLi Please take another review.
|
Thanks @Hexiaoqiao for your review.
Deadlock is trigged when evictLazyPersistBlocks is required in createRbw. Because createRbw hold BLOCK_POOL read lock, but evictLazyPersistBlocks try to hold BLOCK_POOL write lock. org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.testSynchronousEviction is failed because the DeadLock. |
|
@ZanderXu LGTM.The core logic in HDFS-16534 is change block pool write lock to read lock and add volume lock for each replica under this block pool.And we didn't change this method in HDFS-16534 because it's not a heavy call. So this commit is makes sense to me. |
|
Thanks @MingXiangLi for your review. HDFS-16534 means a lot to me and I learned a lot from it. |
|
@ZanderXu I found that some tests of Junit Test of HDFS (DN) sometimes succeed and sometimes fail, but there is no way to judge whether it is related to DeadLock. How do you judge the occurrence of DeadLock? @MingXiangLi HDFS-16534 is a very big change, which will greatly help the performance improvement of DN, but ZanderXu has already proposed 2 Jiras for this change. Can you help to re-examine this HDFS-16534, if it is separate each time The commit fixes pr, worried that it will bring more problems. |
|
@slfan1989 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement.testSynchronousEviction sometimes succeed? I have run it several times locally and all fail.
Deadlock is trigged when evictLazyPersistBlocks is required in createRbw. Because createRbw hold BLOCK_POOL read lock, but evictLazyPersistBlocks try to hold BLOCK_POOL write lock. |
|
Thank you for your contribution, but I still have some concerns about HDFS-16534. I feel that for a new feature, multiple prs should not be used to fix the problem separately, which makes the code very difficult to read. I recommend creating it under HDFS-15382 A subtask to fix HDFS-16598 and HDFS-16600 together, @ZanderXu @MingXiangLi. |
|
@ZanderXu Thanks for the great catch here.
@slfan1989 Thansk for your suggestions. IMO, this is not the blocker issue. Any tickets will be collected to subtask of HDFS-16534 before checkin for committers. -1 to combine HDFS-16598 and HDFS-16600 together. IIUC, it is recommended to add/fix one issue for one ticket. Welcome to any more discussions. |
|
Thanks @Hexiaoqiao for your review, and I will check other methods. |
|
@ZanderXu @Hexiaoqiao Thank you very much everyone, I learned a lot from the discussion, I didn't pay attention to this pr, because the description information is too short. I will summarize the calling process in the comment area of HDFS-16600 Leave a message (ASAP), and I hope @ZanderXu @Hexiaoqiao you can help me check it too. Another suggestion, can you write the junit test? @ZanderXu |
you can see the UT org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement.testSynchronousEviction. |
ayushtkn
left a comment
There was a problem hiding this comment.
Changes LGTM. If possible would be good if a test can be added.
Regarding:
Thansk for your suggestions. IMO, this is not the blocker issue. Any tickets will be collected to subtask of HDFS-16534 before checkin for committers. -1 to combine HDFS-16598 and HDFS-16600 together. IIUC, it is recommended to add/fix one issue for one ticket. Welcome to any more discussions.
Moving as subtask, yes we should do that, I too don't think we should merge issues unless they are same or extension of one. Separate issues should be tackled separately only, else if one gets screwed we need to revert both.
Regarding having one PR for all, don't know how we can achieve that once merged. you have to revert it and redo all the work. doesn't look something required here though. I am ok with whatever @Hexiaoqiao feel is good
|
Thanks @ayushtkn for your comment.
I found this issues because the UT org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement.testSynchronousEviction failed, so I thinks it's a nice UT to verify this issue. |
|
Thanks all for the further discussion. About the unit test, I did not retrieve this one, |
|
Oh, I'm sorry, the failed UT is |
Thanks @ZanderXu for your information. I think it can cover this case, let's wait what Ayush think about it. |
|
Makes sense to me. Thanx everyone |
|
@Hexiaoqiao @ZanderXu @MingXiangLi I still have questions to ask.
If ZanderXu's judgment is correct, will this code also deadlock? 3.I always have a question, why we first add blockpool readlock, and then add volume write lock, how is the order of this lock derived? 4.I checked lockManager.writeLock(LockLevel.BLOCK_POOl, block.getBlockPoolId()), and I found that when adding volume, the writeLock of BLOCK_POOl is also used, so will it also deadlock(When createRbw And addVolume are done at the same time)?
I don't think this is a deadlock. Is it because createRow got the read lock, which caused evictBlocks to get the write lock for a long time, and then exceeded the waiting time of the junit test, which eventually led to an error? I think to solve this problem completely, we also need to look at the processing logic of LazyWriter. It should not be enough to just modify evictBlocks. |
|
Thanks @slfan1989 for your comment. Please refer to the stack:
evictBlocks could not successfully acquire the write lock, since createRBW_logic holds the read lock of this block pool. And createRBW_logic is waiting for evictBlocks to finish. so it's deadlock.
I'm interested in this deadlock, can you provide a reproduction process? thanks~ |
Thank you very much, I have understood that this is indeed a deadlock, because the same thread needs to use both a read lock and a write lock.
very good explanation.
Thanks for your patience in explaining, this is my guess, now it looks like this won't happen(deadlock) because createRbw And addVolume won't be executed in the same thread, and createRbw And LazyWriter won't deadlock because they're not executed in one thread. LGTM +1
why we first add blockpool readlock, and then add volume write lock, how is the order of this lock derived? |
|
@slfan1989 You can refer this https://issues.apache.org/jira/browse/HDFS-15382 jira which may answer your question. |
|
Yeah, about last question, @slfan1989 you can prefer to HDFS-15382. If you have any questions, feel free to concat me and discuss it. |
|
Retrigger jenkins and wait to another build result. |
|
💔 -1 overall
This message was automatically generated. |
|
The latest build looks good to me. Committed to trunk. |
|
Thanks @Hexiaoqiao @ayushtkn @MingXiangLi @slfan1989 for your warm discussions and review. |
…ode. (apache#4367). Contributed by ZanderXu. Reviewed-by: Mingxiang Li <liaiphag0@gmail.com> Reviewed-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…ode. (apache#4367). Contributed by ZanderXu. Reviewed-by: Mingxiang Li <liaiphag0@gmail.com> Reviewed-by: Ayush Saxena <ayushsaxena@apache.org> Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
Detail info please refer to HDFS-16600.
The UT org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement.testSynchronousEviction failed, because happened deadlock, which is introduced by HDFS-16534.