HDDS-8882. Manage status of DeleteBlocksCommand in SCM to avoid sending duplicates to Datanode #4988
… sending duplicate delete transactions to the DN
// todo. Implementing unit test and integration test,
No such command.
@adoroszlai PTAL. Thanks.
sumitagrawl
left a comment
@xichen01 thanks for working on this. This looks like a good improvement: it sends new blocks and retries with some delay, avoiding duplicate commands. This is feasible now that the strict ordering check on transactionId at the DN has been removed (HDDS-8228). The outOfOrder metrics added at the DN may no longer be needed with this change, since out-of-order transactions will be common.
Additionally, at SCM, state is managed in the DB with retry and in multiple maps. We need to take another look and refactor to have a combined state for the Txs.
...-scm/src/main/java/org/apache/hadoop/hdds/scm/block/SCMDeleteBlocksCommandStatusManager.java
Outdated
.../main/java/org/apache/hadoop/ozone/container/common/report/CommandStatusReportPublisher.java
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
Outdated
...-scm/src/main/java/org/apache/hadoop/hdds/scm/block/SCMDeleteBlocksCommandStatusManager.java
Outdated
…CommandStatusManager
Yes, |
sumitagrawl
left a comment
@xichen01 Thanks for the update. I have left a few comments on this PR. Overall it looks good.
I will recheck the cleanup of commandStatusMap after the fix.
.../src/main/java/org/apache/hadoop/hdds/scm/block/SCMDeletedBlockTransactionStatusManager.java
.../src/main/java/org/apache/hadoop/hdds/scm/block/SCMDeletedBlockTransactionStatusManager.java
.../src/main/java/org/apache/hadoop/hdds/scm/block/SCMDeletedBlockTransactionStatusManager.java
Outdated
.../src/main/java/org/apache/hadoop/hdds/scm/block/SCMDeletedBlockTransactionStatusManager.java
...-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/SCMBlockDeletingService.java
Outdated
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
Outdated
… useless code; Fix thread issue
# Conflicts: # hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
Thanks @xichen01 for updating the patch. Can you please check
@adoroszlai @sumitagrawl
sumitagrawl
left a comment
@xichen01 LGTM
adoroszlai
left a comment
Thanks again @xichen01 for the patch.
SCMDeletedBlockTransactionStatusManager
    getSCMDeletedBlockTransactionStatusManager();
The DeletedBlockLog interface is defined in terms of operations. I don't think exposing a manager object is appropriate for the interface; it should be an implementation detail. Similarly, sharing the same lock between the two objects does not seem right.
Maybe the interface should define operations that the implementation passes through to the manager. Alternatively the manager object should have an interface defined separately, and act as a way to manipulate the DeletedBlockLog.
Removed the getSCMDeletedBlockTransactionStatusManager method from the DeletedBlockLog interface and added the DeletedBlockTransactionStatusManager-related operations to DeletedBlockLog.
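The first option raised in the review (the interface exposes operations and the implementation passes them through to a private manager) might look roughly like the sketch below. All names here (BlockLog, BlockLogImpl, StatusManager, recordSent, isSent) are hypothetical and chosen only for illustration; they are not the actual Ozone interfaces or the code merged in this PR.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the delegation pattern; not Ozone's real classes.
final class DelegationDemo {
  public static void main(String[] args) {
    BlockLog log = new BlockLogImpl();
    log.recordSent("DN1", 42L);
    System.out.println(log.isSent("DN1", 42L));
  }
}

// The interface exposes operations only; no manager object leaks out.
interface BlockLog {
  void recordSent(String datanodeId, long txId);

  boolean isSent(String datanodeId, long txId);
}

// Internal helper that owns the status bookkeeping.
final class StatusManager {
  private final Set<String> sent = new HashSet<>();

  void recordSent(String datanodeId, long txId) {
    sent.add(datanodeId + "/" + txId);
  }

  boolean isSent(String datanodeId, long txId) {
    return sent.contains(datanodeId + "/" + txId);
  }
}

// The implementation keeps the manager private and passes calls through.
final class BlockLogImpl implements BlockLog {
  private final StatusManager statusManager = new StatusManager();

  @Override
  public void recordSent(String datanodeId, long txId) {
    statusManager.recordSent(datanodeId, txId);
  }

  @Override
  public boolean isSent(String datanodeId, long txId) {
    return statusManager.isSent(datanodeId, txId);
  }
}
```

With this shape, callers depend only on the interface, and the manager (including any locking it needs) can change without affecting them.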
…ng duplicates to Datanode (apache#4988) (cherry picked from commit 88e18e3)
During our use of deletion, I noticed that it can be very slow, especially after we switched to the EC policy. Our Ozone01 cluster currently has about 1K machines. Initially, we chose to use a …
[Chart: deletion speed for …]
[Chart: deletion speed for …]
By reviewing the code and analyzing the logs, we found that the following situation can cause deletion to be very slow. We will illustrate this with an example.
We want to delete data from an EC container with ContainerId = 1000. Since it is EC-6-3, there are 9 replicas (DN1, DN2, DN3, ... DN9).
Before deletion, we first select a batch of DNs; at this time, we may only select DN1 to DN6. We then send the deletion command to these 6 DNs, and the command executes normally, successfully deleting 6 blocks. However, if DN7 to DN9 are not selected, our deletion process will get stuck.
ozone/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java Lines 294 to 313 in 1f86ce8
I came up with a possible solution to eliminate this stuck situation: require that all replicas of the container to be deleted are present in the selected DN list at the same time. Otherwise, we skip that container.
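The proposed rule can be sketched as a filter over candidate containers. This is a minimal illustration, not the code from the actual change: ReplicaFilter, containersReadyForDelete, the string datanode IDs, and the replicasByContainer map are hypothetical stand-ins for SCM's real container and datanode types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and types are illustrative, not Ozone's API.
final class ReplicaFilter {

  private ReplicaFilter() { }

  /**
   * Returns the IDs of containers whose replicas are all hosted on
   * datanodes in the selected set. A container with any replica on an
   * unselected datanode (or with no known replicas) is skipped.
   */
  static List<Long> containersReadyForDelete(
      Map<Long, Set<String>> replicasByContainer,
      Set<String> selectedDatanodes) {
    List<Long> ready = new ArrayList<>();
    for (Map.Entry<Long, Set<String>> e : replicasByContainer.entrySet()) {
      Set<String> replicas = e.getValue();
      if (!replicas.isEmpty() && selectedDatanodes.containsAll(replicas)) {
        ready.add(e.getKey());
      }
    }
    return ready;
  }

  public static void main(String[] args) {
    Map<Long, Set<String>> replicas = new LinkedHashMap<>();
    // EC-6-3 container from the example: nine replicas.
    replicas.put(1000L, Set.of("DN1", "DN2", "DN3", "DN4", "DN5",
        "DN6", "DN7", "DN8", "DN9"));
    // A second, made-up container fully covered by the selection.
    replicas.put(1001L, Set.of("DN1", "DN2", "DN3"));

    Set<String> selected = Set.of("DN1", "DN2", "DN3", "DN4", "DN5", "DN6");
    System.out.println(containersReadyForDelete(replicas, selected));
  }
}
```

Running the example prints `[1001]`: container 1000 is skipped in this round because DN7 to DN9 were not selected, so no partial delete is issued for it.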
We rolled out this improvement internally in the SCM around 7 PM on September 26th, and we observed a significant improvement in deletion efficiency, with 50 million blocks fully processed within 5 hours. The core of this improvement is to ensure that all DNs holding replicas of the same container receive the delete command at the same time. Their ACKs then reach the SCM at approximately the same time, which lets the SCM confirm the block deletions. I would like to prepare a PR and submit this change to the community.
@slfan1989, regarding the issue you found: is your commit merged upstream, and what is the Jira ID?
@ashishkumar50 Thanks for the question! The relevant JIRA issue should be HDDS-11498, and this PR has already been merged. The configurations I used during the deletion process are as follows:
… avoid sending duplicates to Datanode (apache#4988) (cherry picked from commit 88e18e3)



What changes were proposed in this pull request?
Currently SCM will send a duplicate DeletedBlocksTransaction to a specific DN if that DN has not reported via heartbeat that the transactions have finished. So if the DeleteBlocksCommandHandler thread of a DN is blocked for some reason (such as waiting for a container lock), the SCM will send a duplicate DeletedBlocksTransaction to this DN.
Summary
The status of DeleteBlocksCommand
State transfer
TO_BE_SENT -> SENT: The DeleteBlocksCommand has been sent by SCM; the follow-up status has not yet been updated by the Datanode.
SENT -> null (remove the state record from SCMDeleteBlocksCommandStatusManager): Once the DN executes a DeleteBlocksCommand, its record is deleted regardless of whether the command executed successfully.
Successful DeleteBlocksCommands are recorded in SCMDeletedBlockTransactionStatusManager#transactionToDNsCommitMap.
DeleteBlocksCommand resend
A DeleteBlocksCommand in the TO_BE_SENT or SENT state will not be resent by SCM.
SCMDeletedBlockTransactionStatusManager
SCMDeletedBlockTransactionStatusManager contains the transactionToDNsCommitMap migrated from DeletedBlockLogImpl, which is used to manage the committed DeletedBlocksTransaction entries.
SCMDeletedBlockTransactionStatusManager#SCMDeleteBlocksCommandStatusManager is used to manage the DeletedBlocksTransaction entries that are uncommitted.
"Committed" means that the DeletedBlockTransaction has been executed on the DN and reported to SCM by the heartbeat.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8882
How was this patch tested?
integration test
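
For readers who want a concrete picture of the state transfer described in the summary, here is a small hedged sketch. It is not the class from the patch: CommandStatusTracker, its method names, and the (datanode ID, command ID) key are assumptions made for illustration. Only the TO_BE_SENT and SENT states, the removal of the record once the DN reports, and the rule that a tracked command is not resent come from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not SCMDeleteBlocksCommandStatusManager itself.
final class CommandStatusTracker {

  enum Status { TO_BE_SENT, SENT }

  // Key: datanodeId + "/" + commandId (an assumed keying scheme).
  private final Map<String, Status> statuses = new HashMap<>();

  private static String key(String datanodeId, long commandId) {
    return datanodeId + "/" + commandId;
  }

  /** Starts tracking a command; returns false if it is already tracked. */
  synchronized boolean recordNew(String datanodeId, long commandId) {
    return statuses.putIfAbsent(key(datanodeId, commandId),
        Status.TO_BE_SENT) == null;
  }

  /** TO_BE_SENT -> SENT; returns false if the command was not TO_BE_SENT. */
  synchronized boolean markSent(String datanodeId, long commandId) {
    return statuses.replace(key(datanodeId, commandId),
        Status.TO_BE_SENT, Status.SENT);
  }

  /** SENT -> null: the DN reported a result, success or failure. */
  synchronized boolean onDatanodeReport(String datanodeId, long commandId) {
    return statuses.remove(key(datanodeId, commandId), Status.SENT);
  }

  /** A command that is still tracked must not be resent. */
  synchronized boolean isTracked(String datanodeId, long commandId) {
    return statuses.containsKey(key(datanodeId, commandId));
  }

  public static void main(String[] args) {
    CommandStatusTracker tracker = new CommandStatusTracker();
    System.out.println(tracker.recordNew("DN1", 7L));        // first attempt
    System.out.println(tracker.recordNew("DN1", 7L));        // duplicate attempt
    System.out.println(tracker.markSent("DN1", 7L));
    System.out.println(tracker.onDatanodeReport("DN1", 7L));
    System.out.println(tracker.isTracked("DN1", 7L));
  }
}
```

The second recordNew call returning false is the point of the design: while a command is in TO_BE_SENT or SENT, the sender has enough information to skip issuing a duplicate.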