-
Notifications
You must be signed in to change notification settings - Fork 593
HDDS-7370. Add pending commands in SCM to Datanode command count #3867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -39,6 +39,8 @@ | |
| import org.slf4j.LoggerFactory; | ||
|
|
||
| import java.io.IOException; | ||
| import java.util.HashMap; | ||
| import java.util.Map; | ||
|
|
||
| import static org.apache.hadoop.hdds.upgrade.HDDSLayoutVersionManager.maxLayoutVersion; | ||
|
|
||
|
|
@@ -77,35 +79,51 @@ public void resetEventCollector() throws IOException { | |
| public void testQueueReportProcessed() throws NodeNotFoundException { | ||
| DatanodeDetails dn = MockDatanodeDetails.randomDatanodeDetails(); | ||
| nodeManager.register(dn, null, null); | ||
|
|
||
| // Add some queued commands to be sent to the DN on this heartbeat. This | ||
| // means the real queued count will be the value reported plus the commands | ||
| // sent to the DN. | ||
| Map<SCMCommandProto.Type, Integer> commandsToBeSent = new HashMap<>(); | ||
| commandsToBeSent.put(SCMCommandProto.Type.valueOf(1), 2); | ||
| commandsToBeSent.put(SCMCommandProto.Type.valueOf(2), 1); | ||
|
|
||
| SCMDatanodeHeartbeatDispatcher.CommandQueueReportFromDatanode | ||
| commandReport = getQueueReport(dn); | ||
| commandReport = getQueueReport(dn, commandsToBeSent); | ||
| commandQueueReportHandler.onMessage(commandReport, this); | ||
|
|
||
| int commandCount = commandReport.getReport().getCommandCount(); | ||
| for (int i = 0; i < commandCount; i++) { | ||
| int storedCount = nodeManager.getNodeQueuedCommandCount(dn, | ||
| commandReport.getReport().getCommand(i)); | ||
| Assertions.assertEquals(commandReport.getReport().getCount(i), | ||
| storedCount); | ||
| int expectedCount = commandReport.getReport().getCount(i); | ||
| // For the first two commands, we added extra commands | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hello Stephen! Just for my self-learning, why the first two commands we add extra commands~? Thanks~ 🙏
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is just to test the new part of the code - that the queued commands in the list are added to the totals in the report count. I did not add extra commands to all of them, so I could see the others work ok when there isn't extra commands.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. got it! thanks Stephen! |
||
| if (i == 0) { | ||
| expectedCount += 2; | ||
| } | ||
| if (i == 1) { | ||
| expectedCount += 1; | ||
| } | ||
| Assertions.assertEquals(expectedCount, storedCount); | ||
| Assertions.assertTrue(storedCount > 0); | ||
| } | ||
| } | ||
|
|
||
| private SCMDatanodeHeartbeatDispatcher.CommandQueueReportFromDatanode | ||
| getQueueReport(DatanodeDetails dn) { | ||
| getQueueReport(DatanodeDetails dn, | ||
| Map<SCMCommandProto.Type, Integer> commandsToBeSent) { | ||
| CommandQueueReportProto.Builder builder = | ||
| CommandQueueReportProto.newBuilder(); | ||
| int i = 10; | ||
| for (SCMCommandProto.Type cmd : SCMCommandProto.Type.values()) { | ||
| if (cmd == SCMCommandProto.Type.unknownScmCommand) { | ||
| // Do not include the unknow command type in the message. | ||
| // Do not include the unknown command type in the message. | ||
| continue; | ||
| } | ||
| builder.addCommand(cmd); | ||
| builder.addCount(i++); | ||
| } | ||
| return new SCMDatanodeHeartbeatDispatcher.CommandQueueReportFromDatanode( | ||
| dn, builder.build()); | ||
| dn, builder.build(), commandsToBeSent); | ||
| } | ||
|
|
||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick question, why was this not done before and done now? I understand we would like to keep the current count and replace the stale counters. I thought it was already done by
commandCounts.put(command, cmdCount)and also now withcommandCounts.putAll(mutableCmds).Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow your question.
There are two counts. The ones reported from the DN on its heartbeart - these are queued on the DN.
Then there are commands queued in SCM ready to be sent to the DN as part of processing the heartbeat.
After the heartbeat, the real queue on the DN is the sum of the two (queued on DN + queued on SCM).
Before this change, we only stored the "queued on the DN count". This count is not yet used anywhere, but I missed adding the commands the heartbeat will add to the queue, so the old count is not correct.
The line
commandCounts.put(command, cmdCount)should do all the updates as per the comment in the code. The linecommandCounts.putAll(mutableCmds)is a "catch all" incase the DN is not reporting counts for a command that is queued on SCM - it shouldn't happen in practice. I have also removed the entry from mutableCommands after adding it to the cmdCount, so in practice mutableCommands should be empty by the time it gets to the putAll() call.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation. My initial doubt was
commandCounts.clear();was added now in this patch, I was just wondering why wasn't it added before. I guess the initial thought was that the command not reported by DN should be 0, it would be cleared anyway (with 0) by thecommandCounts.put(command, cmdCount).The changes look good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea not sure why I didn't have that before. I guess we could omit it, but I should not do any harm either.