HDDS-12998. Bring real container size in pb message when exporting/importing containers #8915
Conversation
@peterxcli can you please review it?
hadoop-hdds/interface-client/src/main/proto/DatanodeClientProtocol.proto
peterxcli left a comment:
Thanks @Gargi-jais11 for this patch, left some comments, PTAL.
Also, could you please try to add some test coverage for your change? Thanks!
hadoop-hdds/interface-client/src/main/proto/DatanodeClientProtocol.proto
...src/main/java/org/apache/hadoop/ozone/container/replication/DownloadAndImportReplicator.java
...r-service/src/main/java/org/apache/hadoop/ozone/container/replication/ContainerImporter.java
Ok.
peterxcli left a comment:
Sorry, my previous idea might be wrong.
...r-service/src/main/java/org/apache/hadoop/ozone/container/replication/ContainerImporter.java
Force-pushed 77a46c7 to 96e8afd.
peterxcli left a comment:
Thanks @Gargi-jais11 for the prompt update. Haven't looked into the test code, but the prod code looks good.
Left some comments/suggestions/questions, please take a look.
...r-service/src/main/java/org/apache/hadoop/ozone/container/replication/ContainerImporter.java
.../src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECMisReplicationHandler.java
...src/main/java/org/apache/hadoop/ozone/container/replication/DownloadAndImportReplicator.java
...src/main/java/org/apache/hadoop/ozone/container/replication/SendContainerRequestHandler.java
Please request my review whenever you think this is ready :)
@peterxcli you can review the patch whenever you are free.
peterxcli left a comment:
Thanks for updating the patch! Only some minor comments left.
Btw, could we add some IT coverage in TestContainerCoverage with something like the below?
@ParameterizedTest
@EnumSource
void testPushWithReplicateSize(CopyContainerCompression compression) throws Exception {
  final int index = compression.ordinal();
  DatanodeDetails source = cluster.getHddsDatanodes().get(index).getDatanodeDetails();
  long containerID = createNewClosedContainer(source);
  DatanodeDetails target = selectOtherNode(source);
  ReplicateContainerCommand cmd = ReplicateContainerCommand.toTarget(containerID, target);
  cmd.setReplicateSize(2L * 1024 * 1024 * 1024); // example value
  queueAndWaitForCompletion(cmd, source, ReplicationSupervisor::getReplicationSuccessCount);
}

@ParameterizedTest
@EnumSource
void testPullWithReplicateSize(CopyContainerCompression compression) throws Exception {
  final int index = compression.ordinal();
  DatanodeDetails target = cluster.getHddsDatanodes().get(index).getDatanodeDetails();
  DatanodeDetails source = selectOtherNode(target);
  long containerID = createNewClosedContainer(source);
  ReplicateContainerCommand cmd =
      ReplicateContainerCommand.fromSources(containerID, ImmutableList.of(source));
  cmd.setReplicateSize(2L * 1024 * 1024 * 1024);
  queueAndWaitForCompletion(cmd, target, ReplicationSupervisor::getReplicationSuccessCount);
}

I guess the fallback path has been covered by the old tests in TestContainerReplication, so the "with replicate size" tests should be enough.
...src/main/java/org/apache/hadoop/ozone/container/replication/SendContainerRequestHandler.java
...test/java/org/apache/hadoop/ozone/container/replication/TestSendContainerRequestHandler.java
...src/main/java/org/apache/hadoop/ozone/container/replication/DownloadAndImportReplicator.java
...test/java/org/apache/hadoop/ozone/container/replication/TestDownloadAndImportReplicator.java
...test/java/org/apache/hadoop/ozone/container/replication/TestSendContainerRequestHandler.java
Then I think this is good to merge! cc @ChenSammi @siddhantsangwan, would you like to take another look?
Do you mean adding in TestContainerReplication?
Yes... sorry for the typo and thanks for the correction.
No issues, it happens. |
siddhantsangwan left a comment:
Any particular reason for sending the container's size from the SCM to the Datanode? It's simpler to just get the size from the Datanode's in-memory state.
But how can we get the size on the Datanode side while choosing the volume in the place below? So it's better to send the replicate size from SCM, since in chooseNextVolume we need to pass a size in order to reserve space. (Lines 77 to 82 in 554de4f)
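For context, here is a minimal sketch of the volume-selection step being discussed, assuming the chooseVolume(List&lt;HddsVolume&gt;, long) signature of VolumeChoosingPolicy; the class, field, and method names are illustrative, not the actual ContainerImporter code:

```java
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.ozone.container.common.interfaces.VolumeChoosingPolicy;
import org.apache.hadoop.ozone.container.common.volume.HddsVolume;

/** Illustrative sketch only, not the actual ContainerImporter code. */
final class VolumeSelectionSketch {
  private final VolumeChoosingPolicy policy;
  private final List<HddsVolume> volumes;

  VolumeSelectionSketch(VolumeChoosingPolicy policy, List<HddsVolume> volumes) {
    this.policy = policy;
    this.volumes = volumes;
  }

  // The policy reserves 'requiredSpace' bytes on the chosen volume, so the
  // importer needs an accurate size for the incoming container; without it,
  // it can only guess from the configured max container size.
  HddsVolume chooseNextVolume(long requiredSpace) throws IOException {
    return policy.chooseVolume(volumes, requiredSpace);
  }
}
```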
We don't need to make changes to pull replication. Moreover, the datanode knows the correct size of the container; SCM's knowledge of the container's size is outdated if there have been block deletions and the size has reduced.
Thank you @siddhantsangwan for this information. I will make the changes according to push replication.
Thanks @siddhantsangwan for the explanation!
@Gargi-jais11 I think we're almost there, we just need to
hadoop-hdds/interface-client/src/main/proto/DatanodeClientProtocol.proto
hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto
Another way is to get the container's size in ReplicateContainerCommandHandler#handle and use a setter to set that in
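A rough sketch of that alternative, assuming ContainerSet#getContainer, ContainerData#getBytesUsed, and ReplicateContainerCommand#getContainerID are available, and reusing the setReplicateSize setter shown in the test snippet above; the helper name attachLocalSize is hypothetical:

```java
import org.apache.hadoop.ozone.container.common.impl.ContainerSet;
import org.apache.hadoop.ozone.container.common.interfaces.Container;
import org.apache.hadoop.ozone.protocol.commands.ReplicateContainerCommand;

/** Illustrative sketch only, not the handler code from the patch. */
final class SetSizeFromLocalStateSketch {

  // Read the container's real on-disk size from the datanode's in-memory
  // state and attach it to the command before the push is scheduled.
  static void attachLocalSize(ContainerSet containerSet, ReplicateContainerCommand cmd) {
    Container<?> container = containerSet.getContainer(cmd.getContainerID());
    if (container != null) {
      cmd.setReplicateSize(container.getContainerData().getBytesUsed());
    }
  }
}
```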
…push replicator only
Force-pushed 68759d4 to 80a84e3.
Force-pushed 80a84e3 to e6c799e.
Sorry, but I didn't see this comment before changing the code. @siddhantsangwan and @peterxcli.
Here is an analysis of both approaches for getting the container size to the push replicator.
Now, considering the test analysis of this approach:
So this approach adds 3 extra steps which can be avoided, and for it we would also need to add multiple test cases for the PushReplicator class.
@siddhantsangwan and @peterxcli
I agree, based on this we can go ahead with the first approach. |
@Gargi-jais11 it'd be good to also have some kind of integration testing (whichever way is easiest) that ensures this change works as intended across two Datanodes.
Okay sure. I will add IT. |
Force-pushed ee89ecf to e0ef830.
Force-pushed e0ef830 to 7e42b8b.
@Gargi-jais11 please avoid force-pushing when possible, because with a force push reviewers can't tell what has changed since the last review.
siddhantsangwan left a comment:
Mostly looks good, just a few minor comments.
...r-service/src/main/java/org/apache/hadoop/ozone/container/replication/ContainerImporter.java
...rvice/src/main/java/org/apache/hadoop/ozone/container/replication/GrpcContainerUploader.java
...test/java/org/apache/hadoop/ozone/container/replication/TestSendContainerRequestHandler.java
...st/src/test/java/org/apache/hadoop/ozone/container/replication/TestContainerReplication.java
siddhantsangwan left a comment:
LGTM, pending green CI.
What changes were proposed in this pull request?
Generally, the tar/zip file of a container is smaller than its real size, so reserving 2 * max container size is conservative enough in most cases. But there are over-allocated containers, where the container size can be double or triple the max container size, as @siddhantsangwan saw in a user's environment. Having an accurate container size handles this case well, and that info can be carried in the pb message.
For backward compatibility, if that field is unset, we fall back to getting the container size from the config.
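A minimal sketch of that fallback, with illustrative names (sizeFromMessage stands for the new field, maxContainerSizeFromConfig for the configured max container size); this is not the actual patch code:

```java
/** Illustrative sketch of the backward-compatible fallback described above. */
final class ImportSpaceSketch {

  // If the sender put a real container size in the message, reserve that much
  // space; if the field is unset (e.g. an older sender), fall back to the
  // conservative 2 * max container size taken from configuration.
  static long requiredImportSpace(long sizeFromMessage, long maxContainerSizeFromConfig) {
    return sizeFromMessage > 0
        ? sizeFromMessage
        : 2 * maxContainerSizeFromConfig;
  }
}
```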
comments:
#8360 (comment)
#8360 (comment)
#8360 (comment)
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-12998
How was this patch tested?
Updated existing UT.
TestPushReplicator
TestSendContainerRequestHandler
TestDownloadAndImportReplicator