-
Notifications
You must be signed in to change notification settings - Fork 593
HDDS-4926. Support start/stop for container balancer via command line #2278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@lokeshj1703 PTAL |
| /** | ||
| * Start ContainerBalancer. | ||
| */ | ||
| void startContainerBalancer() throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to our design decision, should we add the parameters like threshold and consecutiveIdleIterations in the start ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GlenGeng Thanks for the review! will add these and some usage info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated. PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems these params are not passed from these RPCs in StorageContainerLocationProtocol ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes , i keep it like this , because for now we can not determine how these parameters will be used or should we add more parameters(like bandwidth, which determines the maximum speed at which a block will be moved from one datanode to another ). I just implement the simplest function for now, and i think it will not be hard to complete the full implementation of this after we review the design doc and make the final decision.
public void execute(ScmClient scmClient) throws IOException {
scmClient.startContainerBalancer();
LOG.info("Starting ContainerBalancer...");
//remove these in later implementation
LOG.info("threshold: {}", threshold);
LOG.info("idleiterations: {}", iCount);
}
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for working on this! The changes look good to me. Can we add UT or acceptance tests for the new API/commands?
...dds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/ContainerBalancerStartSubcommand.java
Outdated
Show resolved
Hide resolved
siddhantsangwan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look great @JacksonYao287 Can you please address my comments related to configs?
.../apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocolServerSideTranslatorPB.java
Show resolved
Hide resolved
|
|
||
| // initialise configs | ||
| this.config = balancerConfiguration; | ||
| //this.config = balancerConfiguration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ContainerBalancerConfiguration can be updated with any CLI params and then used as a single source of configurations wherever needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siddhantsangwan Thanks for the review! we can just specify some important parameters for now I think and then keep on adding params as required . we can add more params as we move forward with the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these parameters are required in multiple classes, we can set these fields in the ContainerBalancerConfiguration class and then pass this object to wherever parameters are required. This way, all parameters will be present in one place instead of being get/set in different places. It could be better to maintain all the parameters in a single class. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use K-V map to hold all parameters here, to avoid API change in future in case introducting new parameters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary blank line can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks , will remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove these extra head and tail black line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we using KeyValue here if there will be more fields in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do all these required fileds have default values? If so, they are optional, insteal of required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ChenSammi for the reivew. there will not be default value, see this discuss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @ChenSammi is pointing out that since all these fields do not have default values they should be optional.
|
@adoroszlai Hey,can you help having a look at this CI problem |
Yes, it's unrelated. #2302 fixes it. Please wait for that to be merged first. |
|
thanks @adoroszlai for looking at this!
will wait for that! |
merge the lastest master branch to get a test bug fix : see HDDS-5295
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for updating! The changes look great. I have few comments inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @ChenSammi is pointing out that since all these fields do not have default values they should be optional.
...erver-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
Outdated
Show resolved
Hide resolved
...erver-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
Show resolved
Hide resolved
...r-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancer.java
Show resolved
Hide resolved
...dds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/ContainerBalancerStartSubcommand.java
Outdated
Show resolved
Hide resolved
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for updating the PR! The changes look good to me. +1.
|
@GlenGeng @ChenSammi @siddhantsangwan Do you have any other comments? |
| * Start ContainerBalancer. | ||
| */ | ||
| boolean startContainerBalancer(Optional<Double> threshold, | ||
| Optional<Integer> idleiterations, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use 4 black space indent when a line wraps.
| boolean startContainerBalancer(Optional<Double> threshold, | ||
| Optional<Integer> idleiterations, | ||
| Optional<Integer> maxDatanodesToBalance, | ||
| Optional<Long> maxSizeToMoveInGB) throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
| @Override | ||
| public boolean getContainerBalancerStatus() throws IOException { | ||
|
|
||
| ContainerBalancerStatusRequestProto request = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems i have already used 4 black space indent here
| lock.lock(); | ||
| try { | ||
| if (!balancerRunning.compareAndSet(false, true)) { | ||
| LOG.error("Container Balancer is already running."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error -> info
| return false; | ||
| } | ||
|
|
||
| ozoneConfiguration = new OzoneConfiguration(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ozoneConfiguration seems not be used later.
|
thanks @ChenSammi for the review, i have fix the comment.PTAL! |
| "by Container Balancer.") | ||
| private long maxSizeToMove = 10 * OzoneConsts.GB; | ||
|
|
||
| @Config(key = "idle.iterations", type = ConfigType.INT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on current implementation, I would suggest changing the "idle.iterations" to "max.iterations". As Lokesh said, we can do it in a seperate PR later to revisit all these configurations and namings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, we could revisit it later
| Optional<Integer> maxDatanodesToBalance, | ||
| Optional<Long> maxSizeToMoveInGB) throws IOException{ | ||
| getScm().checkAdminAccess(getRemoteUser()); | ||
| AUDIT.logWriteSuccess(buildAuditMessageForSuccess( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should write the audit log after the balancer is started at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea, that make sense , i will do it
|
Thanks @JacksonYao287 for working on this. The path LGTM, only several small issues. |
siddhantsangwan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few minor comments, please take a look.
| if (idleiterations.isPresent()) { | ||
| int idi = idleiterations.get(); | ||
| Preconditions.checkState(idi > 0 || idi == -1, | ||
| "maxDatanodesToBalance must be positive or" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean:
| "maxDatanodesToBalance must be positive or" + | |
| "idleIterations must be positive or" + |
| if (idleiterations.isPresent()) { | ||
| int idi = idleiterations.get(); | ||
| Preconditions.checkState(idi > 0 || idi == -1, | ||
| "maxDatanodesToBalance must be positive or" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean:
| "maxDatanodesToBalance must be positive or" + | |
| "idleIterations must be positive or" + |
| * @param count a idle iteration count larger than 0 | ||
| */ | ||
| public void setIdleIteration(int count) { | ||
| if (count < 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we also allowing idleIterations to be -1? In that case this condition needs to be changed.
| public void containerBalancerShouldStopWhenMaxDatanodesToBalanceIsReached() { | ||
| balancerConfiguration.setMaxDatanodesToBalance(2); | ||
| balancerConfiguration.setThreshold(0); | ||
| balancerConfiguration.setThreshold(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is threshold set to 1 here? A threshold of 1 means balancing is not required, which is incorrect for this test since we are checking the maxDatanodesToBalance configuration. Moreover threshold belongs to [0, 1).
|
Thank @siddhantsangwan for the review, i have fixed the comment, PTAL! |
|
The last patch LGTM, +1. Thanks @JacksonYao287 for the contribution and @lokeshj1703 @siddhantsangwan for code review. |
* master: (28 commits) HDDS-5332. Add a new column family and a service provider in Recon DB for Namespace Summaries (apache#2366) HDDS-5405. Refactor pom files for HadoopRpc and Grpc/Ratis compilation properties. (apache#2386) HDDS-5406. add proto version to all the proto files. (apache#2385) HDDS-5398. Avoid object creation in ReplicationManger debug log statements (apache#2379) HDDS-5396. Fix negligence issue conditional expressions in MockCRLStore.java (apache#2380) HDDS-5395. Avoid unnecessary numKeyOps.incr() call in OMMetrics (apache#2374) HDDS-5389. Include ozoneserviceid in fs.defaultFS when configuring o3fs (apache#2370) HDDS-5383. Eliminate expensive string creation in debug log messages (apache#2372) HDDS-5380. Get more accurate space info for DedicatedDiskSpaceUsage. (apache#2365) HDDS-5341. Container report processing is single threaded (apache#2338) HDDS-5387. ProfileServlet to move the default output location to an ozone specific directory (apache#2368) HDDS-5289. Update container's deleteTransactionId on creation of the transaction in SCM. (apache#2361) HDDS-5369. Cleanup unused configuration related to SCM HA (apache#2359) HDDS-5381. SCM terminated with exit status 1: null. (apache#2362) HDDS-5353. Avoid unnecessary executeBatch call in insertAudits (apache#2342) HDDS-5350 : Add allocate block support in MockOmTransport (apache#2341). Contributed by Uma Maheswara Rao G. HDDS-4926. Support start/stop for container balancer via command line (apache#2278) HDDS-5269. Datandoe with low ratis log volume space should not be considered for new pipeline allocation. (apache#2344) HDDS-5367. Update modification time when updating quota/storageType/versioning (apache#2355) HDDS-5352. java.lang.ClassNotFoundException: org/eclipse/jetty/alpn/ALPN (apache#2347) ...
What changes were proposed in this pull request?
Support start/stop for container balancer via command line
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-4926
How was this patch tested?
unit and acceptance tests