Skip to content

Conversation

@JacksonYao287
Copy link
Contributor

@JacksonYao287 JacksonYao287 commented May 24, 2021

What changes were proposed in this pull request?

Support start/stop for container balancer via command line

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4926

How was this patch tested?

unit and acceptance tests

@JacksonYao287
Copy link
Contributor Author

@lokeshj1703 PTAL

/**
* Start ContainerBalancer.
*/
void startContainerBalancer() throws IOException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to our design decision, should we add the parameters like threshold and consecutiveIdleIterations in the start ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GlenGeng Thanks for the review! will add these and some usage info.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. PTAL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems these params are not passed from these RPCs in StorageContainerLocationProtocol ?

Copy link
Contributor Author

@JacksonYao287 JacksonYao287 May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes , i keep it like this , because for now we can not determine how these parameters will be used or should we add more parameters(like bandwidth, which determines the maximum speed at which a block will be moved from one datanode to another ). I just implement the simplest function for now, and i think it will not be hard to complete the full implementation of this after we review the design doc and make the final decision.

public void execute(ScmClient scmClient) throws IOException {
    scmClient.startContainerBalancer();
    LOG.info("Starting ContainerBalancer...");
    //remove these in later implementation
    LOG.info("threshold: {}", threshold);
    LOG.info("idleiterations: {}", iCount);
  }

Copy link
Contributor

@lokeshj1703 lokeshj1703 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JacksonYao287 Thanks for working on this! The changes look good to me. Can we add UT or acceptance tests for the new API/commands?

Copy link
Contributor

@siddhantsangwan siddhantsangwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look great @JacksonYao287 Can you please address my comments related to configs?


// initialise configs
this.config = balancerConfiguration;
//this.config = balancerConfiguration;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ContainerBalancerConfiguration can be updated with any CLI params and then used as a single source of configurations wherever needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siddhantsangwan Thanks for the review! we can just specify some important parameters for now I think and then keep on adding params as required . we can add more params as we move forward with the implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these parameters are required in multiple classes, we can set these fields in the ContainerBalancerConfiguration class and then pass this object to wherever parameters are required. This way, all parameters will be present in one place instead of being get/set in different places. It could be better to maintain all the parameters in a single class. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use K-V map to hold all parameters here, to avoid API change in future in case introducting new parameters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary blank line can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks , will remove it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove these extra head and tail black line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we using KeyValue here if there will be more fields in future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all these required fileds have default values? If so, they are optional, insteal of required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ChenSammi for the reivew. there will not be default value, see this discuss

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @ChenSammi is pointing out that since all these fields do not have default values they should be optional.

@JacksonYao287
Copy link
Contributor Author

@adoroszlai Hey,can you help having a look at this CI problem TestReportPublisher.testCRLStatusReportPublisher? it seems nothing to do with my PR, i do not modify any CRL related code.

@adoroszlai
Copy link
Contributor

Hey,can you help having a look at this CI problem TestReportPublisher.testCRLStatusReportPublisher? it seems nothing to do with my PR, i do not modify any CRL related code.

Yes, it's unrelated. #2302 fixes it. Please wait for that to be merged first.

@JacksonYao287
Copy link
Contributor Author

thanks @adoroszlai for looking at this!

Yes, it's unrelated. #2302 fixes it. Please wait for that to be merged first.

will wait for that!

merge the lastest master branch to get a test bug fix : see HDDS-5295
@JacksonYao287 JacksonYao287 requested a review from lokeshj1703 June 3, 2021 12:38
Copy link
Contributor

@lokeshj1703 lokeshj1703 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JacksonYao287 Thanks for updating! The changes look great. I have few comments inline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @ChenSammi is pointing out that since all these fields do not have default values they should be optional.

Copy link
Contributor

@lokeshj1703 lokeshj1703 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JacksonYao287 Thanks for updating the PR! The changes look good to me. +1.

@lokeshj1703
Copy link
Contributor

@GlenGeng @ChenSammi @siddhantsangwan Do you have any other comments?

@JacksonYao287 JacksonYao287 requested a review from ChenSammi June 22, 2021 15:41
* Start ContainerBalancer.
*/
boolean startContainerBalancer(Optional<Double> threshold,
Optional<Integer> idleiterations,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use 4 black space indent when a line wraps.

boolean startContainerBalancer(Optional<Double> threshold,
Optional<Integer> idleiterations,
Optional<Integer> maxDatanodesToBalance,
Optional<Long> maxSizeToMoveInGB) throws IOException;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@Override
public boolean getContainerBalancerStatus() throws IOException {

ContainerBalancerStatusRequestProto request =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems i have already used 4 black space indent here

lock.lock();
try {
if (!balancerRunning.compareAndSet(false, true)) {
LOG.error("Container Balancer is already running.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error -> info

return false;
}

ozoneConfiguration = new OzoneConfiguration();
Copy link
Contributor

@ChenSammi ChenSammi Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ozoneConfiguration seems not be used later.

@JacksonYao287
Copy link
Contributor Author

thanks @ChenSammi for the review, i have fix the comment.PTAL!

"by Container Balancer.")
private long maxSizeToMove = 10 * OzoneConsts.GB;

@Config(key = "idle.iterations", type = ConfigType.INT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on current implementation, I would suggest changing the "idle.iterations" to "max.iterations". As Lokesh said, we can do it in a seperate PR later to revisit all these configurations and namings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, we could revisit it later

Optional<Integer> maxDatanodesToBalance,
Optional<Long> maxSizeToMoveInGB) throws IOException{
getScm().checkAdminAccess(getRemoteUser());
AUDIT.logWriteSuccess(buildAuditMessageForSuccess(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should write the audit log after the balancer is started at the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, that make sense , i will do it

@ChenSammi
Copy link
Contributor

Thanks @JacksonYao287 for working on this. The path LGTM, only several small issues.

Copy link
Contributor

@siddhantsangwan siddhantsangwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few minor comments, please take a look.

if (idleiterations.isPresent()) {
int idi = idleiterations.get();
Preconditions.checkState(idi > 0 || idi == -1,
"maxDatanodesToBalance must be positive or" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean:

Suggested change
"maxDatanodesToBalance must be positive or" +
"idleIterations must be positive or" +

if (idleiterations.isPresent()) {
int idi = idleiterations.get();
Preconditions.checkState(idi > 0 || idi == -1,
"maxDatanodesToBalance must be positive or" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean:

Suggested change
"maxDatanodesToBalance must be positive or" +
"idleIterations must be positive or" +

* @param count a idle iteration count larger than 0
*/
public void setIdleIteration(int count) {
if (count < 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we also allowing idleIterations to be -1? In that case this condition needs to be changed.

public void containerBalancerShouldStopWhenMaxDatanodesToBalanceIsReached() {
balancerConfiguration.setMaxDatanodesToBalance(2);
balancerConfiguration.setThreshold(0);
balancerConfiguration.setThreshold(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is threshold set to 1 here? A threshold of 1 means balancing is not required, which is incorrect for this test since we are checking the maxDatanodesToBalance configuration. Moreover threshold belongs to [0, 1).

@JacksonYao287
Copy link
Contributor Author

Thank @siddhantsangwan for the review, i have fixed the comment, PTAL!

@ChenSammi
Copy link
Contributor

The last patch LGTM, +1. Thanks @JacksonYao287 for the contribution and @lokeshj1703 @siddhantsangwan for code review.

@ChenSammi ChenSammi merged commit dc02775 into apache:master Jun 23, 2021
errose28 added a commit to errose28/ozone that referenced this pull request Jul 7, 2021
* master: (28 commits)
  HDDS-5332. Add a new column family and a service provider in Recon DB for Namespace Summaries (apache#2366)
  HDDS-5405. Refactor pom files for HadoopRpc and Grpc/Ratis compilation properties. (apache#2386)
  HDDS-5406. add proto version to all the proto files. (apache#2385)
  HDDS-5398. Avoid object creation in ReplicationManger debug log statements (apache#2379)
  HDDS-5396. Fix negligence issue conditional expressions in MockCRLStore.java (apache#2380)
  HDDS-5395. Avoid unnecessary numKeyOps.incr() call in OMMetrics (apache#2374)
  HDDS-5389. Include ozoneserviceid in fs.defaultFS when configuring o3fs (apache#2370)
  HDDS-5383. Eliminate expensive string creation in debug log messages (apache#2372)
  HDDS-5380. Get more accurate space info for DedicatedDiskSpaceUsage. (apache#2365)
  HDDS-5341. Container report processing is single threaded (apache#2338)
  HDDS-5387. ProfileServlet to move the default output location to an ozone specific directory (apache#2368)
  HDDS-5289. Update container's deleteTransactionId on creation of the transaction in SCM. (apache#2361)
  HDDS-5369. Cleanup unused configuration related to SCM HA (apache#2359)
  HDDS-5381. SCM terminated with exit status 1: null. (apache#2362)
  HDDS-5353. Avoid unnecessary executeBatch call in insertAudits (apache#2342)
  HDDS-5350 :  Add allocate block support in MockOmTransport (apache#2341). Contributed by Uma Maheswara Rao G.
  HDDS-4926. Support start/stop for container balancer via command line (apache#2278)
  HDDS-5269. Datandoe with low ratis log volume space should not be considered for new pipeline allocation. (apache#2344)
  HDDS-5367. Update modification time when updating quota/storageType/versioning (apache#2355)
  HDDS-5352. java.lang.ClassNotFoundException: org/eclipse/jetty/alpn/ALPN (apache#2347)
  ...
@JacksonYao287 JacksonYao287 deleted the HDDS-4926 branch July 20, 2021 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants