Conversation

@Montura
Contributor

@Montura Montura commented Dec 11, 2023

Dynamic adaptation (introduced in HDDS-5526) for datanodes.involved.max.percentage.per.iteration in the container balancer doesn't work well in some cases.

Sometimes the number of under-utilized nodes is not sufficient to satisfy the limit on the maximum percentage of datanodes participating in a balancing iteration (datanodes.involved.max.percentage.per.iteration). As a result, the collections of source and target datanodes are reset and balancing is skipped (see comment).

The issue can easily be reproduced when the cluster has few nodes (< 10), for example 4 or 5. To work around this case, datanodes.involved.max.percentage.per.iteration has to be set to 100.
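A minimal, self-contained Java sketch of the arithmetic behind the issue (illustrative only; the class below is not the balancer's actual code, and the 20% default is the one discussed further down in this conversation):

// Illustrative only: shows why the default 20% ratio truncates to zero on a
// 4-node cluster, and why setting the percentage to 100 avoids that.
public class DatanodeLimitSketch {
  public static void main(String[] args) {
    int totalNodesInCluster = 4;
    double defaultRatio = 0.20;     // datanodes.involved.max.percentage.per.iteration = 20 (default)
    double workaroundRatio = 1.0;   // datanodes.involved.max.percentage.per.iteration = 100

    // The narrowing cast truncates toward zero, so 4 * 0.2 = 0.8 becomes 0.
    System.out.println((int) (defaultRatio * totalNodesInCluster));    // prints 0 -> no datanode may be involved
    System.out.println((int) (workaroundRatio * totalNodesInCluster)); // prints 4 -> all datanodes may be involved
  }
}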

@siddhantsangwan wrote a small documentation page with some user-facing details about the container balancer.

What changes were proposed in this pull request?

Introduced the TestableCluster class so it can be reused in tests for clusters with different numbers of datanodes.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9889

How was this patch tested?

hdds.scm.container.balancer.TestContainerBalancerTask is reworked:

  1. Extracted two classes:
  • hdds.scm.container.balancer.MockedSCM for setting up a testable hdds.scm.server.StorageContainerManager
  • hdds.scm.container.balancer.TestableCluster for creating a test cluster with the required number of datanodes
  2. Added the TestContainerBalancerDatanodeNodeLimit test class with 3 tests extracted from TestContainerBalancerTask, run against clusters with different node counts (see the sketch below).
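A self-contained sketch of the parameterized style used to cover clusters of different sizes (illustrative only; it does not use the new MockedSCM/TestableCluster helpers and only checks the truncated per-iteration datanode limit at the default 20% ratio):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

// Illustrative sketch, not the PR's actual test code.
class DatanodeLimitTruncationSketch {

  @ParameterizedTest
  @CsvSource({"4,0", "5,1", "9,1", "10,2", "20,4"})
  void maxDatanodesToInvolveIsTruncatedTowardZero(int nodesInCluster, int expectedLimit) {
    // Default value of datanodes.involved.max.percentage.per.iteration is 20%.
    double maxRatio = 0.2;
    assertEquals(expectedLimit, (int) (maxRatio * nodesInCluster));
  }
}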

@Montura Montura changed the title from "Working on parametrized tests to run on clusters with different datan…" to "HDDS-9889. Configure adaptation for datanode limits in ContainerBalancer" on Dec 11, 2023
@Montura
Contributor Author

Montura commented Dec 11, 2023

@JacksonYao287, @sumitagrawl, please review the changes

@adoroszlai
Contributor

@Montura please merge the latest master into your branch; the compile error (not caused by this PR) is fixed in 582a5ce

@Montura
Contributor Author

Montura commented Dec 11, 2023

@adoroszlai, UPD: done!

@Montura Montura force-pushed the amikhalev/datanode_limits branch 4 times, most recently from eed7a9a to f8dc3fc on December 15, 2023 08:20
@Montura
Contributor Author

Montura commented Dec 15, 2023

UPD: Today I rebased this PR on the master branch (to get the latest changes)

@adoroszlai
Contributor

UPD: Today I rebased this PR on the master branch (to get the latest changes)

Thanks @Montura. You only need to update from master if there is a conflict, or if failing checks need code from master. Also, please use merge, not rebase.

@siddhantsangwan @sumitagrawl can you please review?

@Montura Montura force-pushed the amikhalev/datanode_limits branch from f8dc3fc to 5d175ae on December 15, 2023 15:45
@Montura
Contributor Author

Montura commented Dec 20, 2023

@siddhantsangwan @sumitagrawl could you please review?

@Montura
Contributor Author

Montura commented Dec 21, 2023

UPD: Today I merged the master branch into this PR to resolve conflicts

@Montura
Contributor Author

Montura commented Dec 26, 2023

@siddhantsangwan @sumitagrawl could you please review?

@adoroszlai
Contributor

@Montura please keep in mind that the end of the year is usually holiday season in many places

@adoroszlai
Contributor

@Montura Sorry about the code conflicts. This PR does not allow edits from maintainers; is that intentional? If it did, I'd try to keep it updated after merging PRs that touch the same files.

@adoroszlai
Contributor

@siddhantsangwan @sumitagrawl please take a look at the patch to provide high-level feedback until the conflicts are resolved

@Montura
Contributor Author

Montura commented Jan 8, 2024

@Montura Sorry about the code conflicts. This PR does not allow edits from maintainers; is that intentional? If it did, I'd try to keep it updated after merging PRs that touch the same files.

Tomorrow I'll resolve the conflicts; it wasn't intentional to forbid PR editing for maintainers. It's my first PR here, I'll do better next time.

@adoroszlai
Contributor

it wasn't intentional to forbid PR editing for maintainers. It's my first PR here, I'll do better next time.

No worries.

@siddhantsangwan
Contributor

@Montura Thanks for working on this. I'm trying to understand the problem. If you wrote a test that fails without your fix, please point me to it.

@Montura
Contributor Author

Montura commented Jan 9, 2024

@Montura Thanks for working on this. I'm trying to understand the problem. If you wrote a test that fails without your fix, please point me to it.

Sure, I'm merging the current master now; when I finish, I'll point you to the test

@siddhantsangwan
Contributor

Sometimes the number of under-utilized nodes is not sufficient to satisfy the limit on the maximum percentage of datanodes participating in a balancing iteration (datanodes.involved.max.percentage.per.iteration). As a result, the collections of source and target datanodes are reset and balancing is skipped.

I didn't really get this. Can you please elaborate? It'd be helpful to have a small example where you describe this problem.

@Montura
Contributor Author

Montura commented Jan 9, 2024

Sometimes the number of under-utilized nodes is not sufficient to satisfy the limit on the maximum percentage of datanodes participating in a balancing iteration (datanodes.involved.max.percentage.per.iteration). As a result, the collections of source and target datanodes are reset and balancing is skipped.

I didn't really get this. Can you please elaborate? It'd be helpful to have a small example where you describe this problem.

Let's imagine that you have a cluster with a total number of DNs in the range [4, 9] (4, 5, 6, 7, 8, or 9).

Then the maximum number of DNs that can be involved in balancing for such clusters will be at most 1, because the default value of maxDatanodesRatioToInvolvePerIteration is 0.2 (20%). So the next two methods will skip balancing when the DN count is less than 10.

// ContainerBalancerTask#adaptWhenNearingIterationLimits
int maxDatanodesToInvolve = (int) (config.getMaxDatanodesRatioToInvolvePerIteration() * totalNodesInCluster);
if (countDatanodesInvolvedPerIteration + 1 == maxDatanodesToInvolve) {
    // Restricts potential target datanodes to nodes that have already been selected
}

// ContainerBalancerTask#adaptOnReachingIterationLimits
int maxDatanodesToInvolve = (int) (config.getMaxDatanodesRatioToInvolvePerIteration() * totalNodesInCluster);
if (countDatanodesInvolvedPerIteration == maxDatanodesToInvolve) {
    // Restricts potential source and target datanodes to nodes that have already been selected
}

// The narrowing cast truncates toward zero:
// 4  * 0.2 = 0.8 -> (int) 0
// 5  * 0.2 = 1.0 -> (int) 1
// 6  * 0.2 = 1.2 -> (int) 1
// 7  * 0.2 = 1.4 -> (int) 1
// 8  * 0.2 = 1.6 -> (int) 1
// 9  * 0.2 = 1.8 -> (int) 1
// 10 * 0.2 = 2.0 -> (int) 2

Per the Java spec for primitive narrowing conversion, the floating-point value is rounded to an integer value V, rounding toward zero using IEEE 754 round-toward-zero mode (§4.2.3):

The Java programming language uses round toward zero when converting a floating value to an integer (§5.1.3), which acts, in this case, as though the number were truncated, discarding the mantissa bits. Rounding toward zero chooses as its result the format's value closest to and no greater in magnitude than the infinitely precise result.

So we get under-utilized nodes that will never take part in balancing at all. All clusters with a DN count > 3 and < 10 will start balancing and do nothing because of the action in the ContainerBalancerTask#adaptWhenNearingIterationLimits method.

@siddhantsangwan
Contributor

Ah, I understand. Yes, I've seen this happen in some small clusters. The recommendation is to increase the value of datanodes.involved.max.percentage.per.iteration accordingly. For example, it can be set to 100 for clusters of 15 Datanodes or less so that all Datanodes may be involved in balancing. Do you have any reason not to do this and make a code change instead? It doesn't make sense to have a configuration datanodes.involved.max.percentage.per.iteration which imposes a limit, and then have another configuration adapt.balance.when.reach.the.limit which effectively disables the former limit. Why not just change datanodes.involved.max.percentage.per.iteration?

@Montura
Contributor Author

Montura commented Jan 9, 2024

Ok, that makes sense.

Let me rewrite the tests to verify the desired behavior by increasing the value of datanodes.involved.max.percentage.per.iteration. And I'll revert the changes to the properties in hdds.scm.container.balancer.ContainerBalancerConfiguration.

What do you think?

UPD: I've updated the PR to use the datanodes.involved.max.percentage.per.iteration property in the containerBalancerShouldObeyMaxDatanodesToInvolveLimit test

@Montura Montura force-pushed the amikhalev/datanode_limits branch 2 times, most recently from 0332673 to 0239124 on January 9, 2024 12:12
@Montura Montura force-pushed the amikhalev/datanode_limits branch from 1c19524 to fdb7ed6 on April 9, 2024 13:21
@Montura Montura force-pushed the amikhalev/datanode_limits branch from fdb7ed6 to 89bd705 on April 9, 2024 13:24
@Montura Montura force-pushed the amikhalev/datanode_limits branch from 492863c to 1f126d1 on April 10, 2024 10:29
@Montura
Contributor Author

Montura commented Apr 10, 2024

@siddhantsangwan, I applied all your suggestions. Please take a look at the PR once again

@adoroszlai adoroszlai changed the title from "HDDS-9889. Refatoring tests related to dynamical adaptation for datanode limits in ContainerBalancer" to "HDDS-9889. Refactor tests related to dynamical adaptation for datanode limits in ContainerBalancer" on Apr 10, 2024
@Montura Montura requested a review from siddhantsangwan April 22, 2024 14:19
Contributor

@siddhantsangwan siddhantsangwan left a comment

@Montura Thanks for the update. LGTM.

@siddhantsangwan
Contributor

@Montura can you push an empty commit with no changes, and the commit message saying "Trigger CI"?
Github is showing me:

Unable to re-run one or more workflows. Check if the workflows are already running, are more than 30 days old, or are disabled.

for 1 workflow. @adoroszlai any idea?

@Montura
Contributor Author

Montura commented Apr 30, 2024

@Montura can you push an empty commit with no changes, and the commit message saying "Trigger CI"? Github is showing me:

Unable to re-run one or more workflows. Check if the workflows are already running, are more than 30 days old, or are disabled.

for 1 workflow. @adoroszlai any idea?

Empty commit is disabled, some change is required. Let's wait for @adoroszlai

@adoroszlai
Contributor

Empty commit is disabled

git commit --allow-empty
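
For example, to create the empty commit with the message requested above and push it:

git commit --allow-empty -m "Trigger CI"
git push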

@Montura
Contributor Author

Montura commented May 1, 2024

Empty commit is disabled

git commit --allow-empty

Done

@Montura
Contributor Author

Montura commented May 1, 2024

Merge please

@adoroszlai adoroszlai merged commit 78a7e7a into apache:master May 1, 2024
@adoroszlai
Contributor

Thanks @Montura for continued efforts on this. Thanks @siddhantsangwan for the review.

@Montura Montura deleted the amikhalev/datanode_limits branch May 6, 2024 06:30
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
…e limits in ContainerBalancer (apache#5758)

(cherry picked from commit 78a7e7a)
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 16, 2024
…e limits in ContainerBalancer (apache#5758)

(cherry picked from commit 78a7e7a)
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 18, 2024
…e limits in ContainerBalancer (apache#5758)

(cherry picked from commit 78a7e7a)
vtutrinov pushed a commit to vtutrinov/ozone that referenced this pull request Jul 15, 2025
…e limits in ContainerBalancer (apache#5758)

(cherry picked from 78a7e7a)