-
Notifications
You must be signed in to change notification settings - Fork 593
HDDS-5602. make it configurable to choose the nearest one as the target in the candidates according to networkTopology #2756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
603571d to
d701b2a
Compare
|
@lokeshj1703 @siddhantsangwan PTAL, thanks! |
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for working on this! The changes look good to me.
Can we add a UT if it is not a lot of effort?
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/FindTargetGreedy.java
Outdated
Show resolved
Hide resolved
siddhantsangwan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 It might be better to create another implementation of FindTargetStrategy that sorts potential targets by distance. FindTargetGreedy is working on a list of potential targets sorted from least utilized to most utilized.
|
@siddhantsangwan thanks for the review! what about adding a boolean parameter to |
d701b2a to
7235e97
Compare
|
@siddhantsangwan @lokeshj1703 i have refactored the code , please take a look! |
|
I would recommend using an Interface. @JacksonYao287 @siddhantsangwan Let's first finalise how the interface should look like and how to integrate it with balancer. Please see if we would like to use existing selection criteria/strategy. |
7235e97 to
86a6b33
Compare
86a6b33 to
b517b7e
Compare
1204cda to
8a595c7
Compare
|
@lokeshj1703 @siddhantsangwan , i have refactored the code again in the latest commit , please take a look! |
335c927 to
babfd61
Compare
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for working on this! I think it is better to have a separated ContainerBalancerSelectionCriteria as it gives flexibility later to account both source and target limitations in the criteria. With the current changes it will become part of FindSourceStrategy itself.
...rver-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/FindSourceStrategy.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR. Since we are doing a poll here, the source datanode will not be considered again in the iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please check if this is an issue? We can create a new jira to address it if it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not an issue. this happens when i tried to refactor the findTargetStrategy and remove ContainerBalancerSelectionCriteria. i have canceled this refactor, so getNextCandidateSourceDataNode is public now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that we are doing a poll in potentialSources. As a result source datanode will be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this source is added back here, and that method is called here. It seems correct but the logic is a bit confusing (#2808 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private void incSizeSelectedForMoving(DatanodeDetails source,
ContainerMoveSelection moveSelection) {
````````
// update sizeLeavingNode map with the recent moveSelection
findSourceStrategy.increaseSizeLeaving(source, size);
// update sizeEnteringNode map with the recent moveSelection
findTargetStrategy.increaseSizeEntering(target, size);
}
FindSourceGreedy#increaseSizeLeaving
public void increaseSizeLeaving(DatanodeDetails dui, long size) {
Long currentSize = sizeLeavingNode.get(dui);
if(currentSize != null) {
sizeLeavingNode.put(dui, currentSize + size);
//reorder according to the latest sizeLeavingNode
potentialSources.add(nodeManager.getUsageInfo(dui));
return;
}
LOG.warn("Cannot find datanode {} in candidate source datanodes",
dui.getUuid());
}
if we can find a target and a container for this source datanode, then incSizeSelectedForMoving will be definitely called , so this source data node will be added back to that priority queue and re-sort again.
if we can not find any target and container for this source datanode , what we should do is just removing it.
so i don`t think this is an issue. does this explanation make sense to you?
|
thanks @lokeshj1703 very much for the review!
for now , the only role of ContainerBalancerSelectionCriteria is to get Candidate Containers. but i agree with you. may be later in the future, we need to do some flexible work in ContainerBalancerSelectionCriteria. so let us keep it for now and i will cancel this refactor. so in this patch , i will only do the network topology related work. so i want to hear your opinion, should we sorting candidate targets by network first, or make it configurable? |
|
I think configurable is better. It would be preferable to choose most under utilised nodes first in cases like addition of new nodes or highly unbalanced cluster. |
4ec207a to
ebf730f
Compare
|
@lokeshj1703 i have updated this patch according to the comments , PTAL! |
80b3eec to
234d764
Compare
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for working on this! The changes look good to me. Please find my comments below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting would happen every time function is called. I think we can optimise for same source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, will do this optimization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after a deep thought, i think we have to sort potentialTargets every time , even for same source. after a certain target is selected and a move option is scheduled to it, sizeEntering of it will increase, and thus the utilization will increase. so when choosing a target for even the same source, if two candidate target has the same network topology distance to the source , the priority may change according to current usageinfo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove the target and re-add it I think. But we can do this in a separate PR.
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/FindTargetGreedy.java
Outdated
Show resolved
Hide resolved
|
@lokeshj1703 thanks for the review! i have update this patch according to you comments , PTAL! |
…et in the candidates according to networkTopology
6b928e1 to
e48a50e
Compare
|
@lokeshj1703 @siddhantsangwan i have added test case. could you help reviewing this patch? |
lokeshj1703
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JacksonYao287 Thanks for updating the PR! The changes look good to me. +1.
|
@JacksonYao287 Thanks for the contribution! @siddhantsangwan Thanks for review! I have committed the PR to master branch. |
|
thanks @lokeshj1703 @siddhantsangwan for the review! |
What changes were proposed in this pull request?
always choose the nearest one as the target in the candidates according to networkTopology
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-5602
How was this patch tested?
ut