HDDS-6070. ContainerBalancerConfig doesn't read config from ozone-site.xml #2893
|
@siddhantsangwan Could you help to review this PR? |
|
@symious Thanks for finding this bug. Looking into it. |
|
@siddhantsangwan Thank you for the review. |
| long size = (long) ozoneConfiguration.getStorageSize( | ||
| ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE, | ||
| ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT, StorageUnit.GB) + | ||
| ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT, StorageUnit.BYTES) + |
@siddhantsangwan Another issue is the StorageUnit here.
With StorageUnit.GB the resulting value is "5 + 1GB", so the unit needs to be changed to StorageUnit.BYTES.
Yes, I was just going to raise a jira for this particular bug when I saw that you've pushed an update. Thanks!
Seems the other configs, like "move.timeout" and "balancing.interval", also have the same issue.
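The unit mismatch discussed in this thread can be illustrated with plain arithmetic. This is only a sketch: `ContainerSizeUnits` and its methods are hypothetical stand-ins, the 5 GB container size is assumed from the "5 + 1GB" remark above, and no Ozone API is called.

```java
final class ContainerSizeUnits {
    // Stand-in for a "one gigabyte in bytes" constant (assumed to be 1024^3).
    static final long GB = 1024L * 1024 * 1024;

    // Mimics reading a 5GB setting with the unit set to gigabytes.
    static double sizeInGb() {
        return 5.0;
    }

    // Mimics reading the same setting with the unit set to bytes.
    static double sizeInBytes() {
        return 5.0 * GB;
    }

    // Buggy variant: a gigabyte count is added to a byte count.
    static long mixedUnits() {
        return (long) sizeInGb() + GB;
    }

    // Fixed variant: both operands are byte counts.
    static long consistentUnits() {
        return (long) sizeInBytes() + GB;
    }

    public static void main(String[] args) {
        System.out.println(mixedUnits());      // 1073741829
        System.out.println(consistentUnits()); // 6442450944
    }
}
```

The buggy variant yields a value only 5 bytes larger than 1 GB, which is why a later comparison against a 5 GB container size would misbehave.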
|
I have one initial observation that could be a problem. The pattern of results in overriding the values of |
|
@siddhantsangwan Thanks for the review. Yes, and in fact the other initialize method seems a little odd. I added the method because the original ContainerBalancerConfiguration is constructed with OzoneConfiguration, which overrides those configurations on purpose. I think the root problem is that some values of ContainerBalancerConfiguration depend on other configurations (for example, "size.entering.target.max" depends on "OZONE_SCM_CONTAINER_SIZE_DEFAULT"), but the current ConfigGroup annotation doesn't support this requirement. |
JacksonYao287
left a comment
@symious Thanks for the work, the changes look good!
I left a comment, please take a look.
| this.config = ozoneConfiguration. | ||
| getObject(ContainerBalancerConfiguration.class); | ||
| config.initialize(ozoneConfiguration); |
Maybe we can add a static build function to the ContainerBalancerConfiguration class that builds and initializes a ContainerBalancerConfiguration instance, for example:
public static ContainerBalancerConfiguration buildFrom(OzoneConfiguration oc) {
.....
}
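A minimal sketch of the static-factory idea, under stated assumptions: `Conf` and `BalancerConfig` are hypothetical stand-ins for OzoneConfiguration and ContainerBalancerConfiguration, and the body of `initialize` is a placeholder, since the suggestion above leaves it elided. Only the key name and its "0.1" default are taken from the PR description.

```java
import java.util.HashMap;
import java.util.Map;

final class BalancerConfigFactorySketch {
    /** Hypothetical stand-in for OzoneConfiguration: a plain key/value store. */
    static final class Conf {
        private final Map<String, String> values = new HashMap<>();

        void set(String key, String value) {
            values.put(key, value);
        }

        String get(String key, String defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }
    }

    /** Hypothetical stand-in for ContainerBalancerConfiguration. */
    static final class BalancerConfig {
        private double threshold;
        private boolean initialized;

        // Private constructor: instances can only be obtained through buildFrom.
        private BalancerConfig() { }

        // Builds and initializes in one step, so no caller can forget initialize().
        static BalancerConfig buildFrom(Conf conf) {
            BalancerConfig config = new BalancerConfig();
            config.threshold = Double.parseDouble(
                conf.get("hdds.container.balancer.utilization.threshold", "0.1"));
            config.initialize(conf);
            return config;
        }

        // Placeholder: values that depend on other settings would be derived here.
        private void initialize(Conf conf) {
            initialized = true;
        }

        double getThreshold() {
            return threshold;
        }

        boolean isInitialized() {
            return initialized;
        }
    }
}
```

The point of the factory is that construction and initialization cannot be separated by the caller, unlike the two-step `getObject(...)` followed by `initialize(...)` shown in the diff.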
Yeah. I can think of the following solutions:
This seems similar to what @JacksonYao287 is suggesting.
|
|
@siddhantsangwan @JacksonYao287 Thanks for the reply. I think Option 1 is the original implementation. If we initialize ContainerBalancerConfiguration based on OzoneConfiguration, the @ConfigGroup pattern won't work; that is, the content of ozone-site.xml won't be projected onto ContainerBalancerConfiguration. AFAIK, we need to use ozoneConfiguration.getObject(ContainerBalancerConfiguration.class) to load the values from ozone-site.xml. I think Option 3 might be a good choice. We can let ContainerBalancerConfiguration handle only the configs from ozone-site.xml; validation and checking of the configs can be handled when starting the container balancer service. |
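To make the point about annotation-driven loading concrete, here is a rough sketch of how such a pattern can work in general. It is not Ozone's implementation: the `Group` and `Key` annotations, the `getObject` helper, and the `Map` used in place of ozone-site.xml are all hypothetical. It shows why values reach the object only when it is created through the injecting method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;

final class ConfigInjectionSketch {
    /** Hypothetical class-level annotation carrying the key prefix. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Group {
        String prefix();
    }

    /** Hypothetical field-level annotation carrying the key suffix and default. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Key {
        String key();
        String defaultValue();
    }

    @Group(prefix = "hdds.container.balancer")
    static final class BalancerConfig {
        @Key(key = "utilization.threshold", defaultValue = "0.1")
        private String threshold;

        BalancerConfig() { }

        double getThreshold() {
            return Double.parseDouble(threshold);
        }
    }

    // Creates an instance and fills every annotated field from the site values,
    // falling back to the annotation's default when the key is absent.
    static <T> T getObject(Class<T> type, Map<String, String> site) {
        try {
            T instance = type.getDeclaredConstructor().newInstance();
            String prefix = type.getAnnotation(Group.class).prefix();
            for (Field field : type.getDeclaredFields()) {
                Key key = field.getAnnotation(Key.class);
                if (key == null) {
                    continue;
                }
                field.setAccessible(true);
                field.set(instance,
                    site.getOrDefault(prefix + "." + key.key(), key.defaultValue()));
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot build " + type.getName(), e);
        }
    }
}
```

An object created with a plain constructor call would skip the injection loop entirely, which matches the symptom described in this PR: the threshold stays at its default regardless of the file contents.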
|
Thanks @symious for working on this! I would like to add one more option here. |
I agree, it's the simplest. The default values could be:
Some very related changes are being made in #2892 |
|
@lokeshj1703 Thanks for the review.
The current implementation is inherited from ConfigurationSource as follows. If we are going to override this method, it may look like "first try configurationClass.getDeclaredConstructor(OzoneConfiguration.class).newInstance(ozoneConfiguration), and if that constructor does not exist, then try configurationClass.newInstance()". Personally, I think that implementation would look a little odd. |
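The constructor fallback described in the comment above could look roughly like the following. This is a sketch using only `java.lang.reflect`; `Conf`, `NeedsConf`, `NoArgs`, and `create` are hypothetical names, and the real `getObject` in ConfigurationSource is not reproduced here.

```java
import java.lang.reflect.Constructor;

final class ConstructorFallbackSketch {
    /** Hypothetical stand-in for OzoneConfiguration. */
    static final class Conf { }

    /** A config class that wants the configuration passed to its constructor. */
    static final class NeedsConf {
        final Conf conf;

        NeedsConf(Conf conf) {
            this.conf = conf;
        }
    }

    /** A config class with only a no-argument constructor. */
    static final class NoArgs {
        NoArgs() { }
    }

    // First try a constructor taking Conf; if none is declared, fall back to
    // the no-argument constructor.
    static <T> T create(Class<T> type, Conf conf) {
        try {
            Constructor<T> withConf;
            try {
                withConf = type.getDeclaredConstructor(Conf.class);
            } catch (NoSuchMethodException e) {
                Constructor<T> noArgs = type.getDeclaredConstructor();
                noArgs.setAccessible(true);
                return noArgs.newInstance();
            }
            withConf.setAccessible(true);
            return withConf.newInstance(conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}
```

The nested try/catch is the part that reads as "a little odd": control flow depends on a missing constructor rather than on an explicit contract.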
|
@siddhantsangwan Sure, will update the default value in the next commit. |
|
@siddhantsangwan Updated the commit. Could you help to check? I added validateConfiguration in ContainerBalancer. It could return a boolean to indicate an incorrect configuration, but that would require many changes in test cases, so for now I have left the method void. |
siddhantsangwan
left a comment
These changes look great @symious . I have a few minor comments.
| long size = (long) ozoneConfiguration.getStorageSize( | ||
| ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE, | ||
| ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT, StorageUnit.BYTES) + | ||
| OzoneConsts.GB; |
We don't need to add another GB here. The configs just need to be greater than size.
| ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT, StorageUnit.BYTES) + | ||
| OzoneConsts.GB; | ||
|
|
||
| if (conf.getMaxSizeEnteringTarget() < size) { |
This should be the check if another GB isn't added to size (as commented above)
| if (conf.getMaxSizeEnteringTarget() < size) { | |
| if (conf.getMaxSizeEnteringTarget() <= size) { |
| LOG.info("MaxSizeEnteringTarget should be larger than " + | ||
| "ozone.scm.container.size"); | ||
| } | ||
| if (conf.getMaxSizeLeavingSource() < size) { |
Same as above
| if (conf.getMaxSizeLeavingSource() < size) { | |
| if (conf.getMaxSizeLeavingSource() <= size) { |
| // balancing interval should be greater than DUFactory refresh period | ||
| DUFactory.Conf duConf = ozoneConfiguration.getObject(DUFactory.Conf.class); | ||
| long balancingInterval = duConf.getRefreshPeriod().toMillis() + | ||
| Duration.ofMinutes(10).toMillis(); |
Again, don't need to add extra 10 minutes since the config only needs to be greater than DU refresh period.
| LOG.info("balancing.iteration.interval should be at lease 10 minutes " + | ||
| "larger than hdds.datanode.du.refresh.period."); |
| LOG.info("balancing.iteration.interval should be at lease 10 minutes " + | |
| "larger than hdds.datanode.du.refresh.period."); | |
| LOG.info("balancing.iteration.interval should be " + | |
| "larger than hdds.datanode.du.refresh.period."); |
| " by default.") | ||
| private String excludeNodes = ""; | ||
|
|
||
| private DUFactory.Conf duConf; |
Since this isn't getting initialised now, we can remove this and its usages in the setBalancingInterval method.
Yes, having it return a boolean makes sense. And sure, we can add this change in a follow-up PR. |
|
@siddhantsangwan Updated the patch, please have a check.
Sure, will create a new ticket for this change. |
siddhantsangwan
left a comment
The update looks good to me.
|
@lokeshj1703 Can you please take another look? |
JacksonYao287
left a comment
Thanks @symious for updating this patch. I have two comments, please take a look.
| Assert.assertEquals(cbConf.getThreshold(), 0.01d, DELTA); | ||
|
|
Can we remove DELTA and use assertTrue(Double.compare(a, b) == 0)?
I have removed DELTA and used a comparison similar to
Line 76 in 8b4d4a9
| assertEquals(beforeCapacity.getStandardDeviation(), beforeRandom |
Please have a check.
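For context on the two comparison styles mentioned in this thread, here is a small sketch in plain Java (no JUnit). `DoubleComparisonSketch` and its helpers are hypothetical names. An exact comparison is safe when both sides come from the same decimal literal, as with a threshold parsed from a config string, but not when one side is the result of floating-point arithmetic.

```java
final class DoubleComparisonSketch {
    // Exact comparison, equivalent to the Double.compare(a, b) == 0 suggestion.
    static boolean exactlyEqual(double a, double b) {
        return Double.compare(a, b) == 0;
    }

    // Tolerance-based comparison, equivalent in spirit to assertEquals with a delta.
    static boolean withinDelta(double a, double b, double delta) {
        return Math.abs(a - b) <= delta;
    }

    public static void main(String[] args) {
        // A value parsed from text matches the same literal exactly.
        System.out.println(exactlyEqual(Double.parseDouble("0.01"), 0.01)); // true
        // A computed value can differ in the last bits.
        System.out.println(exactlyEqual(0.1 + 0.2, 0.3));                   // false
        System.out.println(withinDelta(0.1 + 0.2, 0.3, 1e-9));              // true
    }
}
```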
| if (conf.getMaxSizeEnteringTarget() <= size) { | ||
| LOG.info("MaxSizeEnteringTarget should be larger than " + | ||
| "ozone.scm.container.size"); | ||
| } | ||
| if (conf.getMaxSizeLeavingSource() <= size) { | ||
| LOG.info("MaxSizeLeavingSource should be larger than " + | ||
| "ozone.scm.container.size"); | ||
| } | ||
|
|
||
| // balancing interval should be greater than DUFactory refresh period | ||
| DUFactory.Conf duConf = ozoneConfiguration.getObject(DUFactory.Conf.class); | ||
| long balancingInterval = duConf.getRefreshPeriod().toMillis(); | ||
| if (conf.getBalancingInterval().toMillis() <= balancingInterval) { | ||
| LOG.info("balancing.iteration.interval should be larger than " + | ||
| "hdds.datanode.du.refresh.period."); | ||
| } |
When validating, if we find that the configuration is illegal (for example, conf.getMaxSizeEnteringTarget() <= size), should we throw an Exception and just return without starting the balancer?
Yes, it was intended to return "false" for these situations, but it will break many test cases.
The changes will be brought up in a new ticket.
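The follow-up proposed here, a validation method that reports failure instead of only logging, might be sketched as below. This is an assumption about the eventual shape, not the merged code: `BalancerValidationSketch.validate` and its parameters are hypothetical, and the three checks simply mirror the comparisons in the diff above.

```java
import java.time.Duration;

final class BalancerValidationSketch {
    // Returns false if any setting violates its constraint, so that a caller
    // can refuse to start the balancer. All sizes are in bytes.
    static boolean validate(long maxSizeEnteringTarget,
                            long maxSizeLeavingSource,
                            long containerSize,
                            Duration balancingInterval,
                            Duration duRefreshPeriod) {
        boolean valid = true;
        if (maxSizeEnteringTarget <= containerSize) {
            System.err.println("MaxSizeEnteringTarget should be larger than "
                + "ozone.scm.container.size");
            valid = false;
        }
        if (maxSizeLeavingSource <= containerSize) {
            System.err.println("MaxSizeLeavingSource should be larger than "
                + "ozone.scm.container.size");
            valid = false;
        }
        if (balancingInterval.toMillis() <= duRefreshPeriod.toMillis()) {
            System.err.println("balancing.iteration.interval should be larger than "
                + "hdds.datanode.du.refresh.period.");
            valid = false;
        }
        return valid;
    }
}
```

Collecting all violations before returning, rather than failing on the first one, lets an operator fix every misconfigured value in one pass.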
|
Will merge tomorrow if there are no other comments. |
|
@symious Thanks for your contribution. @siddhantsangwan @lokeshj1703 @JacksonYao287 Thanks for your review! Merged |
What changes were proposed in this pull request?
The issue occurs when we try to update the config "hdds.container.balancer.utilization.threshold": the SCM log always reports "0.1", which is the default value.
After checking the code, the construction of ContainerBalancerConfiguration doesn't comply with the "ConfigGroup" pattern. This ticket fixes the issue so that the configuration is read from the config file.
Since the class needs to hold an OzoneConfiguration for configuration checks, an "initialize" method is added for this purpose.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6070
How was this patch tested?
unit test