Skip to content

Conversation

@xichen01
Copy link
Contributor

What changes were proposed in this pull request?

Improving EC perf by eliminating redundant config reads.

The Root cause

The conf.getObject(OzoneClientConfig.class) is an expensive operation, because the for in injectConfigurationToObject may need to be looped many times and there will be a lot of configuration parsing operations.

for (Field field : configurationClass.getDeclaredFields()) {
if (field.isAnnotationPresent(Config.class)) {
checkNotFinal(configurationClass, field);
Config configAnnotation = field.getAnnotation(Config.class);
if (reconfiguration && !configAnnotation.reconfigurable()) {
continue;
}
String key = prefix + "." + configAnnotation.key();

  • There may be other places in Ozone where there are similar performance issues, because getObject is used heavily.
  • Maybe this can be solved by caching the computed configurations

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9913

How was this patch tested?

existing test
Performance testing result
https://issues.apache.org/jira/browse/HDDS-9911

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xichen01 for the patch.

.setIsMultipartKey(true)
.enableUnsafeByteBufferConversion(unsafeByteBufferConversion)
.setConfig(conf.getObject(OzoneClientConfig.class))
.setConfig(clientConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one, and the other two similar items are creating new configs intentionally. Please see HDDS-8216.

To reduce config loading, but still keep config loading to minimum, we could cache and reuse OzoneClientConfig instance per EC chunk size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but here we are using Configuration to pass parameters, which shouldn't be the role of Configuration. If a parameter needs to be computed at runtime, it should not be directly obtained from Configuration. Therefore, I added a StreamBufferArgs to modify and pass the StreamBuffe related parameters .

IMO, Configuration should ideally not provide a direct set method. Modifications to Configuration should only occur in "test code" and during "reconfiguration". Otherwise, Configuration should be read-only.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xichen01 for updating the patch. Creating an object for buffer args is a good idea.

@adoroszlai adoroszlai changed the title HDDS-9913. EC client Reduces duplicate loading of configurations HDDS-9913. Reduce number of times configuration is loaded in Ozone client Dec 16, 2023
@adoroszlai adoroszlai merged commit c990899 into apache:master Dec 16, 2023
VarshaRaviCV pushed a commit to VarshaRaviCV/ozone that referenced this pull request Dec 18, 2023
adoroszlai pushed a commit to adoroszlai/ozone that referenced this pull request Jan 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants