HDDS-6244. ContainerBalancer metrics don't show updated values in JMX #3049
@lokeshj1703 @JacksonYao287 please take a look!
lokeshj1703
left a comment
@siddhantsangwan Thanks for working on this! The changes look good to me. I have a few comments inline.
...cm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerMetrics.java
...cm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerMetrics.java
...r-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancer.java
# Conflicts: # hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancer.java
@lokeshj1703 I've replaced the current metrics and included more testing.
JacksonYao287
left a comment
Thanks @siddhantsangwan for the work! The change looks good.
I also suggest adding aggregate metrics (total containers and total data size), but this can be implemented in a separate Jira.
lokeshj1703
left a comment
Thanks @siddhantsangwan for updating the PR! I have some minor comments.
...cm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerMetrics.java
...r-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancer.java
...r-scm/src/test/java/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancer.java
Thanks for reviewing @lokeshj1703 @JacksonYao287
lokeshj1703
left a comment
Thanks for updating the PR! The changes look good to me. +1.
I have one minor comment; let me know whether you would like to address it in this PR or in a new one.
    private MutableCounterLong dataSizeUnbalancedGB;

    @Metric(about = "Number of unbalanced datanodes.")
    private MutableCounterLong datanodesNumUnbalanced;
I would also recommend renaming the other metrics, for example datanodesNum -> numDatanodes and movedContainersNum -> numMovedContainers.
We can do it in a separate PR as well.
Oh, I remember we had decided on this naming convention in a previous PR: #2230 (comment)
But this does conflict with numIterations where "num" is the prefix.
I see. My idea is that these metrics should be easy to search for in Prometheus. Maybe we need a common prefix for the balancer metrics, then. Can we check a naming guide here?
I looked into some examples. We've used "num" as a prefix in other places, such as numCacheHits, numCacheMisses, numInFlightReplications, numWriteStateMachineOps etc.
@lokeshj1703 I've fixed the naming.
@siddhantsangwan Thanks for the contribution! @JacksonYao287 Thanks for the reviews! I have committed the PR to the master branch.
What changes were proposed in this pull request?
The ContainerBalancerMetrics class is responsible for recording metrics. These metrics always show 0 when accessed through JMX, while expected values are greater than 0.
I deleted some metrics of the type MutableGaugeLong and instead used the type MutableCounterLong. Introduced a new metric countIterations that keeps count of the number of iterations that balancer has run for.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6244
How was this patch tested?
Updated TestContainerBalancer#testMetrics()
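For readers who want to see the general idea behind counter-style metrics that stay visible through JMX, here is a minimal sketch using only the JDK's own javax.management classes. It is not the Hadoop metrics2 API and not the code in this PR: the class BalancerCounterSketch, the interface BalancerCountersView, the ObjectName, and the increment amounts are all hypothetical names and values chosen for illustration. It only shows counters being incremented over two simulated iterations and then read back through the platform MBean server.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class BalancerCounterSketch {

  /** Management interface: JMX exposes each getter as a read-only attribute. */
  public interface BalancerCountersView {
    long getNumIterations();
    long getNumDatanodesUnbalanced();
  }

  /** Hypothetical counter holder; values are only ever incremented. */
  public static final class BalancerCounters implements BalancerCountersView {
    private final AtomicLong numIterations = new AtomicLong();
    private final AtomicLong numDatanodesUnbalanced = new AtomicLong();

    public void incrementNumIterations(long delta) {
      numIterations.addAndGet(delta);
    }

    public void incrementNumDatanodesUnbalanced(long delta) {
      numDatanodesUnbalanced.addAndGet(delta);
    }

    @Override
    public long getNumIterations() {
      return numIterations.get();
    }

    @Override
    public long getNumDatanodesUnbalanced() {
      return numDatanodesUnbalanced.get();
    }
  }

  /**
   * Registers the counters, simulates two balancer iterations, and returns
   * {NumIterations, NumDatanodesUnbalanced} as read back through JMX.
   */
  public static long[] runTwoIterationsAndReadViaJmx() {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    BalancerCounters counters = new BalancerCounters();
    try {
      // Assumed, illustrative object name; not the name Ozone registers.
      ObjectName name = new ObjectName("sketch.balancer:type=BalancerCounters");
      if (server.isRegistered(name)) {
        server.unregisterMBean(name);
      }
      server.registerMBean(new StandardMBean(counters, BalancerCountersView.class), name);
      try {
        // Iteration 1 (made-up numbers).
        counters.incrementNumIterations(1);
        counters.incrementNumDatanodesUnbalanced(3);
        // Iteration 2 (made-up numbers).
        counters.incrementNumIterations(1);
        counters.incrementNumDatanodesUnbalanced(2);

        // Read the attributes the way a JMX client would.
        long iterations = (Long) server.getAttribute(name, "NumIterations");
        long unbalanced = (Long) server.getAttribute(name, "NumDatanodesUnbalanced");
        return new long[] {iterations, unbalanced};
      } finally {
        server.unregisterMBean(name);
      }
    } catch (Exception e) {
      throw new IllegalStateException("JMX round trip failed", e);
    }
  }

  public static void main(String[] args) {
    long[] seen = runTwoIterationsAndReadViaJmx();
    System.out.println("NumIterations=" + seen[0] + ", NumDatanodesUnbalanced=" + seen[1]);
  }
}
```

A test in the spirit of testMetrics() could assert on the values read back through the MBean server rather than on the in-memory fields, which is the path a JMX client actually exercises.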