-
Notifications
You must be signed in to change notification settings - Fork 589
HDDS-7648. Add a servername tag in UGI metrics. #4094
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mladjan-gadzic for the patch.
hadoop-hdds/framework/pom.xml
Outdated
| <properties> | ||
| <ozone-common.version>1.3.0-SNAPSHOT</ozone-common.version> | ||
| </properties> | ||
|
|
||
| <dependencies> | ||
| <dependency> | ||
| <groupId>org.apache.ozone</groupId> | ||
| <artifactId>ozone-common</artifactId> | ||
| <version>${ozone-common.version}</version> | ||
| </dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not introduce backwards dependency (hdds -> ozone). If we want to access some code currently in ozone from both, we should move it to hdds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a need for org.apache.hadoop.ozone.om.OMConfigKeys. Where exactly do you propose it should be moved to make the most sense out of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
org.apache.hadoop.ozone.om package (om part yet to be created but at same level where org.apache.hadoop.hdds.scm.ScmConfigKeys resides) from common module from hadoop-hdds (hdds-common artifact id) sounds reasonable enough, but there are quite a few moving parts together with org.apache.hadoop.ozone.om.OMConfigKeys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adoroszlai Thanks for looking into this. I don't think there is a package where we can access both OMConfigKeys and ScmConfigKeys but we might be able to simplify the switch case that requires both of them. We present prometheus metrics only for SCM and OM. We can only check the ScmConfigKeys and if it doesn't match then we can assume the server is OM without checking OMConfigKeys.
This might not be the best approach. We have the hostname available. There might be a way to use that to get the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adoroszlai @xBis7 Thanks for the suggestions! Check for ScmConfigKeys is enough right now - it might not be the best future proof solution but it is still better than moving tons of classes just in order to use OMConfigKeys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a simple, more generic solution: PrometheusMetricsSink is created in BaseHttpServer, which has service-specific subclasses. So define a method in BaseHttpServer that returns the info you want to pass to PrometheusMetricsSink, then override in OzoneManagerHttpServer, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adoroszlai This might be exactly what I was looking for. Thanks for suggestion, will try to expand on it!
|
@xBis7 please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mladjan-gadzic Thanks for working on this. I left a few comments inline.
This code here should also be moved under PrometheusMetricsSinkUtil. PrometheusMetricsSink should call PrometheusMetricsSinkUtil methods and these methods in return should call any other methods that belong to DecayRpcSchedulerUtil, UgiMetricsUtil etc.
I'm proposing this refactoring in order to facilitate creating similar classes that might edit other metrics. We are currently editing DecayRpcScheduler and UGI metrics. There might be a future need to make similar changes to other metrics as well.
| * Util class for | ||
| * {@link org.apache.hadoop.hdds.server.http.PrometheusMetricsSink}. | ||
| */ | ||
| public final class PrometheusMetricsSinkUtil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class should be more generic and agnostic to any metrics specific operations. It would be best to have another final class called UgiMetricsUtil similar to DecayRpcSchedulerUtil and move the relevant methods under it.
Also, if it turns out that we have a dependency issue, it will be best to have the code that causes the issue in a new class and maybe move that class under another package.
| */ | ||
| public final class PrometheusMetricsSinkUtil { | ||
|
|
||
| private static final String UGI_METRICS = "ugi_metrics"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under UgiMetricsUtil
| public final class PrometheusMetricsSinkUtil { | ||
|
|
||
| private static final String UGI_METRICS = "ugi_metrics"; | ||
| private static final String SERVER_NAME = "unknown"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under UgiMetricsUtil
| * @param key metrics entry key | ||
| * @param metricTags list of metric tags | ||
| */ | ||
| private static void addServerNameTag(String key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under UgiMetricsUtil
| * @param username caller username | ||
| * @param metricTags list of metric tags | ||
| */ | ||
| private static void addUsernameTag(String username, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under DecayRpcSchedulerUtil
| * @param serverPort server port | ||
| * @return <tt>metricKey</tt> with replaced servername tag value. | ||
| */ | ||
| public static String replaceServerNameTagValue(String metricEntryKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under UgiMetricsUtil
| case OMConfigKeys.OZONE_OM_HTTP_BIND_PORT_DEFAULT: | ||
| serverName = "OM"; | ||
| break; | ||
| case ScmConfigKeys.OZONE_SCM_HTTP_BIND_PORT_DEFAULT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @adoroszlai pointed out, this should be ScmConfigKeys.OZONE_SCM_HTTP_ADDRESS_KEY and from there get the port, in order to cover the case of a custom port.
xBis7
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mladjan-gadzic I left one comment inline. Apart from that LGTM. @adoroszlai Could you take another look?
| import org.junit.jupiter.api.Assertions; | ||
| import org.junit.jupiter.api.Test; | ||
|
|
||
| class UgiMetricsUtilTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be renamed to follow test naming conventions.
| class UgiMetricsUtilTest { | |
| class TestUgiMetricsUtil { |
| import org.apache.hadoop.metrics2.MetricsTag; | ||
| import org.junit.jupiter.api.Assertions; | ||
| import org.junit.jupiter.api.Test; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javadoc comment missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xBis7 Thank you!
# Conflicts: # hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/server/http/PrometheusMetricsSink.java # hadoop-hdds/framework/src/test/java/org/apache/hadoop/hdds/server/http/TestPrometheusMetricsSink.java
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mladjan-gadzic for updating the patch.
|
Thanks @xBis7 for the reviews. |
* master: (176 commits) HDDS-7726. EC: Enhance datanode reconstruction log message (apache#4155) HDDS-7739. EC: Increase the information in the RM sending command log message (apache#4153) HDDS-7652. Volume Quota not enforced during write when bucket quota is not set (apache#4124) HDDS-7628. Intermittent failure in TestOzoneContainerWithTLS (apache#4142) HDDS-7695. EC metrics related to replication commands don't add up (apache#4152) HDDS-7729. EC: ECContainerReplicaCount should handle pending delete of unhealthy replicas (apache#4146) HDDS-7738. SCM terminates when adding container to a closed pipeline (apache#4154) HDDS-7243. Remove RequestFeatureValidator from echoRPC method which supports only ValidationCondition.OLDER_CLIENT_REQUESTS (apache#4051) HDDS-7708. No check for certificate duration config scenarios. (apache#4149) HDDS-7727. EC: SCM unregistered event handler for DatanodeCommandCountUpdated (apache#4147) HDDS-7606. Add SCM HA support in intellij run (apache#4058) HDDS-7666. EC: Unrecoverable EC containers with some remaining replicas may block decommissioning (apache#4118) HDDS-7339. Implement Certificate renewal task for services (apache#3982) HDDS-7696. MisReplicationHandler does not consider QUASI_CLOSED replicas as sources (apache#4144) HDDS-7714. Docker cluster ozone-om-ha fails during docker-compose up (apache#4137) HDDS-7716. Log read requests rejected with permission denied in OM audit (apache#4136) HDDS-7588. Intermittent failure in TestObjectStoreWithLegacyFS#testFlatKeyStructureWithOBS (apache#4040) HDDS-7633. Compile error with Java 11: package com.sun.jmx.mbeanserver is not visible (apache#4077) HDDS-7648. Add a servername tag in UGI metrics. (apache#4094) HDDS-7564. Update Ozone version after 1.3.0 release (apache#4115) ...
What changes were proposed in this pull request?
Add servername tag for UGI metrics.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7648
How was this patch tested?