-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-6560. add general Grafana dashboard #3285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
hadoop-ozone/dist/src/main/compose/common/grafana/dashboards/Ozone - Overall Metrics.json
Outdated
Show resolved
Hide resolved
| "type": "prometheus" | ||
| }, | ||
| "exemplar": true, | ||
| "expr": "avg(container_cache_metrics_db_open_latency_avg_time{component=\"datanode\"})", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads wrong, why do we need an avg for a metric reported as an avg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The raw thought is to make the overall graph more straightforward. As there are a lot of DN DB instances in a normal cluster, we could detect the overall trend instead of each DB. Besides that here I put two metrics together with two Y-axises.

Otherwise, it would be hard to distinguish the overall trend with too many instances.

(Even users could add more variables to control the number of exhibiting instances)
…zone - Overall Metrics.json Co-authored-by: Ritesh H Shukla <[email protected]>
|
LGTM |
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Xushaohong for the contribution.
|
Thanks @kerneltime for the review. |
* master: (96 commits) HDDS-6738. Migrate tests with rules in hdds-server-framework to JUnit5 (apache#3415) HDDS-6650. S3MultipartUpload support update bucket usedNamespace. (apache#3404) HDDS-6491. Support FSO keys in getExpiredOpenKeys (apache#3226) HDDS-6596. EC: Support ListBlock from CoordinatorDN (apache#3410) HDDS-6737. Migrate parameterized tests in hdds-server-framework to JUnit5 (apache#3414) HDDS-6660: EC: Add the DN side Reconstruction Handler class. (apache#3399) HDDS-6750. Migrate simple tests in hdds-server-scm to JUnit5 (apache#3417) HDDS-6749. SCM includes itself as peer in addSCM request (apache#3413) HDDS-6657. Improve Ozone integrated Ranger configuration instructions (apache#3365) HDDS-6742. Audit operation category mismatch (apache#3407) HDDS-6748. Intermittent timeout in TestECBlockReconstructedInputStream#testReadDataWithUnbuffer (apache#3416) HDDS-6731. Migrate simple tests in hdds-server-framework to JUnit5 (apache#3412) HDDS-5919. In kubernetes OM HA has circular dependency on service availability (apache#3185) HDDS-6730. Migrate tests in hdds-tools to JUnit5 (apache#3402) HDDS-6630. Explicitly remove node after being chosen (apache#3332) HDDS-6560. Add general Grafana dashboard (apache#3285) HDDS-6704. EC: ReplicationManager - create version of ContainerReplicaCounts applicable to EC (apache#3405) HDDS-6680. Pre-Finalize behaviour for Bucket Layout Feature. (apache#3377) HDDS-6619. Add freon command to run r/w mix workload using ObjectStore APIs (apache#3383) HDDS-6734. ozone admin pipeline list CLI is not backward compatible (apache#3406) ... Conflicts: hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStore.java hadoop-hdds/interface-server/src/main/proto/SCMRatisProtocol.proto hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMDBDefinition.java hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/metadata/SCMMetadataStoreImpl.java hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/StorageContainerManager.java
What changes were proposed in this pull request?
New Grafana dashboard template.
Support more Envs. Need to have the
componentlabel.More dimensions of metrics.
Support HA situation.

What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6560
How was this patch tested?
manual test