@peterxcli
Member

What changes were proposed in this pull request?

The OM RocksDB dashboard shows no data in any panel, even though I have enabled RocksDB statistics.

I have to open each chart and click refresh before it shows data.
image
image
image
image

I found that the cause is an error in the frontend:
image

The dashboard then works correctly after I replace ${DS_PROMETHEUS} with "-- Grafana --".
image

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13200

How was this patch tested?

No test needed.

…UID from "${DS_PROMETHEUS}" to "-- Grafana --" for consistency.
@peterxcli peterxcli requested a review from jojochuang June 7, 2025 05:18
@peterxcli peterxcli self-assigned this Jun 7, 2025
@peterxcli peterxcli marked this pull request as draft June 7, 2025 05:30
Contributor

@jojochuang jojochuang left a comment

I'm looking at other dashboard files and it seems like -- Grafana -- is only specified once in a file. It seems like the datasource only requires type: "prometheus" and does not need uid.

@peterxcli
Member Author

> I'm looking at other dashboard files and it seems like -- Grafana -- is only specified once in a file. It seems like the datasource only requires type: "prometheus" and does not need uid.

Yeah, and it seems that if the uid is set to "-- Grafana --", some panels' datasource is wrongly set to random walk?
image

And after removing all uid fields, the source becomes correct. 👍
image
(This is a null cluster)
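
For reference, the "remove all uid fields" step described above could be scripted roughly as follows. This is a minimal sketch, not the change made in this PR: the function name `strip_datasource_uid` and the sample dashboard fragment are hypothetical, and it assumes that every datasource reference is a JSON object stored under a `"datasource"` key.

```python
import json


def strip_datasource_uid(node):
    """Recursively drop the "uid" key from every object stored under "datasource"."""
    if isinstance(node, dict):
        ds = node.get("datasource")
        if isinstance(ds, dict):
            ds.pop("uid", None)  # leaves {"type": "prometheus"} behind
        for value in node.values():
            strip_datasource_uid(value)
    elif isinstance(node, list):
        for item in node:
            strip_datasource_uid(item)
    return node


# Hypothetical dashboard fragment, shaped like the snippet under review.
raw = """
{
  "panels": [
    {
      "datasource": {"type": "prometheus", "uid": "${DS_PROMETHEUS}"},
      "targets": [
        {
          "datasource": {"type": "prometheus", "uid": "${DS_PROMETHEUS}"},
          "expr": "up"
        }
      ]
    }
  ]
}
"""
cleaned = strip_datasource_uid(json.loads(raw))
print(json.dumps(cleaned, indent=2))
```

The sketch edits the parsed structure in place and leaves every other key untouched, so the panel queries stay as they were.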

@peterxcli peterxcli requested a review from jojochuang June 7, 2025 07:58
{
  "datasource": {
    "type": "prometheus",
    "uid": "${DS_PROMETHEUS}"
Contributor

@smengcl smengcl Jun 7, 2025

Looks like DS_PROMETHEUS is supposed to be taken as input?

"__inputs": [
  {
    "name": "DS_PROMETHEUS",
    "label": "prometheus",
    "description": "",
    "type": "datasource",
    "pluginId": "prometheus",
    "pluginName": "Prometheus"
  }
],

Member Author

But I have no idea which field would be taken as the uid.

Member Author

@peterxcli peterxcli Jun 7, 2025

At least the web UI doesn't resolve or substitute it.
image
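
One way to see which fields still carry an unsubstituted `${...}` placeholder is to walk the dashboard JSON and compare each placeholder name against the names declared in `__inputs`. The sketch below is only an illustration: `find_placeholders` and the sample dashboard are hypothetical, and the pattern also matches ordinary Grafana template variables written as `${name}` inside queries, so any hit needs a manual look.

```python
import re

# Matches ${NAME} style placeholders such as ${DS_PROMETHEUS}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_placeholders(node, path="$"):
    """Yield (json_path, variable_name) for each ${NAME} inside a string value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}")
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_placeholders(item, f"{path}[{i}]")
    elif isinstance(node, str):
        for match in PLACEHOLDER.finditer(node):
            yield path, match.group(1)


# Hypothetical dashboard, reduced to the two parts discussed in this thread.
dashboard = {
    "__inputs": [
        {"name": "DS_PROMETHEUS", "type": "datasource", "pluginId": "prometheus"}
    ],
    "panels": [
        {"datasource": {"type": "prometheus", "uid": "${DS_PROMETHEUS}"}}
    ],
}

declared = {entry["name"] for entry in dashboard.get("__inputs", [])}
hits = list(find_placeholders(dashboard))
for path, name in hits:
    status = "declared in __inputs" if name in declared else "not declared"
    print(f"{path}: ${{{name}}} ({status})")
```

A placeholder that is declared in `__inputs` but still present in the file as loaded is the situation described above, where nothing substitutes the value before the panel uses it.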

Contributor

I think DS_PROMETHEUS can be removed from here too, in both Ozone - OMComittedIndexMetrics.json and Ozone - Ozone Manager RocksDB.json
(referring to Ozone - ReadKey Metrics.json).
@peterxcli @smengcl does that sound fine?

Member Author

Makes sense. Removed. Thanks for checking this!

@peterxcli peterxcli marked this pull request as ready for review June 7, 2025 08:05
Contributor

@Tejaskriya Tejaskriya left a comment

With these changes, the dashboard does seem to work without needing a refresh, but I would suggest removing all "uid" fields from the JSON files. If you check some other dashboards, they do not specify a "uid" field anywhere.
(referring to the "-- Grafana --" uid and the uid at the end of the file)
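
A quick audit for leftover "uid" fields could look like the sketch below. It is an assumption-laden illustration rather than part of this PR: `find_uid_fields` and the sample dashboard (including the value `example-uid`) are hypothetical, and the walk reports every key named `uid`, wherever it appears.

```python
def find_uid_fields(node, path="$"):
    """Yield (json_path, value) for every key named "uid" in a parsed dashboard."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "uid":
                yield child, value
            yield from find_uid_fields(value, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_uid_fields(item, f"{path}[{i}]")


# Hypothetical dashboard with the two leftovers mentioned in the review comment.
dashboard = {
    "annotations": {
        "list": [{"datasource": {"type": "grafana", "uid": "-- Grafana --"}}]
    },
    "panels": [{"datasource": {"type": "prometheus"}}],
    "title": "Example",
    "uid": "example-uid",
}

remaining = list(find_uid_fields(dashboard))
for path, value in remaining:
    print(f"{path} = {value!r}")
```

An empty result for a file would mean it matches the other dashboards mentioned in the review, which specify no "uid" field at all.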

@peterxcli
Member Author

Done.
@Tejaskriya Thanks for testing this change, please take another look 😃

@peterxcli peterxcli requested a review from Tejaskriya July 15, 2025 13:52
Contributor

@Tejaskriya Tejaskriya left a comment

Thanks for making the updates, LGTM!

@peterxcli peterxcli merged commit f8d8696 into apache:master Jul 15, 2025
13 checks passed
@peterxcli
Member Author

Merged. Thanks @Tejaskriya, @smengcl, @jojochuang for the review!

@peterxcli peterxcli deleted the fix-om-rocksdb-metrics-dashbroad-no-data branch July 15, 2025 13:57
errose28 added a commit to errose28/ozone that referenced this pull request Jul 22, 2025
* master: (90 commits)
  HDDS-13308. OM should expose Ratis config for increasing pending write limits (apache#8668)
  HDDS-8903. Add validation for ozone.om.snapshot.db.max.open.files. (apache#8787)
  HDDS-13429. Custom metadata headers with uppercase characters are not supported (apache#8805)
  HDDS-13448. DeleteBlocksCommandHandler thread stop for normal exception (apache#8816)
  HDDS-13346. Intermittent failure in TestCloseContainer#testContainerChecksumForClosedContainer (apache#8771)
  HDDS-13125. Add metrics for monitoring the SST file pruning threads. (apache#8764)
  HDDS-13367. [Docs] User doc for container balancer. (apache#8726)
  HDDS-13200. OM RocksDB Grafana Dashbroad shows no data on all panels (apache#8577)
  HDDS-13428. Recon - Retrigger of build whole NSSummary tree task submission inconsistency. (apache#8793)
  HDDS-13378. [Docs] Add a Production page under Getting Started (apache#8734)
  HDDS-13403. [Docs] Make feature proposal process more visible. (apache#8758)
  HDDS-11797. Remove cyclic dependency between SCMSafeModeManager and SafeModeRules (apache#8782)
  HDDS-13213. KeyDeletingService should limit task size by both key count and serialized size. (apache#8757)
  HDDS-13387. OMSnapshotCreateRequest logs invalid warning about DefaultReplicationConfig (apache#8760)
  HDDS-13405. ozone admin container create runs forever without kinit (apache#8765)
  HDDS-11514. Set optimal default values for delete configurations based on live cluster testing. (apache#8766)
  HDDS-13376. Add server-side limit note to ozone sh snapshot diff --page-size option (apache#8791)
  HDDS-11679. Support multiple S3Gs in MiniOzoneCluster (apache#8733)
  HDDS-13424. Use lsof instead of fuser to find if file is used in AbstractTestChunkManager (apache#8790)
  HDDS-13427. Bump awssdk to 2.31.78 (apache#8792)
  ...