@ChenSammi
Contributor

@ChenSammi ChenSammi commented Mar 31, 2023

https://issues.apache.org/jira/browse/HDDS-8325

How was this patch tested?

Manually tested the patch by checking the /jmx and /prom output of each service.

Before the patch, every RocksDB instance had two metrics groups in JMX. For example, SCM had these two metrics groups:

"name" : "Hadoop:service=StorageContainerManager,name=Rocksdb_scm.db"
"name" : "Hadoop:service=Ozone,name=RocksDbStore,dbName=scm.db" 

After the patch, SCM has a single metrics group for RocksDB:

"name" : "Hadoop:service=StorageContainerManager,name=Rocksdb_scm.db"
1. BLOB_DB-related RocksDB metrics are filtered out. Tested by searching for the keyword BLOB_DB in the /jmx and /prom output.
2. The "estimated key count" metrics in SCMMetadataStoreMetrics are consolidated.
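
The manual check described above could be scripted roughly as follows. This is only a sketch, not part of the patch: it assumes the /jmx endpoint returns JSON with a top-level "beans" array whose entries carry a "name" field, and the sample payload (including the attribute EXAMPLE_TICKER and the second bean's name) is made up for illustration.

```python
import json


def rocksdb_bean_names(jmx):
    """Return the names of all JMX beans whose name mentions RocksDB."""
    return [bean["name"] for bean in jmx.get("beans", [])
            if "rocksdb" in bean.get("name", "").lower()]


def blob_db_attributes(jmx):
    """Return (bean name, attribute) pairs whose attribute key contains BLOB_DB."""
    return [(bean.get("name", ""), key)
            for bean in jmx.get("beans", [])
            for key in bean
            if "BLOB_DB" in key.upper()]


# Hypothetical, trimmed-down sample of a /jmx response after the patch.
# EXAMPLE_TICKER and the second bean name are placeholders, not real output.
sample = json.loads("""
{"beans": [
  {"name": "Hadoop:service=StorageContainerManager,name=Rocksdb_scm.db",
   "EXAMPLE_TICKER": 10},
  {"name": "Hadoop:service=StorageContainerManager,name=ExampleOtherMetrics"}
]}
""")

print(rocksdb_bean_names(sample))   # expect a single RocksDB group for SCM
print(blob_db_attributes(sample))   # expect no BLOB_DB attributes
```

Against a running service, the same functions could be applied to the parsed body of that service's /jmx page; for /prom, a plain substring search for BLOB_DB in the response text would serve the same purpose.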

Before the patch:

image

After the patch:

image

@ChenSammi ChenSammi changed the title from "HDDS-8325. Consolidate and refine RocksDB metrics of services." to "HDDS-8325. Consolidate and refine RocksDB metrics of services" on Mar 31, 2023
@ChenSammi
Contributor Author

The findbugs issue is unrelated. The file TestOzoneBlockTokenIdentifier is not modified in this patch.

@ChenSammi
Contributor Author

Hi @symious, could you help review this patch at your convenience?

@kerneltime
Contributor

cc @tanvipenumudy can you please take a look as well?

@symious
Contributor

symious commented Apr 4, 2023

LGTM.

I see the name of the metric is "Hadoop:service=StorageContainerManager,name=Rocksdb_scm.db". Does that mean this metrics group contains only the SCM DB's metrics?

@ChenSammi
Contributor Author

ChenSammi commented Apr 4, 2023

> I see the name of the metric is "Hadoop:service=StorageContainerManager,name=Rocksdb_scm.db". Does that mean this metrics group contains only the SCM DB's metrics?

Yes, "Hadoop:service=StorageContainerManager,name=Rocksdb_scm.db" contains all exposed RocksDB metrics for the SCM DB.

Contributor

@tanvipenumudy tanvipenumudy left a comment

The changes look good to me @ChenSammi, thanks.

@adoroszlai
Contributor

> The findbugs issue is unrelated. The file TestOzoneBlockTokenIdentifier is not modified in this patch.

I don't understand why this findbugs issue is not flagged on other commits and PRs. Fixing it in #4517.

@ChenSammi ChenSammi merged commit 910eef0 into apache:master Apr 4, 2023
@ChenSammi
Contributor Author

Thanks @symious and @tanvipenumudy for the code review.

errose28 added a commit to errose28/ozone that referenced this pull request Apr 6, 2023
* master: (155 commits)
  update readme (apache#4535)
  HDDS-8374. Disable flaky unit test: TestContainerStateCounts
  HDDS-8016. updated the ozone doc for linked bucket and deletion async limitation (apache#4526)
  HDDS-8237. [Snapshot] loadDb() used by SstFiltering service creates extraneous directories. (apache#4446)
  HDDS-8035. Intermittent timeout in TestOzoneManagerHAWithData.testOMHAMetrics (apache#4362)
  HDDS-8039. Allow container inspector to run from ozone debug. (apache#4337)
  HDDS-8304. [Snapshot] Reduce flakiness in testSkipTrackingWithZeroSnapshot (apache#4487)
  HDDS-7974. [Snapshot] KeyDeletingService to be aware of Ozone snapshots (apache#4486)
  HDDS-8368. ReplicationManager: Create ContainerReplicaOp with correct target Datanode (apache#4532)
  HDDS-8358. Fix the space usage comparator in ContainerBalancerSelectionCriteria (apache#4527)
  HDDS-8359. ReplicationManager: Fix getContainerReplicationHealth() so that it builds ContainerCheckRequest correctly (apache#4528)
  HDDS-8361. Useless object in TestOzoneBlockTokenIdentifier (apache#4517)
  HDDS-8325. Consolidate and refine RocksDB metrics of services (apache#4506)
  HDDS-8135. Incorrect synchronization during certificate renewal in DefaultCertificateClient. (apache#4381)
  HDDS-8127. Exclude deleted containers from Recon container count (apache#4440)
  HDDS-8364. ReadReplicas may give wrong results with topology-aware read enabled (apache#4522)
  HDDS-8354. Avoid WARNING about ObjectEndpoint#get (apache#4515)
  HDDS-8324. DN data cache gets removed randomly asking for data from disk (apache#4499)
  HDDS-8291. Upgrade to Hadoop 3.3.5 (apache#4484)
  HDDS-8355. Mark TestOMRatisSnapshots#testInstallSnapshot as flaky
  ...