Skip to content

Conversation

@muskan1012
Copy link
Contributor

What changes were proposed in this pull request?

Introduced following metrics for listKey operations:

  • listKeysAveragePagination
  • listKeysOpsPerSec

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9874

How was this patch tested?

manually tested it on a cluster

@adoroszlai
Copy link
Contributor

@muskan1012 There are checkstyle and findbugs failures. It would be nice to wait with PR creation until a clean CI run in your fork is completed.

https://github.com/muskan1012/ozone/actions/runs/7128100497

@muskan1012 muskan1012 marked this pull request as draft December 7, 2023 12:38
private MutableRate listKeysAveragePagination;

@Metric(about = "ops per second for listKeys")
private MutableRate listKeysOpsPerSec;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have the Latency metric listKeysLatencyNs, maybe we just need to add a cumulative number of listKeys metric(such as: listKeysSize), which we can calculate to get listKeysOpsPerSec and listKeysAveragePagination.

@Metric(about = "listKeys latency in nanoseconds")
private MutableRate listKeysLatencyNs;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented the cumulative number of listKeys using keyCount, which is set to maxKeys if truncated or cacheKeyMap.size() otherwise.
listKeysAveragePagination is set using keyCount.
listKeysOpsPerSec is calculated as keyCount / ((Time.monotonicNowNanos() - startNanos) / 1_000_000_000.0f).
Added latency tracking with addListKeysReadFromRocksDbLatencyNs.
This approach captures the metrics as you suggested.

@kerneltime
Copy link
Contributor

cc @tanvipenumudy can you please take a look as well?

@muskan1012 muskan1012 marked this pull request as ready for review December 18, 2023 07:03
Copy link
Contributor

@tanvipenumudy tanvipenumudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @muskan1012 for the patch, please find a few comments.

@kerneltime
Copy link
Contributor

@muskan1012 can you please rebase the change and address the conflicts?

@tanvipenumudy
Copy link
Contributor

Thank you @muskan1012 for updating the patch, could you please take a look at the build failure?

@adoroszlai adoroszlai marked this pull request as draft January 27, 2024 14:24
@tanvipenumudy tanvipenumudy marked this pull request as draft June 18, 2024 05:23
@tanvipenumudy tanvipenumudy marked this pull request as ready for review June 18, 2024 05:23
@tanvipenumudy
Copy link
Contributor

@muskan1012, the failed checks appear related to the changes (NPE while initializing a new OmMetadataManager). Could you please address this issue similarly to how the previous test failures were handled?

Could you also please rebase the patch? Thanks!

@tanvipenumudy
Copy link
Contributor

Thank you @muskan1012 for rebasing and fixing the test failures.

Reran the workflow; could you please take a look at the integration test failure on TestListKeys? Thanks!

@tanvipenumudy
Copy link
Contributor

tanvipenumudy commented Jun 21, 2024

@adoroszlai, @xichen01 could you please take another look at the patch? Thanks

@tanvipenumudy
Copy link
Contributor

How was this patch tested?
manually tested it on a cluster

@muskan1012, it would help if you could also please share the updated list keys dashboard (or prometheus/graphana metrics - whichever works!) once all changes are incorporated, thanks.

@adoroszlai adoroszlai self-requested a review June 27, 2024 10:09
Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @muskan1012 for updating the patch.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @muskan1012 for updating the patch. Mostly looks good, some unnecessary changes are still left over.

Also, can you please consider @kerneltime's suggestion to rename averagePagination to keyCount?

Co-authored-by: Doroszlai, Attila <[email protected]>
@muskan1012
Copy link
Contributor Author

Hi @adoroszlai, thank you for reviewing it. Fixed minor nits as well.

@adoroszlai adoroszlai dismissed their stale review July 1, 2024 06:28

patch updated

@adoroszlai adoroszlai requested a review from tanvipenumudy July 1, 2024 06:28
Copy link
Contributor

@tanvipenumudy tanvipenumudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @muskan1012 for working on the patch iteratively, thank you @adoroszlai, @kerneltime, @xichen01 for the reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants