-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-13213. KeyDeletingService should limit task size by both key count and serialized size. #8757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Enforce a cumulative serialized-size limit on deletion tasks alongside the existing key-count limit.
- Introduce
ratisByteLimitin theKeyManagerAPI and implementations to stop fetching when serialized size crosses the threshold. - Update
KeyDeletingServiceandSnapshotDeletingServiceto compute and pass theratisByteLimit. - Extend tests to configure and use the new
ratisLimitparameter in deletion calls.
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| KeyManager.java | Added ratisByteLimit parameter to pending-deletion and rename-fetch methods. |
| KeyManagerImpl.java | Implemented size accumulation and break logic based on ratisByteLimit. |
| KeyDeletingService.java | Initialized ratisByteLimit and supplied it to deletion fetch calls. |
| SnapshotDeletingService.java | Updated calls to fetch deleted keys/renames to include ratisByteLimit (missing initialization). |
| DirectoryDeletingService.java | Refined purge-path submission logic. |
| TestKeyDeletingService.java | Configured and passed ratisLimit in test calls. |
| TestKeyManagerImpl.java | Modified tests to supply ratisLimit to deletion and rename entry methods. |
| TestKeyPurging.java | Added ratisLimit setup for integration test. |
| MapBackedTableIterator.java | Updated test iterator to account for byte-size in Table.newKeyValue. |
Comments suppressed due to low confidence (3)
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManager.java:160
- [nitpick] The parameter name
ratisLimitdiffers fromratisByteLimitused elsewhere; consider unifying the parameter name across methods for consistency.
throws IOException;
hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/service/TestKeyDeletingService.java:1064
- Tests currently validate normal deletion flows but don’t cover the scenario where cumulative serialized size hits the Ratis byte limit and truncates results. Consider adding a test case that creates a set of keys whose total size exceeds
ratisLimitto verify the truncation behavior.
.getKeyBlocksList().size();
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDeletingService.java:180
- The field
ratisByteLimitis not defined or initialized inSnapshotDeletingService, causing compilation errors. Consider adding a class member and initializing it in the constructor similar toKeyDeletingService.
null, (kv) -> true, remaining, ratisByteLimit);
...ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestKeyPurging.java
Show resolved
Hide resolved
| if (serializedSize > ratisByteLimit) { | ||
| flag = true; | ||
| LOG.info( | ||
| "Total size of cumulative keys in a cycle crossed 90% ratis limit, serialized size: {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a common occurrence, and this log will flood the logs. I would recommend adding a metric that be charted for measuring progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my testing, this scenario doesn't appear to be very common. In a typical production case, a key might have around 10 blocks. From test results, 20,000 such keys occupy roughly 2 MB (1.8 MB to be exact). Given that 90% of the default Ratis limit is about 29 MB, it would take approximately 300,000 keys in a single iteration to reach that limit — which seems unlikely in most practical scenarios.
Moreover, to successfully delete 300,000 keys in one iteration, the following configurations would need to be tuned:
hdds.scm.block.deletion.per-interval.max (on SCM)
hdds.datanode.block.deleting.limit.per.interval (on Datanodes)
Given the above, it makes sense to downgrade this log message to DEBUG level.
Let me know your thoughts — open to suggestions.
hadoop-hdds/framework/src/test/java/org/apache/hadoop/hdds/utils/MapBackedTableIterator.java
Outdated
Show resolved
Hide resolved
ashishkumar50
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aryangupta1998 Thanks for updating patch, LGTM.
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@aryangupta1998 The check applied on KeyDeletingService is wrong to check for ratis request serialized size is wrong. We don’t send OmKeyInfo in the PurgeKeyRequest and send only the rocksdb key from deletedKeyTable. This calculation is unnecessary. We might need this check only for modified version keys but given that we would have only one OmKeyInfo in RepeatedKeyInfo we wouldn’t require any ratis serialized checks in KeyDeletingService. |
* master: (90 commits) HDDS-13308. OM should expose Ratis config for increasing pending write limits (apache#8668) HDDS-8903. Add validation for ozone.om.snapshot.db.max.open.files. (apache#8787) HDDS-13429. Custom metadata headers with uppercase characters are not supported (apache#8805) HDDS-13448. DeleteBlocksCommandHandler thread stop for normal exception (apache#8816) HDDS-13346. Intermittent failure in TestCloseContainer#testContainerChecksumForClosedContainer (apache#8771) HDDS-13125. Add metrics for monitoring the SST file pruning threads. (apache#8764) HDDS-13367. [Docs] User doc for container balancer. (apache#8726) HDDS-13200. OM RocksDB Grafana Dashbroad shows no data on all panels (apache#8577) HDDS-13428. Recon - Retrigger of build whole NSSummary tree task submission inconsistency. (apache#8793) HDDS-13378. [Docs] Add a Production page under Getting Started (apache#8734) HDDS-13403. [Docs] Make feature proposal process more visible. (apache#8758) HDDS-11797. Remove cyclic dependency between SCMSafeModeManager and SafeModeRules (apache#8782) HDDS-13213. KeyDeletingService should limit task size by both key count and serialized size. (apache#8757) HDDS-13387. OMSnapshotCreateRequest logs invalid warning about DefaultReplicationConfig (apache#8760) HDDS-13405. ozone admin container create runs forever without kinit (apache#8765) HDDS-11514. Set optimal default values for delete configurations based on live cluster testing. (apache#8766) HDDS-13376. Add server-side limit note to ozone sh snapshot diff --page-size option (apache#8791) HDDS-11679. Support multiple S3Gs in MiniOzoneCluster (apache#8733) HDDS-13424. Use lsof instead of fuser to find if file is used in AbstractTestChunkManager (apache#8790) HDDS-13427. Bump awssdk to 2.31.78 (apache#8792) ...
…nt and serialized size. (apache#8757)
… key count and serialized size. (apache#8757)" This reverts commit 918b284.
What changes were proposed in this pull request?
While fetching keys for deletion based on keyLimitPerTask, we should also check that the size of the keys cumulatively does not exceed the Ratis limit (default: 32 MB).
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13213
How was this patch tested?
Tested Manually.