Skip to content

Conversation

@swamirishi
Copy link
Contributor

What changes were proposed in this pull request?

With implementation of ReclaimableKeyFilter both snapshot deep cleaning and AOS garbage collection can be unified which will simplify the entire implementation rather than having multiple implementations for deep cleaning.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13025

How was this patch tested?

Existing Unit tests and additional unit tests added.

swamirishi added 15 commits May 11, 2025 20:12
… each and every ratis request

Change-Id: I5c29c572df9d2240b1d58fbc88826eb5ed8ad881
…shotInfo

Change-Id: I16d2881af973799773335343debd856cb763f72b
Change-Id: I67d685aff12daaf98cf6a8eb5f6a5750c4cf6394
Change-Id: I4215cacc0f0486a7aa46b60971149483028bed38
Change-Id: I7a05f30d0a58538708e5acdbce36d836b6a5542e
Change-Id: I5a9e9dd3cc74e185b1f0745af7f2fee1026d125a

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDeletingService.java
Change-Id: I86c0adae571debf1e35e86660beb8c6a62d8b6fc
Change-Id: I403f238e8fd1178540da893f73a42aa2223572ad
Change-Id: Id3202dbe7a5a71c43526e80af34b18dd7b2ed66e
Change-Id: I17c1771c7630292fded0eebfbe6ee14f97798506
Change-Id: I03e67781351ed3a18666b32a12109141a5a4a34d
Change-Id: I1712f2dd4138a1f679eb4e4676b3d5ac41c80583
Change-Id: Id4a53d3eb8f23031ef517b3793dc73b067b11d62
Change-Id: I99896a697c523d6c174a06e66935a4be3b2bd541
Change-Id: Ifd53d7bff6a89d50f0b851100a42885d5b24882a
@swamirishi swamirishi requested review from jojochuang and smengcl May 14, 2025 09:48
@swamirishi swamirishi added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label May 14, 2025
Change-Id: I30118d2bf83e18b0d63d017bf97a45e370ef03d4
Change-Id: Ifc43cca62c2ec465d1e6fff340c37720d32ad30b
@swamirishi swamirishi requested a review from sadanand48 May 15, 2025 14:44
Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still in the middle of review

*
* @param count max number of keys to return.
* @return a Pair of list of {@link BlockGroup} representing keys and blocks,
* and a hashmap for key-value pair to be updated in the deletedTable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc can add a little more description for how filter is used.

*
* @param count max number of keys to return.
* @return a Pair of list of {@link BlockGroup} representing keys and blocks,
* and a hashmap for key-value pair to be updated in the deletedTable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are volume, bucket, startKey used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Volume, Bucket would be used for getting a PendingKeysDeleted for a Snapshot's KeyManager

* @return a list of {@link BlockGroup} represent keys and blocks.
* @throws IOException
*/
public PendingKeysDeletion getPendingDeletionKeys(final int keyCount,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is similar but not entirely the same as the one in KeyManagerImpl.
It's also super complex and hard to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I moved the function to KeyManagerImpl. Most of the logic is going to be performed in the ReclaimableKeyFilter class

Change-Id: Ie08fad6a07a66c75f6d823a64a7d278d1bfb54bd
Change-Id: Ibaa9a25cfc9126593b9ec3ec913c4a09e0846258
Change-Id: If61d109c573652c586b739baae49980749a4f9fa
Change-Id: I2b6ea2084bbb0320bf35608002faccf52b466b66
Change-Id: I8a11ed9a9cccb70b0cc05b295c0e5289fc859215
Change-Id: If1f9d066e44fe3a324aff71278a0ca1858571166
Change-Id: I533d58c79430f40b9169b04e24e27820456446ca
Change-Id: I8f7267f81c5bf676f40550bc30099f3f339ec32f
Change-Id: I4710594adf3d31d19bb282d5173e81d73a047dd4

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/service/TestKeyDeletingService.java
Change-Id: I0500d6901f12722822222ae52336a3fb48623bde
Change-Id: I4c47f80b83382ab5f5af735a983f0585b2f334fe
Change-Id: I1b56a46d2f13c8173213c4bad68e34f017433d33
@swamirishi swamirishi marked this pull request as ready for review May 22, 2025 05:00
Comment on lines 97 to 98
LOG.error("Snapshot with ID: '{}' is not found.", snapshotId);
throw new OMException("Snapshot with ID: '" + snapshotId + "' is not found.",
Copy link
Contributor

@smengcl smengcl May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UUID would be printed as a long string so I'd prefer the message itself to be in the front for slightly better readability when the time comes for sifting through the logs

Suggested change
LOG.error("Snapshot with ID: '{}' is not found.", snapshotId);
throw new OMException("Snapshot with ID: '" + snapshotId + "' is not found.",
LOG.error("Snapshot not found with UUID '{}'", snapshotId);
throw new OMException("Snapshot not found with UUID '" + snapshotId + "'",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@smengcl smengcl changed the title HDDS-13025. Refactor KeyDeleting Service to use ReclaimableKeyFilter HDDS-13025. Refactor KeyDeletingService to use ReclaimableKeyFilter May 22, 2025
}

public HashMap<String, RepeatedOmKeyInfo> getKeysToModify() {
public Map<String, RepeatedOmKeyInfo> getKeysToModify() {
Copy link
Contributor

@smengcl smengcl May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time I see such public interface / methods changes like this I am thinking we should adopt @InterfaceAudience annotation more. Like putting this before this class as well:

@InterfaceAudience.Private
@InterfaceStability.Evolving
public class ReloadingX509KeyManager extends X509ExtendedKeyManager implements CertificateNotification {

This informs the user of this API that this is meant to be used internally by Ozone only and we are not responsible for any breakage to the application as a result of API evolvements in this class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed HDDS-13106 for that.

Comment on lines +299 to +303
if (snapshotId == null) {
LOG.debug("Running KeyDeletingService for active object store, {}", run);
isRunningOnAOS.set(true);
} else {
LOG.debug("Running KeyDeletingService for snapshot : {}, {}", snapshotId, run);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like changing the message for snapshots to INFO would be useful.

Suggested change
if (snapshotId == null) {
LOG.debug("Running KeyDeletingService for active object store, {}", run);
isRunningOnAOS.set(true);
} else {
LOG.debug("Running KeyDeletingService for snapshot : {}, {}", snapshotId, run);
if (snapshotId == null) {
LOG.debug("Running KeyDeletingService for active object store, {}", run);
isRunningOnAOS.set(true);
} else {
LOG.info("Running KeyDeletingService for snapshot : {}, {}", snapshotId, run);

* Submits SetSnapsnapshotPropertyRequest to OM.
* @param setSnapshotPropertyRequests request to be sent to OM
*/
protected void submitSetSnapshotRequest(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: plural

Suggested change
protected void submitSetSnapshotRequest(
protected void submitSetSnapshotRequests(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we are submitting only OM request

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 664 to 667
.forEach(i -> {
try {
val.put(i.getSnapshotKey(),
ozoneManager.getMetadataManager().getSnapshotInfoTable().get(i.getSnapshotKey()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
.forEach(i -> {
try {
val.put(i.getSnapshotKey(),
ozoneManager.getMetadataManager().getSnapshotInfoTable().get(i.getSnapshotKey()));
.forEach(request -> {
try {
val.put(request.getSnapshotKey(),
ozoneManager.getMetadataManager().getSnapshotInfoTable().get(request.getSnapshotKey()));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the entire block

.addAllSetSnapshotPropertyRequests(setSnapshotPropertyRequests)
.setClientId(clientId.toString())
.build();
Map<String, SnapshotInfo> val = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this Map used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debugging something. Forgot to revert this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this

Comment on lines 192 to 193
exclusiveSizeMap.remove(snapshotID);
exclusiveReplicatedSizeMap.remove(snapshotID);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is frowned upon. Looks like bad isolation.

Can we at least put a javadoc on this helper method saying this method drains both maps?

Change-Id: Ia6d6de4972e4c9244083a2675d2fe5aeb1b3b2ee
Change-Id: I88043db279302b271d32997022a746b6546221f7
@swamirishi swamirishi marked this pull request as draft May 22, 2025 22:02
Change-Id: If65ffd82cd0b772a8bad9e9040a3084a71262993
Change-Id: I970f9e8918ec93b45ed4b9ee5512625c61aee639
@swamirishi swamirishi marked this pull request as ready for review May 23, 2025 02:31
Change-Id: Ib0171a1593e2c190a5ae06b58abb50a3d8234739
Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Thanks @swamirishi for the effort.

The patch significantly simplified getPendingDeletionKeys() with the use of ReclaimableKeyFilter, which is a delight to see.

@swamirishi swamirishi merged commit 322ca93 into apache:master May 24, 2025
42 checks passed
@swamirishi
Copy link
Contributor Author

Thanks for reviewing the patch @smengcl & @jojochuang

aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Jun 9, 2025
…239-container-reconciliation

Commits: 80 commits
5e273a4 HDDS-12977. Fail build on dependency problems (apache#8574)
5081ba2 HDDS-13034. Refactor DirectoryDeletingService to use ReclaimableDirFilter and ReclaimableKeyFilter (apache#8546)
e936e4d HDDS-12134. Implement Snapshot Cache lock for OM Bootstrap (apache#8474)
31d13de HDDS-13165. [Docs] Python client developer guide. (apache#8556)
9e6955e HDDS-13205. Bump common-custom-user-data-maven-extension to 2.0.3 (apache#8581)
750b629 HDDS-13203. Bump Bouncy Castle to 1.81 (apache#8580)
ba5177e HDDS-13202. Bump build-helper-maven-plugin to 3.6.1 (apache#8579)
07ee5dd HDDS-13204. Bump awssdk to 2.31.59 (apache#8582)
e1964f2 HDDS-13201. Bump jersey2 to 2.47 (apache#8578)
81295a5 HDDS-13013. [Snapshot] Add metrics and tests for snapshot operations. (apache#8436)
b3d75ab HDDS-12976. Clean up unused dependencies (apache#8521)
e0f08b2 HDDS-13179. rename-generated-config fails on re-compile without clean (apache#8569)
f388317 HDDS-12554. Support callback on completed reconfiguration (apache#8391)
c13a3fe HDDS-13154 Link more Grafana dashboard json files to the Observability user doc (apache#8533)
2a761f7 HDDS-11967. [Docs]DistCP Integration in Kerberized environment. (apache#8531)
81fc4c4 HDDS-12550. Use DatanodeID instead of UUID in NodeManager CommandQueue. (apache#8560)
2360af4 HDDS-13169. Intermittent failure in testSnapshotOperationsNotBlockedDuringCompaction (apache#8553)
f19789d HDDS-13170. Reclaimable filter should always reclaim entries when buckets and volumes have already been deleted (apache#8551)
315ef20 HDDS-13175. Leftover reference to OM-specific trash implementation (apache#8563)
902e715 HDDS-13159. Refactor KeyManagerImpl for getting deleted subdirectories and deleted subFiles (apache#8538)
46a93d0 HDDS-12817. Addendum rename ecIndex to replicaIndex in chunkinfo output (apache#8552)
19b9b9c HDDS-13166. Set pipeline ID in BlockExistenceVerifier to avoid cached pipeline with different node (apache#8549)
b3ff67c HDDS-13068. Validate Container Balancer move timeout and replication timeout configs (apache#8490)
7a7b9a8 HDDS-13139. Introduce bucket layout flag in freon rk command (apache#8539)
3c25e7d HDDS-12595. Add verifier for container replica states (apache#8422)
6d59220 HDDS-13104. Move auditparser acceptance test under debug (apache#8527)
8e8c432 HDDS-13071. Documentation for Container Replica Debugger Tool (apache#8485)
0e8c8d4 HDDS-13158. Bump junit to 5.13.0 (apache#8537)
8e552b4 HDDS-13157. Bump exec-maven-plugin to 3.5.1 (apache#8534)
168f690 HDDS-13155. Bump jline to 3.30.4 (apache#8535)
cc1e4d1 HDDS-13156. Bump awssdk to 2.31.54 (apache#8536)
3bfb7af HDDS-13136. KeyDeleting Service should not run for already deep cleaned snapshots (apache#8525)
006e691 HDDS-12503. Compact snapshot DB before evicting a snapshot out of cache (apache#8141)
568b228 HDDS-13067. Container Balancer delete commands should not be sent with an expiration time in the past (apache#8491)
53673c5 HDDS-11244. OmPurgeDirectoriesRequest should clean up File and Directory tables of AOS for deleted snapshot directories (apache#8509)
07f4868 HDDS-13099. ozone admin datanode list ignores --json flag when --id filter is used (apache#8500)
08c0ab8 HDDS-13075. Fix default value in description of container placement policy configs (apache#8511)
58c87a8 HDDS-12177. Set runtime scope where missing (apache#8513)
10c470d HDDS-12817. Add EC block index in the ozone debug replicas chunk-info (apache#8515)
7027ab7 HDDS-13124. Respect config hdds.datanode.use.datanode.hostname when reading from datanode (apache#8518)
b8b226c HDDS-12928. datanode min free space configuration (apache#8388)
fd3d70c HDDS-13026. KeyDeletingService should also delete RenameEntries (apache#8447)
4c1c6cf HDDS-12714. Create acceptance test framework for debug and repair tools (apache#8510)
fff80fc HDDS-13118. Remove duplicate mockito-core dependency from hdds-test-utils (apache#8508)
10d5555 HDDS-13115. Bump awssdk to 2.31.50 (apache#8505)
360d139 HDDS-13017. Fix warnings due to non-test scoped test dependencies (apache#8479)
1db1cca HDDS-13116. Bump jline to 3.30.3 (apache#8504)
322ca93 HDDS-13025. Refactor KeyDeletingService to use ReclaimableKeyFilter (apache#8450)
988b447 HDDS-5287. Document S3 ACL classes (apache#8501)
64bb29d HDDS-12777. Use module-specific name for generated config files (apache#8475)
54ed115 HDDS-9210. Update snapshot chain restore test to incorporate snapshot delete. (apache#8484)
87dfa5a HDDS-13014. Improve PrometheusMetricsSink#normalizeName performance (apache#8438)
7cdc865 HDDS-13100. ozone admin datanode list --json should output a newline at the end (apache#8499)
9cc4194 HDDS-13089. [snapshot] Add an integration test to verify snapshotted data can be read by S3 SDK client (apache#8495)
cb9867b HDDS-13065. Refactor SnapshotCache to return AutoCloseSupplier instead of ReferenceCounted (apache#8473)
a88ff71 HDDS-10979. Support STANDARD_IA S3 storage class to accept EC replication config (apache#8399)
6ec8f85 HDDS-13080. Improve delete metrics to show number of timeout DN command from SCM (apache#8497)
3bb8858 HDDS-12378. Change default hdds.scm.safemode.min.datanode to 3 (apache#8331)
0171bef HDDS-13073. Set pipeline ID in checksums verifier to avoid cached pipeline with different node (apache#8480)
5c7726a HDDS-11539. OzoneClientCache `@PreDestroy` is never called (apache#8493)
a8ed19b HDDS-13031. Implement a Flat Lock resource in OzoneManagerLock (apache#8446)
e9e8b30 HDDS-12935. Support unsigned chunked upload with STREAMING-UNSIGNED-PAYLOAD-TRAILER (apache#8366)
7590268 HDDS-13079. Improve logging in DN for delete operation. (apache#8489)
435fe7e HDDS-12870. Fix listObjects corner cases (apache#8307)
eb5dabd HDDS-12926. Remove *.tmp.* exclusion in DU (apache#8486)
eeb98c7 HDDS-13030. Snapshot Purge should unset deep cleaning flag for next 2 snapshots in the chain (apache#8451)
6bf121e HDDS-13032. Support proper S3OwnerId representation (apache#8478)
5d1b43d HDDS-13076. Refactor OzoneManagerLock class to rename Resource class to LeveledResource (apache#8482)
bafe6d9 HDDS-13064. [snapshot] Add test coverage for SnapshotUtils.isBlockLocationInfoSame() (apache#8476)
7035846 HDDS-13040. Add user doc highlighting the difference between Ozone ACL and S3 ACL. (apache#8457)
1825cdf HDDS-13049. Deprecate VolumeName & BucketName in OmKeyPurgeRequest and prevent Key version purge on Block Deletion Failure (apache#8463)
211c76c HDDS-13060. Change NodeManager.addDatanodeCommand(..) to use DatanodeID (apache#8471)
f410238 HDDS-13061. Add test for key ACL operations without permission (apache#8472)
d1a2f48 HDDS-13057. Increment block delete processed transaction counts regardless of log level (apache#8466)
0cc6fcc HDDS-13043. Replace != with assertNotEquals in TestSCMContainerPlacementRackAware (apache#8470)
e1c779a HDDS-13051. Use DatanodeID in server-scm. (apache#8465)
35e1126 HDDS-13042. [snapshot] Add future proofing test cases for unsupported file system API (apache#8458)
619c05d HDDS-13008. Exclude same SST files when calculating full snapdiff (apache#8423)
21b49d3 HDDS-12965. Fix warnings about "used undeclared" dependencies (apache#8468)
8136119 HDDS-13048. Create new module for Recon integration tests (apache#8464)

Conflicts:
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeManager.java
swamirishi added a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025
…eKeyFilter (apache#8450)

(cherry picked from commit 322ca93)

 Conflicts:
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestKeyPurging.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/snapshot/TestSnapshotDeletingServiceIntegrationTest.java
	hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/snapshot/TestSnapshotDirectoryCleaningService.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManager.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/PendingKeysDeletion.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/service/TestKeyDeletingService.java

Change-Id: I0bfd8e472e262124a1001f0795f2a30b0e515f3b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants