Conversation

@aswinshakil (Member)

What changes were proposed in this pull request?

This PR adds the following to support Snapshot Garbage Collection:

  1. Handle keys renamed between snapshots.
  2. Accommodate FSO bucket keys.
  3. Implement OMSnapshotPurgeRequest, which cleans up the SnapshotChain and the checkpoint directory.
  4. Set a limit on the number of deletedKeys processed per run, similar to KeyDeletingService.
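Point 4 above caps how many deleted keys one pass of the service will touch. A simplified, self-contained sketch of that per-run limit, using a plain List as a stand-in for the snapshot's deletedTable (the method and class names here are illustrative, not the actual Ozone API):

```java
import java.util.Iterator;
import java.util.List;

public class KeyLimitSketch {
  // Process at most keyLimitPerSnapshot entries per run; remaining entries
  // are picked up on the next run of the background service.
  static int processDeletedKeys(List<String> deletedTable,
                                int keyLimitPerSnapshot) {
    int deletionCount = 0;
    Iterator<String> it = deletedTable.iterator();
    while (it.hasNext() && deletionCount < keyLimitPerSnapshot) {
      it.next();          // in the real service: reclaim or defer this key
      deletionCount++;
    }
    return deletionCount; // keys handled in this pass
  }
}
```

This mirrors the throttling idea only; the real service also batches the corresponding block IDs for SCM.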

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-7883

How was this patch tested?

Unit tests.

@aswinshakil aswinshakil changed the base branch from master to 4440 March 16, 2023 03:27
@aswinshakil aswinshakil changed the base branch from 4440 to master March 16, 2023 03:27
@smengcl smengcl added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Mar 16, 2023
Comment on lines 226 to 230
    for (OmKeyInfo keyInfo: repeatedOmKeyInfo.getOmKeyInfoList()) {
      splitRepeatedOmKeyInfo(toReclaim, toNextDb,
-         keyInfo, previousKeyTable);
+         keyInfo, previousKeyTable, renamedKeyTable,
+         bucketInfo, volumeId);
    }
Contributor
It should be fine as long as we are reading volume/bucket info directly from the active DB, since we don't allow deletion of buckets that have snapshots in the first place. (That is the important assumption here.)

Originally I thought we were going to get volume and bucket info from the snapshot DB, in which case we would have to retain these two tables in SstFilteringService (which is not the case at the moment).
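The assumption above can be sketched as a guard on bucket deletion: a bucket with any active snapshot refuses to delete, which is what keeps the active DB's volume/bucket tables a safe source of truth. All names here (snapshotsByBucket, deleteBucket) are stand-ins, not the real OM API:

```java
import java.util.Map;
import java.util.Set;

public class BucketDeleteGuard {
  // bucket name -> names of snapshots still referencing it (illustrative model)
  private final Map<String, Set<String>> snapshotsByBucket;

  public BucketDeleteGuard(Map<String, Set<String>> snapshotsByBucket) {
    this.snapshotsByBucket = snapshotsByBucket;
  }

  // Returns false (deletion rejected) while any snapshot still exists,
  // so bucket info in the active DB stays valid for snapshot GC.
  public boolean deleteBucket(String bucket) {
    Set<String> snaps = snapshotsByBucket.get(bucket);
    if (snaps != null && !snaps.isEmpty()) {
      return false;
    }
    snapshotsByBucket.remove(bucket);
    return true;
  }
}
```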

Comment on lines +203 to +204
while (deletedIterator.hasNext() &&
deletionCount <= keyLimitPerSnapshot) {
Contributor

We could add a block comment right above the loop explaining that we are trying to gather as many key blocks as possible into one SCM block deletion request here.
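A sketch of the loop with the suggested block comment in place, using stand-in types (each inner List<Long> plays the role of one key's block IDs; the loop condition mirrors the `<=` from the excerpt above):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchingSketch {
  static List<Long> gatherBlocks(Iterator<List<Long>> deletedIterator,
                                 int keyLimitPerSnapshot) {
    List<Long> blockGroup = new ArrayList<>();
    int deletionCount = 0;
    /*
     * Gather as many key blocks as possible into a single SCM block
     * deletion request, bounded by keyLimitPerSnapshot keys per run,
     * so the service amortizes the round trip to SCM.
     */
    while (deletedIterator.hasNext() &&
        deletionCount <= keyLimitPerSnapshot) {
      blockGroup.addAll(deletedIterator.next());
      deletionCount++;
    }
    return blockGroup;
  }
}
```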

Member Author

We can add it once we are sending the actual request to SCM.


// These are uncommitted blocks wrapped into a pseudo KeyInfo
if (deletedKeyInfo.getObjectID() == OBJECT_ID_RECLAIM_BLOCKS) {
return false;
Contributor

Does this mean we won't reclaim those blocks (e.g. the ones added by hsync) in deletedTable? In fact the blocks could be deleted if the block ID doesn't match.

This may deserve a TODO here (add block ID checking). Also, add more comments describing the desired GC behavior.

Make sure that when deferring this entry to the next snapshot or the active DB, it is merged properly into the same OmKeyInfo. The blocks cannot be thrown away without reclamation; otherwise we would leak blocks on SCMs/DNs.
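The merge being asked for can be sketched like this: when a deferred entry lands on a key that already exists in the target deletedTable, the version lists are combined, never overwritten. A Map with a List value stands in for RepeatedOmKeyInfo; deferEntry is an illustrative name:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DeferredMergeSketch {
  // Merge-on-conflict: append the deferred versions to whatever is already
  // under dbKey. Replacing instead of appending would drop the older
  // versions' blocks without reclamation (leaking them on SCM/DNs).
  static void deferEntry(Map<String, List<String>> nextDbDeletedTable,
                         String dbKey, List<String> versions) {
    nextDbDeletedTable
        .computeIfAbsent(dbKey, k -> new ArrayList<>())
        .addAll(versions);
  }
}
```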

Member Author

We will reclaim these blocks from the deletedTable: the uncommitted blocks are put into reclaimList, which is then reclaimed by KeyDeletingService.

@smengcl (Contributor) left a comment

+1 pending findbugs in test

@smengcl smengcl changed the title HDDS-7883. [Snapshot] Accommodate FSO, Key renames and implement OMSnapshotPurgeRequest for SnapshotDeletingService HDDS-7883. [Snapshot] Accommodate FSO, key renames and implement OMSnapshotPurgeRequest for SnapshotDeletingService Mar 21, 2023
@smengcl smengcl merged commit 184bfc2 into apache:master Mar 21, 2023
@smengcl (Contributor) commented Mar 21, 2023

Thanks @aswinshakil for the patch.

errose28 added a commit to errose28/ozone that referenced this pull request Mar 23, 2023
* master: (43 commits)
  HDDS-8148. Improve log for Pipeline creation failure (apache#4385)
  HDDS-7853. Add support for RemoveSCM in SCMRatisServer. (apache#4358)
  HDDS-8042. Display certificate issuer in cert list command. (apache#4429)
  HDDS-8189. [Snapshot] renamedKeyTable should only track keys in buckets that has at least one active snapshot. (apache#4436)
  HDDS-8154. Perf: Reuse Mac instances in S3 token validation (apache#4433)
  HDDS-8245. Info log for keyDeletingService when nonzero number of keys are deleted. (apache#4451)
  HDDS-8233. ReplicationManager: Throttle delete container commands from over-replication handlers (apache#4447)
  HDDS-8220. [Ozone-Streaming] Trigger volume check on IOException in StreamDataChannelBase (apache#4428)
  HDDS-8173. Fix to remove enrties from RocksDB after container gets deleted. (apache#4445)
  HDDS-7975. Rebalance acceptance tests (apache#4437)
  HDDS-8152. Reduce S3 acceptance test setup time (apache#4393)
  HDDS-8172. ECUnderReplicationHandler should consider commands already sent when processing the container (apache#4435)
  HDDS-7883. [Snapshot] Accommodate FSO, key renames and implement OMSnapshotPurgeRequest for SnapshotDeletingService (apache#4407)
  HDDS-8168. Make deadlines inside MoveManager for move commands configurable (apache#4415)
  HDDS-7918. EC: ECBlockReconstructedStripeInputStream should check for spare replicas before failing an index (apache#4441)
  HDDS-8222. EndpointBase#getBucket should handle BUCKET_NOT_FOUND (apache#4431)
  HDDS-8068. Fix Exception: JMXJsonServlet, getting attribute RatisRoles of Hadoop:service=OzoneManager. (apache#4352)
  HDDS-8139. Datanodes should not drop block delete transactions based on transaction ID (apache#4384)
  HDDS-8216. EC: OzoneClientConfig is overwritten in ECKeyOutputStream (apache#4425)
  HDDS-8054. Fix NPE in metrics for failed volume (apache#4340)
  ...