@sumitagrawl
Contributor

@sumitagrawl sumitagrawl commented Nov 28, 2022

What changes were proposed in this pull request?

DirectoryDeletingService now also submits sub-directories for deletion, together with their sub-files and sub-directories. These are handled automatically by PurgeDirectoriesRequest.
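
As an illustration of the idea described above, here is a minimal sketch of expanding a deleted directory into its full set of sub-directories and sub-files in a single pass. It is not the Ozone implementation: `Dir`, `PurgePlan`, and `plan` are hypothetical names, and an in-memory tree stands in for the OM tables.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RecursivePurgeSketch {
    /** Hypothetical in-memory stand-in for a directory entry. */
    static final class Dir {
        final String name;
        final List<Dir> subDirs = new ArrayList<>();
        final List<String> files = new ArrayList<>();

        Dir(String name) {
            this.name = name;
        }
    }

    /** Hypothetical result type: everything one pass would hand to the purge request. */
    static final class PurgePlan {
        final List<String> dirsToPurge = new ArrayList<>();
        final List<String> filesToPurge = new ArrayList<>();
    }

    /**
     * Walks the deleted directory breadth-first and collects the directory,
     * its sub-directories, and all of their files in the same pass, instead of
     * leaving each sub-directory for a later run.
     */
    static PurgePlan plan(Dir deletedRoot) {
        PurgePlan plan = new PurgePlan();
        Deque<Dir> pending = new ArrayDeque<>();
        pending.add(deletedRoot);
        while (!pending.isEmpty()) {
            Dir current = pending.poll();
            plan.dirsToPurge.add(current.name);
            plan.filesToPurge.addAll(current.files);
            pending.addAll(current.subDirs);
        }
        return plan;
    }

    public static void main(String[] args) {
        Dir root = new Dir("a");
        Dir child = new Dir("a/b");
        root.subDirs.add(child);
        root.files.add("a/f1");
        child.files.add("a/b/f2");
        PurgePlan p = plan(root);
        System.out.println(p.dirsToPurge + " " + p.filesToPurge);
    }
}
```

The sketch ignores batching and replication; it only shows why handling sub-directories in the same pass avoids one service iteration per level of the hierarchy.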

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-7541

How was this patch tested?

  1. Tested end to end with recursive deletion of a directory.
  2. Updated the unit test to verify the same behavior.

@kerneltime
Contributor

cc @duongkame @swamirishi

@duongkame
Contributor

Thanks for the patch @sumitagrawl.

Maybe I'm missing something, but I have a question (it may be out of scope for this change). Why doesn't DirectoryDeletingService explore all the deleted files recursively and put the items into deletedKeyTable itself, instead of taking the Ratis path to call DirectoriesPurgeRequest? The current process looks pretty cumbersome to me.

I assume that when a directory is deleted, its entry in deletedDirectoryTable is reflected in all OM replicas, so DirectoryDeletingService runs with the same input on each replica.

@sumitagrawl
Contributor Author

sumitagrawl commented Nov 30, 2022

As I understand it, the reasons are as follows:

  1. For HA, a delete operation must be synchronized to the other OMs in the cluster. Ratis provides this replication, so the Ratis path is taken, and the RocksDB updates are applied within the Ratis flow for that reason.
  2. The work is done in batches so that the delete service does not consume all resources, since a single batch could otherwise be too large. A configurable limit caps how much is handled per iteration (default 10,000), and Ratis synchronization is done within this limit as well.
  3. The task also includes other work, such as notifying SCM to delete blocks, which can take time. This work is done in DirectoryDeletingService to reduce load on the Ratis flow, which would block other activity if it took too long.
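
The per-iteration limit in point 2 can be sketched roughly as follows. This is only an illustration under assumptions, not the Ozone code: `PurgeSubmitter` and `runOnce` are hypothetical names, and the submitter stands in for whatever sends the batch through the replicated path.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class BatchedPurgeSketch {
    /** Hypothetical stand-in for the replicated purge request path. */
    interface PurgeSubmitter {
        void submit(List<String> batch);
    }

    /**
     * One service iteration: takes at most limitPerRun entries from the
     * pending queue and submits them as a single batch. Entries beyond the
     * limit stay queued for the next iteration.
     *
     * @return the number of entries submitted in this iteration
     */
    static int runOnce(Deque<String> pending, int limitPerRun, PurgeSubmitter submitter) {
        if (limitPerRun <= 0) {
            throw new IllegalArgumentException("limitPerRun must be positive");
        }
        List<String> batch = new ArrayList<>();
        while (batch.size() < limitPerRun && !pending.isEmpty()) {
            batch.add(pending.poll());
        }
        if (!batch.isEmpty()) {
            submitter.submit(batch);
        }
        return batch.size();
    }
}
```

With five pending entries and a limit of two, three iterations would submit batches of two, two, and one entries, so no single replicated request grows beyond the configured limit.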

@duongkame Please let me know if the above makes sense.

Contributor

@sadanand48 sadanand48 left a comment

Thanks @sumitagrawl for the patch. +1 LGTM.

@sadanand48 sadanand48 merged commit b532d2f into apache:master Dec 5, 2022
Galsza pushed a commit to Galsza/ozone that referenced this pull request Dec 7, 2022
errose28 added a commit to errose28/ozone that referenced this pull request Dec 12, 2022
* master: (110 commits)
  HDDS-7472. EC: Fix NSSummaryEndpoint#getDiskUsage for EC keys (apache#3987)
  HDDS-5704. Ozone URI syntax description in help content needs to mention about ozone service id (apache#3862)
  HDDS-7555. Upgrade Ratis to 2.4.2-8b8bdda-SNAPSHOT. (apache#4028)
  HDDS-7541. FSO recursive delete directory with hierarchy takes much time for cleanup (apache#4008)
  HDDS-7581. Fix update-jar-report for snapshot (apache#4034)
  HDDS-7253. Fix exception when '/' in key name (apache#4038)
  HDDS-7579. Use Netty 4.1.77 for consistency (apache#4031)
  HDDS-7562. Suppress warning about long filenames in tar (apache#4017)
  HDDS-7563. Add a handler for under replicated Ratis containers in RM (apache#4025)
  HDDS-7497. Fix mkdir does not update bucket's usedNamespace (apache#3969)
  HDDS-7567. Invalid entries in LICENSE (apache#4020)
  HDDS-7575. Correct showing of RATIS-THREE icon in Recon UI (apache#4026)
  HDDS-7540. Let reusable workflow inherit secrets (apache#4012)
  HDDS-7568. Bump copyright year in NOTICE (apache#4018)
  HDDS-7394. OM RPC FairCallQueue decay decision metrics list caller username in the metric (apache#3878)
  HDDS-7510. Recon: Return number of open containers in `/clusterState` endpoint (apache#3989)
  HDDS-7561. Improve setquota, clrquota CLI usage (apache#4016)
  HDDS-6615. EC: Improve write performance by pipelining encode and flush (apache#3994)
  HDDS-7554. Recon UI should show DORMANT in pipeline status filter (apache#4010)
  HDDS-7540. Separate scheduled CI from push/PR workflows (apache#4004)
  ...
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Feb 21, 2023
…ime for cleanup (apache#4008)

(cherry picked from commit b532d2f)

 Conflicts:
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DirectoryDeletingService.java

Change-Id: I34abb92b73ff9657bc1fb3a26e6bd4df0b456bc9