@aryangupta1998
Contributor

What changes were proposed in this pull request?

When a large amount of data is deleted, clearing it out can be slow because of the large backlog of pending deletes, even if all deletion services run for their full intervals and collect their maximum number of entries. Speeding up deletion further requires more threads. This PR adds multi-threaded deletion support to the key deleting service.

Approach:
By default, 10 threads are configured. A key supplier class exposes a synchronized get() function, which each thread calls to fetch a set of key info and then process it. Threads call get() concurrently and process keys until the per-task key limit is reached.
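
For illustration only, here is a minimal sketch of the approach described above. It is not the code from this PR: `MultiThreadedDeletionSketch`, `KeySupplier`, and `deleteKeys` are hypothetical names, plain strings stand in for key info, and enforcing the per-task key limit inside the supplier is an assumption made so the limit stays exact under concurrency.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class MultiThreadedDeletionSketch {

  /** Hypothetical stand-in for the key supplier: hands out batches from one shared iterator. */
  static final class KeySupplier {
    private final Iterator<String> iterator;
    private final int batchSize;
    private int remaining; // keys still allowed under the per-task limit (assumption)

    KeySupplier(Iterator<String> iterator, int batchSize, int keyLimit) {
      this.iterator = iterator;
      this.batchSize = batchSize;
      this.remaining = keyLimit;
    }

    /** Synchronized so only one thread advances the iterator at a time; an empty list means "done". */
    synchronized List<String> get() {
      List<String> batch = new ArrayList<>();
      while (batch.size() < batchSize && remaining > 0 && iterator.hasNext()) {
        batch.add(iterator.next());
        remaining--;
      }
      return batch;
    }
  }

  /** Runs threadCount workers until the keys run out or keyLimit is reached; returns the keys processed. */
  static Set<String> deleteKeys(List<String> pendingKeys, int threadCount, int batchSize, int keyLimit) {
    KeySupplier supplier = new KeySupplier(pendingKeys.iterator(), batchSize, keyLimit);
    Set<String> deleted = ConcurrentHashMap.newKeySet();
    ExecutorService pool = Executors.newFixedThreadPool(threadCount);
    for (int i = 0; i < threadCount; i++) {
      pool.execute(() -> {
        List<String> batch;
        while (!(batch = supplier.get()).isEmpty()) {
          deleted.addAll(batch); // placeholder for the real purge request
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return deleted;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      keys.add("key-" + i);
    }
    // 10 threads as in the default above, batches of 16 keys, limit of 500 keys per task.
    System.out.println(deleteKeys(keys, 10, 16, 500).size()); // prints 500
  }
}
```

Because the iterator is only touched inside the synchronized get(), no key is handed to two threads, and the work done outside the lock (the purge itself) is what actually runs in parallel.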

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-11808

How was this patch tested?

Tested manually.

@aryangupta1998 aryangupta1998 marked this pull request as draft November 26, 2024 13:20
@aryangupta1998 aryangupta1998 marked this pull request as ready for review December 20, 2024 07:56
Contributor

@ashishkumar50 ashishkumar50 left a comment

@aryangupta1998, thanks for working on this. Please find a few comments inline.

  () -> {
    try {
-     return keyManager.getPendingDeletionKeys(Integer.MAX_VALUE)
+     return keyManager.getPendingDeletionKeys(Integer.MAX_VALUE, keyDeletingService.getDeletedKeySupplier())
Contributor

Tests will fail because the background service and the test code now share the same iterator. We need to make sure only one of them runs at a time to get correct results.

try {
  deletedKeySupplier.reInitItr();
} catch (IOException ex) {
  LOG.error("Unable to get the iterator.", ex);
Contributor

Return from here if there is an exception; there is no point in continuing.

@aryangupta1998 aryangupta1998 marked this pull request as draft January 6, 2025 08:00
@adoroszlai
Contributor

/pending conflicts

@github-actions github-actions bot left a comment

Marking this issue as un-mergeable as requested.

Please use /ready comment when it's resolved.

Please note that the PR will be closed after 21 days of inactivity from now. (But can be re-opened anytime later...)

conflicts

@github-actions

Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.

It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.

It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.

If you need ANY help to finish this PR, please contact the community on the mailing list or the Slack channel.

3 participants