-
Notifications
You must be signed in to change notification settings - Fork 590
HDDS-13188. Track total pending deletion size for FSO directories in OM via Recon #8591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
aryangupta1998
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch, changes look good to me!
devmadhuu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aryangupta1998 for the patch. Pls see comments.
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Outdated
Show resolved
Hide resolved
priyeshkaratha
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @tanvipenumudy for the patch. Changes looks fine for me.
devmadhuu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy for the patch. Pls check comment and also correct the PR description to correct the API end point as now you have created new API - /deletePending/dirs/size
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Show resolved
Hide resolved
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Outdated
Show resolved
Hide resolved
| while (iterator.hasNext()) { | ||
| Table.KeyValue<String, OmKeyInfo> kv = iterator.next(); | ||
| OmKeyInfo omKeyInfo = kv.getValue(); | ||
| if (omKeyInfo != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @tanvipenumudy
Currently, we iterate through the DeletedDirectoryTable, calculate the size of each deleted directory, and then find the sum total size. The issue arises when both the parent and child entries are encountered in the DeletedDirectoryTable, resulting in the same size being calculated twice.
Are we going to consider this scenario?
Or is this scenario even possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @ArafatKhan2198 for the review, yes for the FSO deleted directory space calculation - it is indeed possible that both the parent and child directory entries might coexist within the deletedDirectoryTable, but this wouldn't result in double counting of the child directory's size because of how Recon processes delete events today, let's consider the below two scenarios for example:
Case 1: /dir1/dir2, if dir1 (parent) is deleted first -> then /dir1 and /dir1/dir2 wouldn't coexist in the deletedDirectoryTable at any given point in time (the directories would land in the deletedDirectoryTable in order from top to bottom) -> this shouldn't cause any impact.
Case 2: /dir1/dir2, if /dir1/dir2 is deleted first following which /dir1 is deleted, then both /dir1/dir2 and /dir1 can coexist in the deletedDirectoryTable.
-
This wouldn't double-calculate the size of
/dir1/dir2when both/dir1/dir2and/dir1are in thedeletedDirectoryTablebecause Recon detaches the/dir1/dir2(child directory) from its parent (/dir1) in the NSSummaryTree as and when it sees this delete directory event. -
Our approach would iterate over Recon's
deletedDirectoryTableentries and recursively calculate the sizes of the entry's sub-directories:- We would first encounter
/dir1/dir2in thedeletedDirectoryTable, recursively calculate the sizes of directories under it. - We would then encounter
/dir1in thedeletedDirectoryTable, but since the child tree is detached -> we would only be recursively calculating the sizes of its existing sub-directories.
- We would first encounter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking another example for better clarity, consider the following directory structure:
- dir1/
- file1 (1 GB)
- dir4/
- file2 (1 GB)
- file6 (1 GB)
- dir2/
- file3 (1 GB)
- dir3/
- file4 (1 GB)
- file5 (1 GB)
Suppose /dir1/dir2/dir3 is deleted first, followed by the deletion of /dir1:
- First, the link between
/dir1/dir2and/dir1/dir2/dir3would be detached. - As per our approach, we would iterate over the
deletedDirectoryTableentries:- We would first encounter:
/dir1/dir2/dir3indeletedDirectoryTable
→/dir1/dir2/dir3.getSizeOfAllFiles()= 2 GB (file4 and file5) - We would then encounter:
/dir1indeletedDirectoryTable
→ size of/dir1would be calculated as:/dir1.sizeOfAllFiles()→ 1 GB (file1) +recursiveSizeOfSubDirs(/dir1):/dir1/dir4.sizeOfFiles()→ 2 GB (file2 and file6) + no sub-directories./dir1/dir2.sizeOfFiles()→ 1 GB (file3) + no sub-directories (as/dir1/dir2/dir3is detached).
- Total size of
/dir1= 1 GB + 2 GB + 1 GB = 4 GB.
- We would first encounter:
So the total FSO deleted directory space would be: 2 GB (/dir1/dir2/dir3) + 4 GB (/dir1 with detached child: /dir1/dir2/dir3) = 6 GB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks tanvi for the detailed explanation on this!
It makes sense.
2308801 to
c05b31e
Compare
ArafatKhan2198
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your tremendous effort on this PR, Tanvi, and for all the insightful discussions.
There's another bug in the code that was introduced earlier and has persisted. Let me walk you through the issue.
For example, let's consider this case:
Take the following path as an example: /dir1/dir2/f1.txt
Suppose the parent directory dir2 is deleted. In this scenario, the recon component can receive events in two possible orders:
Event 1: recon first receives the deletion event for dir2.
When this happens, dir2 is moved from the directoryTable to the deletedDirectoryTable. It is also unlinked from its parent directory dir1 but remains in the NamespaceSummary, so that our method fetchSizeForDeletedDirectory can still utilize it.
Event 2: After processing the deletion of dir2, recon then receives the deletion event for f1.txt.
When this occurs, f1.txt is moved from the fileTable to the deletedTable.
Now, here's the problem:
At this point, the size of f1.txt is subtracted from the NamespaceSummary object of dir2. This means the sizeOfFiles field for dir2 becomes zero. However, the actual data blocks for f1.txt have not been deleted yet—it still resides in the deletedTable. Therefore, the space has not been reclaimed, and the deleted directory still holds space on disk.
However, when we call the method fetchSizeForDeletedDirectory, it recursively traverses the directory tree and returns the size for the deleted directory dir2 as zero. This is incorrect, because f1.txt still exists in the deletedTable and is consuming disk space.
Conclusion:
This shows that summing up file sizes using only the NamespaceSummary leads to incorrect results. Even fetching the size for a single directory can produce a wrong value if file deletions are not handled correctly.
|
Thank you, @ArafatKhan2198, for the detailed explanation of the bug we encountered in today's code and for actively working on the corresponding ticket: HDDS-13479 and PR: #8836. As discussed offline, let's go ahead and merge the current PR into the feature branch for now. Once the fix has been merged into master, we can rebase the feature branch on top of the latest master to include the changes. |
ChenSammi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy , the last patch LGTM.
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We should not capture deleted files in NSSummary, it will be unlinked. Deleted files can be obtained from deletedTable. |
ArafatKhan2198
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks tanvi!
LGTM +1
What changes were proposed in this pull request?
The PR introduces a
GET/api/v1/keys/deletePending/dirs/size-summaryendpoint to includetotalUnreplicatedDataSizeandtotalReplicatedDataSizefields for tracking the total pending deletion size for FSO directories in OM.totalReplicatedDataSize: Total size of directories pending deletion with expected replicated size.totalUnreplicatedDataSize: Total raw size of directories pending deletion without replicated size.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13188
How was this patch tested?
Manual Testing: