Merged
@@ -676,6 +676,29 @@ private void getPendingForDeletionDirInfo(
    }
  }

  private void calculateTotalPendingDeletedDirSizes(Map<String, Long> dirSummary) {
    long totalDataSize = 0L;
    long totalReplicatedDataSize = 0L;

    Table<String, OmKeyInfo> deletedDirTable = omMetadataManager.getDeletedDirTable();
    try (TableIterator<String, ? extends Table.KeyValue<String, OmKeyInfo>> iterator = deletedDirTable.iterator()) {
      while (iterator.hasNext()) {
        Table.KeyValue<String, OmKeyInfo> kv = iterator.next();
        OmKeyInfo omKeyInfo = kv.getValue();
        if (omKeyInfo != null) {
Comment on lines +685 to +688
Contributor

Thanks for working on this @tanvipenumudy
Currently, we iterate through the DeletedDirectoryTable, calculate the size of each deleted directory, and then sum the sizes. An issue could arise when both a parent and its child entry are present in the DeletedDirectoryTable: the child's size would be counted twice.

Are we going to consider this scenario?
Or is this scenario even possible?

Contributor Author

Thank you @ArafatKhan2198 for the review. Yes, for the FSO deleted directory space calculation, it is indeed possible that both the parent and child directory entries coexist within the deletedDirectoryTable. However, this wouldn't result in double counting of the child directory's size because of how Recon processes delete events today. Consider the two scenarios below:

Case 1: For /dir1/dir2, if /dir1 (the parent) is deleted first, then /dir1 and /dir1/dir2 would not coexist in the deletedDirectoryTable at any point in time (directories land in the deletedDirectoryTable in order from top to bottom), so there is no impact.

Case 2: For /dir1/dir2, if /dir1/dir2 is deleted first and /dir1 is deleted afterwards, then both /dir1/dir2 and /dir1 can coexist in the deletedDirectoryTable.

  • This wouldn't double count the size of /dir1/dir2 when both entries are in the deletedDirectoryTable, because Recon detaches /dir1/dir2 (the child directory) from its parent (/dir1) in the NSSummaryTree as soon as it sees the delete directory event.

  • Our approach iterates over Recon's deletedDirectoryTable entries and recursively calculates the sizes of each entry's sub-directories:

    • We would first encounter /dir1/dir2 in the deletedDirectoryTable and recursively calculate the sizes of the directories under it.
    • We would then encounter /dir1 in the deletedDirectoryTable, but since the child tree is detached, we would only recursively calculate the sizes of its remaining sub-directories.

Contributor Author

@tanvipenumudy tanvipenumudy Jul 16, 2025

Taking another example for better clarity, consider the following directory structure:

- dir1/
  - file1 (1 GB)
  - dir4/
    - file2 (1 GB)
    - file6 (1 GB)
  - dir2/
    - file3 (1 GB)
    - dir3/
      - file4 (1 GB)
      - file5 (1 GB)

Suppose /dir1/dir2/dir3 is deleted first, followed by the deletion of /dir1:

  • First, the link between /dir1/dir2 and /dir1/dir2/dir3 would be detached.
  • As per our approach, we would iterate over the deletedDirectoryTable entries:
    • We would first encounter: /dir1/dir2/dir3 in deletedDirectoryTable
      /dir1/dir2/dir3.getSizeOfAllFiles() = 2 GB (file4 and file5)
    • We would then encounter: /dir1 in deletedDirectoryTable
      → size of /dir1 would be calculated as:
      • /dir1.sizeOfAllFiles() = 1 GB (file1) + recursiveSizeOfSubDirs(/dir1):
        • /dir1/dir4.sizeOfFiles() = 2 GB (file2 and file6) + no sub-directories.
        • /dir1/dir2.sizeOfFiles() = 1 GB (file3) + no sub-directories (as /dir1/dir2/dir3 is detached).
      • Total size of /dir1 = 1 GB + 2 GB + 1 GB = 4 GB.

So the total FSO deleted directory space would be: 2 GB (/dir1/dir2/dir3) + 4 GB (/dir1 with detached child: /dir1/dir2/dir3) = 6 GB.
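
The arithmetic above can be reproduced with a small standalone sketch. It is only an illustration of the detach-then-sum idea: the Dir class, detach(), sizeOf() and totalPendingSize() are hypothetical names made up for this example and do not correspond to Recon's NSSummary classes or to fetchSizeForDeletedDirectory. Sizes are in GB, and replication is ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class DeletedDirSizeSketch {

  /** Hypothetical in-memory directory node; NOT Recon's NSSummary. */
  static final class Dir {
    final String name;
    final long sizeOfFilesGb;                 // files directly inside this directory
    final List<Dir> children = new ArrayList<>();
    Dir parent;

    Dir(String name, long sizeOfFilesGb) {
      this.name = name;
      this.sizeOfFilesGb = sizeOfFilesGb;
    }

    Dir addChild(Dir child) {
      child.parent = this;
      children.add(child);
      return this;
    }
  }

  /** Models the delete event: unlink the directory from its parent, if it has one. */
  static void detach(Dir dir) {
    if (dir.parent != null) {
      dir.parent.children.remove(dir);
      dir.parent = null;
    }
  }

  /** Recursively sums a directory's own files and those of the children still attached. */
  static long sizeOf(Dir dir) {
    long total = dir.sizeOfFilesGb;
    for (Dir child : dir.children) {
      total += sizeOf(child);
    }
    return total;
  }

  /** Sums the recursive size of every entry in the modelled deleted directory table. */
  static long totalPendingSize(List<Dir> deletedDirTable) {
    long total = 0L;
    for (Dir entry : deletedDirTable) {
      total += sizeOf(entry);
    }
    return total;
  }

  /** Builds the example tree, then deletes /dir1/dir2/dir3 followed by /dir1. */
  static long runExample(boolean detachOnDelete) {
    Dir dir3 = new Dir("dir3", 2);                          // file4 + file5
    Dir dir2 = new Dir("dir2", 1).addChild(dir3);           // file3
    Dir dir4 = new Dir("dir4", 2);                          // file2 + file6
    Dir dir1 = new Dir("dir1", 1).addChild(dir4).addChild(dir2); // file1

    List<Dir> deletedDirTable = new ArrayList<>();
    if (detachOnDelete) {
      detach(dir3);
    }
    deletedDirTable.add(dir3);
    if (detachOnDelete) {
      detach(dir1);                                         // no parent, so a no-op
    }
    deletedDirTable.add(dir1);
    return totalPendingSize(deletedDirTable);
  }

  public static void main(String[] args) {
    System.out.println(runExample(true) + " GB");   // prints "6 GB"
    System.out.println(runExample(false) + " GB");  // prints "8 GB"
  }
}
```

With the detach step the total is 2 GB + 4 GB = 6 GB. If the child were left attached, /dir1 would come out as 6 GB and the total as 8 GB, which is the double counting raised in the review.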

Contributor

Thanks tanvi for the detailed explanation on this!
It makes sense.

          Pair<Long, Long> sizeInfo = fetchSizeForDeletedDirectory(omKeyInfo.getObjectID());
          totalDataSize += sizeInfo.getLeft();
          totalReplicatedDataSize += sizeInfo.getRight();
        }
      }
    } catch (IOException ex) {
      throw new WebApplicationException(ex, Response.Status.INTERNAL_SERVER_ERROR);
    }

    dirSummary.put("totalDataSize", totalDataSize);
    dirSummary.put("totalReplicatedDataSize", totalReplicatedDataSize);
  }

  /**
   * Given an object ID, return total data size as a pair of Total Size, Total Replicated Size
   * under this object. Note:- This method is RECURSIVE.
@@ -790,6 +813,28 @@ public Response getDeletedDirectorySummary() {
    return Response.ok(dirSummary).build();
  }

  /**
   * Retrieves the summary of the total delete pending directory size (unreplicated and replicated).
   *
   * @return The HTTP response body includes a map with the following entries:
   * - "totalDataSize": the total unreplicated size of delete pending directories.
   * - "totalReplicatedDataSize": the total replicated size of delete pending directories.
   *
   * Example response:
   * {
   *   "totalDataSize": 30000,
   *   "totalReplicatedDataSize": 90000
   * }
   */
  @GET
  @Path("/deletePending/dirs/size-summary")
  public Response getTotalDeletedDirectorySizeSummary() {
    Map<String, Long> dirSummary = new HashMap<>();
    // Compute the total unreplicated and replicated size of delete pending directories
    calculateTotalPendingDeletedDirSizes(dirSummary);
    return Response.ok(dirSummary).build();
  }
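
For readers consuming this endpoint, here is a minimal client-side sketch of reading the two fields from the example response. It makes no network call, because the base URL of the resource is not shown in this diff. SizeSummaryResponseSketch and parse() are hypothetical names, and the regex only handles a flat JSON object of integer values; a real client would more likely use a JSON library. The sketch reads totalDataSize as the unreplicated size and totalReplicatedDataSize as the replicated size, as the field names and the calculation code suggest.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SizeSummaryResponseSketch {

  // Matches "key": 123 pairs; an assumption that the body is a flat object of integers.
  private static final Pattern ENTRY = Pattern.compile("\"(\\w+)\"\\s*:\\s*(\\d+)");

  /** Extracts every "name": number pair from the response body into a map. */
  static Map<String, Long> parse(String body) {
    Map<String, Long> summary = new HashMap<>();
    Matcher matcher = ENTRY.matcher(body);
    while (matcher.find()) {
      summary.put(matcher.group(1), Long.parseLong(matcher.group(2)));
    }
    return summary;
  }

  public static void main(String[] args) {
    // Body copied from the example response in the Javadoc.
    String body = "{ \"totalDataSize\": 30000, \"totalReplicatedDataSize\": 90000 }";
    Map<String, Long> summary = parse(body);

    long unreplicated = summary.get("totalDataSize");
    long replicated = summary.get("totalReplicatedDataSize");
    System.out.println("unreplicated=" + unreplicated + ", replicated=" + replicated);
  }
}
```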

  /**
   * This API will list out limited 'count' number of keys after applying below filters in API parameters:
   * Default Values of API param filters: