@ArafatKhan2198
Contributor

What changes were proposed in this pull request?

When a directory and its files are deleted, fetchSizeForDeletedDirectory returns incorrect size calculations because it relies solely on NamespaceSummary data, which doesn't account for files that still exist in deletedTable.

Steps to Reproduce:

1. Create directory structure: /dir1/dir2/f1.txt
2. Delete dir2 (Event 1: directory deletion)
3. Delete f1.txt (Event 2: file deletion)
4. Call fetchSizeForDeletedDirectory for dir2

Expected: Returns the actual disk space consumed by f1.txt, which still exists in deletedTable
Actual: Returns 0 because the size of f1.txt has already been subtracted from NamespaceSummary.sizeOfFiles

Root Cause: NamespaceSummary tracks deleted files by subtracting their size, but deletedTable still contains the actual data blocks. fetchSizeForDeletedDirectory only considers NamespaceSummary data, leading to incorrect size calculations.

Impact: Incorrect space reclamation reports and misleading storage analytics.
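
The scenario above can be modelled with a small, self-contained Java sketch. It is not Recon code: `DeletedDirSizeSketch`, `DeletedFile`, and every method name here are hypothetical stand-ins, and the in-memory map and list only imitate NamespaceSummary and deletedTable. Directory deletion (Event 1) is omitted because it does not change any size in this toy model. The sketch shows why a lookup that reads only the summary reports 0 while the deleted file's bytes are still accounted for elsewhere.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the reported scenario; NOT the Recon implementation. */
class DeletedDirSizeSketch {

  /** Hypothetical deletedTable row: a deleted file whose blocks are not yet purged. */
  static final class DeletedFile {
    final long parentId;
    final long size;

    DeletedFile(long parentId, long size) {
      this.parentId = parentId;
      this.size = size;
    }
  }

  // Stand-in for NamespaceSummary.sizeOfFiles, keyed by directory object ID.
  private final Map<Long, Long> sizeOfFiles = new HashMap<>();
  // Stand-in for deletedTable.
  private final List<DeletedFile> deletedTable = new ArrayList<>();

  void createFile(long parentId, long size) {
    sizeOfFiles.merge(parentId, size, Long::sum);
  }

  /** File deletion: the summary shrinks, but the data moves to deletedTable. */
  void deleteFile(long parentId, long size) {
    sizeOfFiles.merge(parentId, -size, Long::sum);
    deletedTable.add(new DeletedFile(parentId, size));
  }

  /** Mirrors the behaviour described as "Actual": summary data only. */
  long sizeFromSummaryOnly(long dirId) {
    return sizeOfFiles.getOrDefault(dirId, 0L);
  }

  /** Mirrors the behaviour described as "Expected": also count deletedTable rows. */
  long sizeIncludingDeletedTable(long dirId) {
    long total = sizeFromSummaryOnly(dirId);
    for (DeletedFile f : deletedTable) {
      if (f.parentId == dirId) {
        total += f.size;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    DeletedDirSizeSketch recon = new DeletedDirSizeSketch();
    long dir2 = 2L;                  // hypothetical object ID for /dir1/dir2
    recon.createFile(dir2, 1024L);   // f1.txt, 1024 bytes (size chosen for illustration)
    recon.deleteFile(dir2, 1024L);   // Event 2: file deletion
    System.out.println(recon.sizeFromSummaryOnly(dir2));        // prints 0
    System.out.println(recon.sizeIncludingDeletedTable(dir2));  // prints 1024
  }
}
```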

This pull request updates the Recon NamespaceSummary logic to correctly account for deleted files and directories, ensuring accurate size reporting for deleted directories.

Key code changes include:

  • NamespaceSummary Model & Codec:
    • Extended the NSSummary class and its codec to track the number and size of deleted files and directories, as well as the set of deleted child directories.
    • Ensured that serialization and deserialization of NSSummary objects now include these new deleted-tracking fields, with backward compatibility for older data.
  • Event Handling Logic:
    • Updated the event processing in NSSummaryTaskWithFSO to properly update the deleted file and directory tracking fields in NSSummary when deletion events are received.
    • When a file is deleted, its size and count are now added to the deleted tracking fields of the parent directory’s NSSummary, rather than simply subtracting from the active file size/count.
    • When a directory is deleted, its object ID is added to the parent’s deletedChildDir set, and the deleted directory count is incremented.
  • Size Calculation Consistency:
    • Ensured that methods which fetch the size for deleted directories (such as fetchSizeForDeletedDirectory) now consider both the deleted files and directories, preventing under-reporting of disk usage when files are deleted after their parent directory.
  • Backward Compatibility:
    • The codec and event handling logic are designed to be backward compatible, so existing data and event flows will not break.
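
As a rough illustration of the changes listed above, the following sketch shows one way the deleted-tracking fields, the event updates, and a backward-compatible codec could fit together. It is an assumption-laden sketch, not the patch itself: `NSSummarySketch`, the field names `numOfDeletedFiles`, `sizeOfDeletedFiles`, and `numOfDeletedDirs`, the method names, and the byte layout are all hypothetical. Only `sizeOfFiles` and `deletedChildDir` are names taken from the description. It also assumes that a file deletion still reduces the active counters while additionally growing the deleted counters.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

/** Illustrative stand-in for NSSummary; not the class from the patch. */
class NSSummarySketch {
  int numOfFiles;
  long sizeOfFiles;
  int numOfDeletedFiles;    // hypothetical field name
  long sizeOfDeletedFiles;  // hypothetical field name
  int numOfDeletedDirs;     // hypothetical field name
  final Set<Long> deletedChildDir = new HashSet<>();

  void onFileCreated(long size) {
    numOfFiles++;
    sizeOfFiles += size;
  }

  /** Assumption: active counters shrink and deleted counters grow together. */
  void onFileDeleted(long size) {
    numOfFiles--;
    sizeOfFiles -= size;
    numOfDeletedFiles++;
    sizeOfDeletedFiles += size;
  }

  /** Records the child's object ID; a repeated event does not double count. */
  void onChildDirDeleted(long childObjectId) {
    if (deletedChildDir.add(childObjectId)) {
      numOfDeletedDirs++;
    }
  }

  /** Hypothetical layout: legacy fields first, deleted-tracking fields appended. */
  static byte[] encode(NSSummarySketch s) {
    ByteBuffer buf = ByteBuffer.allocate(32 + 8 * s.deletedChildDir.size());
    buf.putInt(s.numOfFiles).putLong(s.sizeOfFiles);
    buf.putInt(s.numOfDeletedFiles).putLong(s.sizeOfDeletedFiles);
    buf.putInt(s.numOfDeletedDirs).putInt(s.deletedChildDir.size());
    for (long id : s.deletedChildDir) {
      buf.putLong(id);
    }
    return buf.array();
  }

  /** Older records end after the legacy fields; the new fields then stay at zero. */
  static NSSummarySketch decode(byte[] raw) {
    ByteBuffer buf = ByteBuffer.wrap(raw);
    NSSummarySketch s = new NSSummarySketch();
    s.numOfFiles = buf.getInt();
    s.sizeOfFiles = buf.getLong();
    if (buf.hasRemaining()) {
      s.numOfDeletedFiles = buf.getInt();
      s.sizeOfDeletedFiles = buf.getLong();
      s.numOfDeletedDirs = buf.getInt();
      int count = buf.getInt();
      for (int i = 0; i < count; i++) {
        s.deletedChildDir.add(buf.getLong());
      }
    }
    return s;
  }
}
```

Appending the new fields after the legacy ones lets a reader detect old records simply by running out of bytes. Whether the actual codec uses this approach or an explicit version marker is not stated in the description.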

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-13479

How was this patch tested?

Tested manually.

@sumitagrawl
Contributor

We should not have this tracking.
The size is already tracked in deletedTable, so this is redundant and adds complexity.

We can close this JIRA and PR.

@sumitagrawl left a comment
Contributor

No need to have this tracking.

@github-actions

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

github-actions bot added the stale label Nov 11, 2025
@github-actions

Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it.

github-actions bot closed this Nov 25, 2025