HDDS-13432. Accelerating Namespace Usage Calculation in Recon using - Aggregation Approach #8797
devmadhuu
left a comment
@ArafatKhan2198 Thanks for the patch. Please check a few comments.
...one/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/NSSummaryTaskDbEventHandler.java
Is this true? I believe it also holds for the diskUsage (Namespace usage) page.

Also, please resolve the conflicts.
I found a bug in my code that handles directory deletions. When a directory is deleted, the parent directory's total file count and size are updated correctly, but its file size distribution histogram is not. The parent then reports incorrect size categories for its files.

Simple Example

Imagine a directory …

The file size distribution for …

When …

I have applied a fix that ensures that when a directory is deleted, the parent directory's file size distribution is updated correctly by subtracting the deleted directory's own file size distribution. After the fix, deleting …

I have tested this fix with unit tests for both file and directory deletion scenarios to ensure everything works correctly.
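The histogram fix described above amounts to a per-bucket subtraction. A minimal sketch, assuming the distribution is held as an `int[]` of per-bucket file counts; the class and method names here are hypothetical and are not taken from the patch:

```java
public class FileSizeHistogramSketch {

  // Hypothetical helper: subtracts the deleted directory's per-bucket file
  // counts from the parent's histogram, in place.
  static void subtractDistribution(int[] parentBuckets, int[] deletedBuckets) {
    if (parentBuckets.length != deletedBuckets.length) {
      throw new IllegalArgumentException("Histograms must have the same number of buckets");
    }
    for (int i = 0; i < parentBuckets.length; i++) {
      parentBuckets[i] -= deletedBuckets[i];
    }
  }
}
```

The same subtraction would have to be repeated for every ancestor if the histogram were also materialized upward, which is part of why this change was later split out for separate discussion.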
Hi @sumitagrawl and @devmadhuu, I have reverted the changes. We are no longer focusing on the File Bucket Distribution for now; we can consider implementing it later and track it through HDDS-13539 after further discussion. For the time being, please review the current state so we can proceed with the implementation.

Materialized Size/Count Optimization - Event Handling Rules

Every change must update the immediate parent AND propagate to ALL ancestors up to the root. This ensures every directory knows its total size/count including all subdirectories.

File Addition
What happens: New file created
Action:

File Deletion
What happens: File removed
Action:

Directory Addition
What happens: Directory explicitly created (might already exist with content)
Action:

Directory Deletion
What happens: Directory and all contents removed
Action:
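The core rule above (update the immediate parent, then every ancestor up to the root) can be sketched for the file addition and file deletion cases as follows. This is an illustration under stated assumptions, not the handler from the patch: the `Summary` class, the in-memory `HashMap` standing in for the NSSummary table, and the use of `0L` as the "no parent" marker are all hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.Map;

public class AncestorPropagationSketch {

  // Assumption: a parentId of 0L means "no parent", so the walk stops there.
  static final long NO_PARENT = 0L;

  // Hypothetical stand-in for a per-directory summary record.
  static final class Summary {
    long parentId;
    long numOfFiles;   // files in this directory and all subdirectories
    long sizeOfFiles;  // bytes in this directory and all subdirectories

    Summary(long parentId) {
      this.parentId = parentId;
    }
  }

  // In-memory stand-in for the summary table, keyed by directory object ID.
  private final Map<Long, Summary> table = new HashMap<>();

  void addDirectory(long id, long parentId) {
    table.put(id, new Summary(parentId));
  }

  Summary get(long id) {
    return table.get(id);
  }

  void onFileAdded(long parentId, long size) {
    propagate(parentId, 1, size);
  }

  void onFileDeleted(long parentId, long size) {
    propagate(parentId, -1, -size);
  }

  // Applies the delta to the starting directory and to each ancestor in turn.
  private void propagate(long startId, long countDelta, long sizeDelta) {
    long id = startId;
    while (id != NO_PARENT) {
      Summary s = table.get(id);
      if (s == null) {
        break; // unknown ancestor: stop rather than guess
      }
      s.numOfFiles += countDelta;
      s.sizeOfFiles += sizeDelta;
      id = s.parentId;
    }
  }
}
```

Each event costs one update per level of directory depth, which is the price paid at write time so that totals can be read without traversing the subtree.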
sumitagrawl
left a comment
@ArafatKhan2198 Thanks for working on this. I have left a few comments.
...one/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/NSSummaryTaskDbEventHandler.java
sumitagrawl
left a comment
LGTM
devmadhuu
left a comment
Thanks @ArafatKhan2198 for the improved changes. LGTM now +1
* master: (55 commits)
  HDDS-13525. Rename configuration property to ozone.om.compaction.service.enabled (apache#8928)
  HDDS-13519. Reconciliation should continue if a peer datanode is unreachable (apache#8908)
  HDDS-13566. Fix incorrect authorizer class in ACL documentation (apache#8931)
  HDDS-13084. Trigger on-demand container scan when a container moves from open to unhealthy. (apache#8904)
  HDDS-13432. Accelerating Namespace Usage Calculation in Recon using - Materialised Approach (apache#8797)
  HDDS-13557. Bump jline to 3.30.5 (apache#8920)
  HDDS-13556. Bump assertj-core to 3.27.4 (apache#8919)
  HDDS-13543. [Docs] Design doc for OM bootstrapping process with snapshots. (apache#8900)
  HDDS-13541. Bump sonar-maven-plugin to 5.1.0.4751 (apache#8911)
  HDDS-13101. Remove duplicate information in datanode list output (apache#8523)
  HDDS-13528. Handle null paths when the NSSummary is initializing (apache#8901)
  HDDS-12990. (addendum) Generate tree from metadata when it does not exist during getContainerChecksumInfo call (apache#8881)
  HDDS-13086. Block duplicate reconciliation requests for the same container and datanode within the datanode. (apache#8905)
  HDDS-12990. Generate tree from metadata when it doesn't exist during getContainerChecksumInfo call (apache#8881)
  HDDS-12824. Optimize container checksum read during datanode startup (apache#8604)
  HDDS-13522. Rename axisLabel for No. of delete request received (apache#8879)
  HDDS-12196. Document ozone repair cli (apache#8849)
  HDDS-13514. Intermittent failure in TestNSSummaryMemoryLeak (apache#8889)
  HDDS-13423. Log reason for triggering on-demand container scan (apache#8854)
  HDDS-13466. Disable flaky TestOmSnapshotFsoWithNativeLibWithLinkedBuckets
  ...
What changes were proposed in this pull request?
This pull request introduces a "materialized view" approach for Recon's Namespace Summary (NSSummary), significantly improving the speed of namespace usage calculations.

Approach and Implementation Details:

- … NSSummary: the numOfFiles and sizeOfFiles fields in org.apache.hadoop.ozone.recon.api.types.NSSummary now store the total number and size of files within a directory and all its subdirectories, not just direct files. This enables constant-time (O(1)) queries for aggregate usage.
- org.apache.hadoop.ozone.recon.tasks.NSSummaryTaskDbEventHandler is enhanced to immediately propagate changes:
  - … NSSummary totals.
  - … NSSummary is unlinked (its parentId set to 0L), and its total file count and size (at the moment of deletion) are decremented from its parent and all ancestors. The deleted directory's NSSummary itself retains its historical totals.
- org.apache.hadoop.ozone.recon.api.handlers.EntityHandler now directly uses the pre-calculated numOfFiles and sizeOfFiles from NSSummary for total counts and sizes, eliminating the need for recursive tree traversals at query time.

What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13432
How was this patch tested?
Verified the changes manually and added unit tests that cover them.
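The directory-deletion behaviour described in the PR summary (unlink the directory by setting its parentId to 0L, decrement its totals from every ancestor, keep its own totals) and the O(1) read path can be sketched as below. This is a simplified illustration, not the code from the patch: the `Summary` class, the `HashMap` standing in for the NSSummary table, and the method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class DirectoryDeleteSketch {

  // Hypothetical stand-in for a per-directory summary record.
  static final class Summary {
    long parentId;
    long numOfFiles;   // materialized total for the whole subtree
    long sizeOfFiles;  // materialized total for the whole subtree

    Summary(long parentId, long numOfFiles, long sizeOfFiles) {
      this.parentId = parentId;
      this.numOfFiles = numOfFiles;
      this.sizeOfFiles = sizeOfFiles;
    }
  }

  // In-memory stand-in for the summary table, keyed by directory object ID.
  private final Map<Long, Summary> table = new HashMap<>();

  void put(long id, long parentId, long numOfFiles, long sizeOfFiles) {
    table.put(id, new Summary(parentId, numOfFiles, sizeOfFiles));
  }

  Summary get(long id) {
    return table.get(id);
  }

  void onDirectoryDeleted(long dirId) {
    Summary deleted = table.get(dirId);
    if (deleted == null) {
      return;
    }
    long id = deleted.parentId;
    deleted.parentId = 0L; // unlink; the deleted summary keeps its own totals
    while (id != 0L) {
      Summary ancestor = table.get(id);
      if (ancestor == null) {
        break; // unknown ancestor: stop rather than guess
      }
      ancestor.numOfFiles -= deleted.numOfFiles;
      ancestor.sizeOfFiles -= deleted.sizeOfFiles;
      id = ancestor.parentId;
    }
  }

  // Query path: a single lookup, with no recursive walk over children.
  long totalSize(long dirId) {
    Summary s = table.get(dirId);
    return s == null ? 0L : s.sizeOfFiles;
  }
}
```

Because the deleted directory is unlinked before the walk finishes, a repeated delete event for the same directory finds a parentId of 0L and changes nothing further.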