-
Notifications
You must be signed in to change notification settings - Fork 587
HDDS-13180. Add replicatedSizeOfFiles to NSSummary #8568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
devabhishekpal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy for this patch.
I think this might need an upgrade handler in case the user is upgrading Recon.
Currently we do not have a handler in the traditional way like it is present for other DerbyDB tables.
In this case you would have to trigger a fresh build of NSSummary after upgrade to ensure that the new changes are reflected.
@ArafatKhan2198 could you help out with how this can be done?
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Show resolved
Hide resolved
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Outdated
Show resolved
Hide resolved
devabhishekpal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tanvipenumudy, overall the patch looks good to me.
Added a few minor comments.
The only thing missing right now is the upgrade handler for this patch.
@ArafatKhan2198 @devmadhuu could you help with taking a look at this patch as well?
Thanks
|
|
|
Thanks for the changes @tanvipenumudy. Could you add tests for the upgrade handler as well? |
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/OMDBInsightEndpoint.java
Outdated
Show resolved
Hide resolved
| ReconNamespaceSummaryManager nsSummaryManager = injector.getInstance(ReconNamespaceSummaryManager.class); | ||
| ReconOMMetadataManager omMetadataManager = injector.getInstance(ReconOMMetadataManager.class); | ||
| LOG.info("Starting full rebuild of NSSummary for REPLICATED_SIZE_OF_FILES upgrade..."); | ||
| nsSummaryManager.rebuildNSSummaryTree(omMetadataManager); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanvipenumudy , what's the estimation time of this rebuild process for a Recon DB, size similar to some of user bigger users?
If the upgrade will take hours, is it accepted? @devmadhuu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe something similar was done (tree rebuild process was triggered) when the parentId field was newly introduced (PR: #6492). Not sure if we have any existing numbers on this cc: @ArafatKhan2198.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per an offline discussion with @ArafatKhan2198, please find the stats below based on a previous benchmark:
- Total Data: 1.347 TB
- Directories: 5.7 million
- Keys: 8.7 million
- Tree Depth: 5-10 levels
The total time it took to (re)build the NSSummary tree was ~5 mins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy for the info. The Recon DB is 1.347TB for 5.7m dirs and 8.7m keys? The DB size is just beyond the expectation.
Currently "nsSummaryManager.rebuildNSSummaryTree(omMetadataManager)" is a blocking call. It returns when the rebuild finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about if we trigger rebuild as an async operation because we have plans to show the status update of recon tasks on Overview page if they are in progress. This may ease upgrade process efficient and quicker...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @devmadhuu, as discussed this change can be part of HDDS-13443!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Recon DB is 1.347TB for 5.7m dirs and 8.7m keys? The DB size is just beyond the expectation.
@ChenSammi, I believe this is the total cluster capacity (please correct me if I'm wrong @ArafatKhan2198).
"nsSummaryManager.rebuildNSSummaryTree(omMetadataManager)" blocks hours during Recon update, will there be any impact to Recon's other part functionality?
Thank you @devmadhuu and @ArafatKhan2198 for clarifying the second half of the question!
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestLeaseRecovery.java
Outdated
Show resolved
Hide resolved
a13752e to
dc6ae02
Compare
devmadhuu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tanvipenumudy , can you check a comment pls ?
| ReconNamespaceSummaryManager nsSummaryManager = injector.getInstance(ReconNamespaceSummaryManager.class); | ||
| ReconOMMetadataManager omMetadataManager = injector.getInstance(ReconOMMetadataManager.class); | ||
| LOG.info("Starting full rebuild of NSSummary for REPLICATED_SIZE_OF_FILES upgrade..."); | ||
| nsSummaryManager.rebuildNSSummaryTree(omMetadataManager); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about if we trigger rebuild as an async operation because we have plans to show the status update of recon tasks on Overview page if they are in progress. This may ease upgrade process efficient and quicker...
Thank you @devmadhuu for the review. If we look at the |
ArafatKhan2198
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
Thanks a lot for working on this @tanvipenumudy.
|
Hi @ArafatKhan2198 , @devmadhuu do you have any idea if "nsSummaryManager.rebuildNSSummaryTree(omMetadataManager)" blocks hours during Recon update, will there be any impact to Recon's other part functionality? |
@ChenSammi |
...on/src/test/java/org/apache/hadoop/ozone/recon/TestReconInsightsForDeletedDirectoriesEC.java
Outdated
Show resolved
Hide resolved
Quick Facts about the DB I recently tested at :
|
devabhishekpal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy.
This LGTM, +1
|
Thank you @devabhishekpal, @ChenSammi, @devmadhuu, @ArafatKhan2198, @adoroszlai and @sumitagrawl for reviewing the patch! |
What changes were proposed in this pull request?
replicatedSizeOfFilesfield to NSSummary to track the total expected replicated size of a directory's sub-files.PUTandDELETEkey events./keys/deletePending/dirsis also updated to include the replicated size information.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13180
How was this patch tested?
Manual Testing: