HDDS-13445. Make ozone debug replicas chunk-info stream json output between datanode calls
#8914
@Gargi-jais11 the PR description does not seem to match the actual change, and the branch references HDDS-13445, not HDDS-12998.
ozone debug replicas chunk-info stream json output between datanode calls
So sorry. I have changed the PR description.
...-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/replicas/chunk/ChunkKeyHandler.java
7dc7c36 to 03eaad3
errose28
left a comment
Thanks for the improvement. Mostly looks good. I tested it manually as well. Just two minor comments.
...-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/replicas/chunk/ChunkKeyHandler.java
sarvekshayr
left a comment
Tested locally and verified the streaming output. Other than Ethan’s comments, overall LGTM.
errose28
left a comment
LGTM
Thanks @Gargi-jais11 for the patch, and @errose28 @sarvekshayr for the reviews.
What changes were proposed in this pull request?
ozone debug replicas chunk-info prints chunk information from all replicas for all chunks of all blocks within a file. It currently gathers all the information from the datanodes, stores it in memory, then prints it all at once. For large files in the GB range, this can leave a large amount of information held in client memory before anything is printed. It would be better to print information about one block at a time, between successive getBlock calls to the datanode. The JSON structure can remain the same.
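To illustrate the idea, here is a minimal sketch of per-block streaming using only the Java standard library. It is not the ChunkKeyHandler code from this patch: the class name, the `blocks`/`blockId`/`chunks` field names, and the `fetchChunkNames` function (a stand-in for the per-block getBlock call) are all hypothetical. The point is only that the enclosing JSON array can be opened once, each block written and flushed as soon as it is fetched, and the array closed at the end, so the final document has the same shape as one built in memory.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch; names and JSON fields are assumptions, not Ozone's API.
final class StreamingChunkInfoSketch {

  /**
   * Writes one JSON object per block as soon as that block's chunk names
   * are available, instead of collecting every block before printing.
   */
  static void streamBlocks(List<Long> blockIds,
      Function<Long, List<String>> fetchChunkNames,
      Writer out) throws IOException {
    out.write("{\"blocks\":[");
    boolean first = true;
    for (long blockId : blockIds) {
      // Stand-in for the per-block call to a datanode (assumption).
      List<String> chunks = fetchChunkNames.apply(blockId);
      if (!first) {
        out.write(",");
      }
      first = false;
      out.write("{\"blockId\":" + blockId + ",\"chunks\":[");
      for (int i = 0; i < chunks.size(); i++) {
        if (i > 0) {
          out.write(",");
        }
        // Chunk names are assumed to need no JSON escaping in this sketch.
        out.write("\"" + chunks.get(i) + "\"");
      }
      out.write("]}");
      // Flush so this block is visible before the next fetch starts.
      out.flush();
    }
    out.write("]}");
    out.flush();
  }

  public static void main(String[] args) throws IOException {
    StringWriter out = new StringWriter();
    streamBlocks(Arrays.asList(1L, 2L),
        id -> Arrays.asList("chunk_" + id + "_a", "chunk_" + id + "_b"),
        out);
    System.out.println(out);
  }
}
```

With this shape, client memory holds at most one block's worth of chunk information at a time. A real implementation would more likely use a streaming JSON writer from a JSON library than hand-built strings.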
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-13445
How was this patch tested?
Ran manually on a Docker cluster.