-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HDFS-15621. Datanode DirectoryScanner uses excessive memory #2849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
The spotbugs warning looks like a false positive to me. |
jojochuang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patch looks reasonable to me. Thanks for working on this @sodonnel !
Just a little suggestion for javadoc
| * | ||
| * @param blockId the block ID | ||
| * @param blockFile the path to the block data file | ||
| * @param metaFile the path to the block meta-data file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a @param for basePath. Also add that metaFile stores only the suffix.
.../hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java
Show resolved
Hide resolved
|
@jojochuang Thanks for the review - I have pushed a new commit with the change. Please have a look when you get a chance. |
|
💔 -1 overall
This message was automatically generated. |
jojochuang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks!
…Contributed by Stephen O'Donnell (cherry picked from commit 605ed85) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
…Contributed by Stephen O'Donnell (cherry picked from commit 605ed85) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java (cherry picked from commit f6efb58) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java
…Contributed by Stephen O'Donnell (cherry picked from commit 605ed85) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java (cherry picked from commit f6efb58) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java (cherry picked from commit 7a81e50)
). Contributed by Stephen O'Donnell (cherry picked from commit 605ed85) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java (cherry picked from commit f6efb58) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java (cherry picked from commit 7a81e50) (cherry picked from commit 71a9885) Change-Id: I0974bdb6d05f8bd7741071b855ffce4389be969a (cherry picked from commit 97cfe8d)
This is a relatively simple change to reduce the memory used by the Directory Scanner and also simplify the logic in the ScanInfo object.
This change ensures the same File object is re-used for all blocks in a directory. Previously a large part of the path was repeated for each block file.
Aside from that, the logic of the directory scanner remains the same.
Comparing heap dumps, the memory used by 100K blocks goes from ~35MB to 19MB. Or 350MB per 1M blocks down to 190MB per 1M blocks. This is a reduction of about 46%.