Skip to content

Comments

HDFS-16610. Make fsck read timeout configurable#4384

Merged
sodonnel merged 2 commits intoapache:trunkfrom
sodonnel:HDFS-16610
Jun 1, 2022
Merged

HDFS-16610. Make fsck read timeout configurable#4384
sodonnel merged 2 commits intoapache:trunkfrom
sodonnel:HDFS-16610

Conversation

@sodonnel
Copy link
Contributor

Description of PR

In a cluster with a lot of small files, we encountered a case where fsck was very slow. I believe it is due to contention with many other threads reading / writing data on the cluster.

Sometimes fsck does not report any progress for more than 60 seconds and the client times out. Currently the connect and read timeout are hardcoded to 60 seconds. This change is to make them configurable.

How was this patch tested?

Tested manually by inserting a sleep into the fsck logic in the NN. I then adjusted the read timeout to validate I got a timeout or not depending on the timeout setting.

@sodonnel sodonnel requested a review from szetszwo May 31, 2022 12:01
Comment on lines 276 to 282
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to restrict the config to ms? Can keep only dfs.client.fsck.connect.timeout similar to dfs.webhdfs.socket.connect-timeout and allow timeunits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion makes sense. I have changed it to use the same technique as with webhdfs.

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, if the build has no complains...

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 48s Maven dependency ordering for branch
+1 💚 mvninstall 24m 47s trunk passed
-1 ❌ compile 2m 24s /branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt hadoop-hdfs-project in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.
-1 ❌ compile 2m 17s /branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt hadoop-hdfs-project in trunk failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.
+1 💚 checkstyle 1m 36s trunk passed
+1 💚 mvnsite 3m 2s trunk passed
+1 💚 javadoc 2m 22s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 2m 44s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 24s trunk passed
+1 💚 shadedclient 23m 1s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 2m 22s the patch passed
-1 ❌ compile 2m 11s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt hadoop-hdfs-project in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.
-1 ❌ javac 2m 11s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt hadoop-hdfs-project in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.
-1 ❌ compile 2m 0s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt hadoop-hdfs-project in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.
-1 ❌ javac 2m 0s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt hadoop-hdfs-project in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 14s the patch passed
+1 💚 mvnsite 2m 23s the patch passed
+1 💚 javadoc 1m 42s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 2m 13s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 5s the patch passed
+1 💚 shadedclient 22m 27s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 38s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 253m 58s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 10s The patch does not generate ASF License warnings.
383m 48s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/Dockerfile
GITHUB PR #4384
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 1c7bf3cc457d 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 60f2dd691b5106cdec358315f2a58d217530d691
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/testReport/
Max. process+thread count 3565 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 47s Maven dependency ordering for branch
+1 💚 mvninstall 24m 49s trunk passed
-1 ❌ compile 2m 32s /branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt hadoop-hdfs-project in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.
-1 ❌ compile 2m 17s /branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt hadoop-hdfs-project in trunk failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.
+1 💚 checkstyle 1m 34s trunk passed
+1 💚 mvnsite 3m 3s trunk passed
+1 💚 javadoc 2m 22s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 2m 38s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 35s trunk passed
+1 💚 shadedclient 22m 52s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 2m 21s the patch passed
-1 ❌ compile 2m 20s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt hadoop-hdfs-project in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.
-1 ❌ javac 2m 20s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt hadoop-hdfs-project in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.
-1 ❌ compile 1m 57s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt hadoop-hdfs-project in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.
-1 ❌ javac 1m 57s /patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt hadoop-hdfs-project in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 18s the patch passed
+1 💚 mvnsite 2m 32s the patch passed
+1 💚 javadoc 1m 46s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 2m 14s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 26s the patch passed
+1 💚 shadedclient 22m 29s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 36s hadoop-hdfs-client in the patch passed.
+1 💚 unit 250m 56s hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 13s The patch does not generate ASF License warnings.
380m 59s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/Dockerfile
GITHUB PR #4384
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux f5ef69bed398 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 9b7bd1d6535089234ffb2be3665e7ec26e5b9734
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/testReport/
Max. process+thread count 3041 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@sodonnel
Copy link
Contributor Author

The native client is failing to build, but I cannot see how this change could cause that. I wonder if there is something else going on, or some other change has broken the build recently?

[INFO] Reactor Summary for Apache Hadoop HDFS Project 3.4.0-SNAPSHOT:
[INFO] 
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 34.048 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [ 57.213 s]
[INFO] Apache Hadoop HDFS Native Client ................... FAILURE [  7.360 s]

@ayushtkn
Copy link
Member

ayushtkn commented Jun 1, 2022

May be due to HDFS-16604, try to run the build again...

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good. Thanks, @sodonnel .

@sodonnel
Copy link
Contributor Author

sodonnel commented Jun 1, 2022

I rebased against the latest trunk and force pushed it. Lets see if we get a better build this time.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 9s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 16s Maven dependency ordering for branch
+1 💚 mvninstall 27m 52s trunk passed
+1 💚 compile 7m 2s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 6m 22s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 34s trunk passed
+1 💚 mvnsite 2m 49s trunk passed
+1 💚 javadoc 2m 12s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 2m 28s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 33s trunk passed
+1 💚 shadedclient 26m 15s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 27s Maven dependency ordering for patch
+1 💚 mvninstall 2m 35s the patch passed
+1 💚 compile 7m 39s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 7m 39s the patch passed
+1 💚 compile 7m 27s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 7m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 26s the patch passed
+1 💚 mvnsite 2m 26s the patch passed
+1 💚 javadoc 1m 50s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 2m 9s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 38s the patch passed
+1 💚 shadedclient 26m 1s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 28s hadoop-hdfs-client in the patch passed.
+1 💚 unit 331m 58s hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
490m 17s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/3/artifact/out/Dockerfile
GITHUB PR #4384
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux bd88f84c1b08 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0ecf659
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/3/testReport/
Max. process+thread count 2566 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@sodonnel
Copy link
Contributor Author

sodonnel commented Jun 1, 2022

Good build this time, so I will commit. Thanks all for the reviews.

@sodonnel sodonnel merged commit 34a973a into apache:trunk Jun 1, 2022
asfgit pushed a commit that referenced this pull request Jun 1, 2022
(cherry picked from commit 34a973a)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
asfgit pushed a commit that referenced this pull request Jun 1, 2022
(cherry picked from commit 34a973a)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

(cherry picked from commit 7d6b133)
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
)

(cherry picked from commit 34a973a)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml

(cherry picked from commit 7d6b133)
(cherry picked from commit 58513d3)
(cherry picked from commit 6737281)
Change-Id: I2c64b0dfa7a5e55ced95fa7ddb81fcb0dc6bc171
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants