HDDS-1384. TestBlockOutputStreamWithFailures is failing #1029

Closed
elek wants to merge 3 commits into apache:trunk from elek:HDDS-1384

elek (Member) commented Jun 28, 2019

TestBlockOutputStreamWithFailures is failing with the following error:

{noformat}
2019-04-04 18:52:43,240 INFO volume.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(140)) - Scheduling a check for org.apache.hadoop.ozone.container.common.volume.HddsVolume@1f6c0e8a
2019-04-04 18:52:43,240 INFO volume.HddsVolumeChecker (HddsVolumeChecker.java:checkAllVolumes(203)) - Scheduled health check for volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@1f6c0e8a
2019-04-04 18:52:43,241 ERROR server.GrpcService (ExitUtils.java:terminate(133)) - Terminating with exit status 1: Failed to start Grpc server
java.io.IOException: Failed to bind
at org.apache.ratis.thirdparty.io.grpc.netty.NettyServer.start(NettyServer.java:253)
at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:166)
at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:81)
at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:144)
at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:202)
at org.apache.ratis.server.impl.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:69)
at org.apache.ratis.server.impl.RaftServerProxy.lambda$start$3(RaftServerProxy.java:300)
at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:202)
at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:298)
at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:419)
at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:186)
at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:169)
at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:338)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at org.apache.ratis.thirdparty.io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:130)
at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
at org.apache.ratis.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1358)
at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
at org.apache.ratis.thirdparty.io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:1019)
at org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
at org.apache.ratis.thirdparty.io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:366)
at org.apache.ratis.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at org.apache.ratis.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
at org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
at org.apache.ratis.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
{noformat}

See: https://issues.apache.org/jira/browse/HDDS-1384
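
The root cause in the trace above is `java.net.BindException: Address already in use`: the gRPC server tried to listen on a port that another socket already held. The following is a minimal, self-contained sketch of that failure mode and of one common way tests sidestep it, namely binding to port 0 so the operating system picks a free port. It is an illustration only and does not claim to be what this patch does; the class and method names (`PortBindSketch`, `bindToFreePort`, `isPortTaken`) are hypothetical and are not part of Ozone or Ratis.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration; not code from the Ozone or Ratis code base.
public class PortBindSketch {

    /** Binds to port 0 on loopback so the OS assigns a currently free port. */
    static ServerSocket bindToFreePort() throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        return socket;
    }

    /** Returns true if binding to the given loopback port fails with BindException. */
    static boolean isPortTaken(int port) throws IOException {
        try (ServerSocket probe = new ServerSocket()) {
            probe.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return false;
        } catch (BindException e) {
            // Same exception type as the "Caused by" line in the trace.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket first = bindToFreePort()) {
            int port = first.getLocalPort();
            System.out.println("first server listening on port " + port);
            // A second listener on the same address and port is rejected.
            System.out.println("port already in use: " + isPortTaken(port));
        }
    }
}
```

A fixed port shared by several test clusters on one build host can collide in exactly this way, whereas an OS-assigned port only needs to be read back with `getLocalPort()` and handed to the clients.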

elek added the ozone label Jun 28, 2019
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 32 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 475 | trunk passed |
| +1 | compile | 260 | trunk passed |
| +1 | checkstyle | 58 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 800 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 155 | trunk passed |
| 0 | spotbugs | 318 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 514 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 455 | the patch passed |
| +1 | compile | 276 | the patch passed |
| +1 | javac | 275 | the patch passed |
| -0 | checkstyle | 41 | hadoop-hdds: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 680 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 167 | the patch passed |
| +1 | findbugs | 523 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 272 | hadoop-hdds in the patch passed. |
| -1 | unit | 1636 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 41 | The patch does not generate ASF License warnings. |
| | | 6618 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
| | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
| | hadoop.ozone.client.rpc.TestOzoneRpcClient |
| | hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
| | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
| | hadoop.ozone.TestMiniOzoneCluster |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/1/artifact/out/Dockerfile |
| GITHUB PR | #1029 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 541e760451b8 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / f09c31a |
| Default Java | 1.8.0_212 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/1/artifact/out/diff-checkstyle-hadoop-hdds.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/1/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/1/testReport/ |
| Max. process+thread count | 5403 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/container-service U: hadoop-hdds/container-service |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/1/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

arp7 (Contributor) commented Jul 1, 2019

+1, the patch LGTM.

The unit test failures may be related. Thanks for taking this up, Marton!

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 109 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 69 | Maven dependency ordering for branch |
| +1 | mvninstall | 543 | trunk passed |
| +1 | compile | 265 | trunk passed |
| +1 | checkstyle | 71 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 890 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 153 | trunk passed |
| 0 | spotbugs | 307 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 494 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 29 | Maven dependency ordering for patch |
| +1 | mvninstall | 429 | the patch passed |
| +1 | compile | 249 | the patch passed |
| +1 | javac | 249 | the patch passed |
| -0 | checkstyle | 36 | hadoop-hdds: The patch generated 7 new + 0 unchanged - 0 fixed = 7 total (was 0) |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 704 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 153 | the patch passed |
| +1 | findbugs | 509 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 292 | hadoop-hdds in the patch passed. |
| -1 | unit | 1560 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 42 | The patch does not generate ASF License warnings. |
| | | 6804 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.client.rpc.TestOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
| | hadoop.ozone.client.rpc.TestBCSID |
| | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
| | hadoop.ozone.client.rpc.TestOzoneAtRestEncryption |
| | hadoop.ozone.client.rpc.TestCommitWatcher |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=18.09.5 Server=18.09.5 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/2/artifact/out/Dockerfile |
| GITHUB PR | #1029 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux fef3e86339d6 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 91cc197 |
| Default Java | 1.8.0_212 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/2/artifact/out/diff-checkstyle-hadoop-hdds.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/2/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/2/testReport/ |
| Max. process+thread count | 5407 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1029/2/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

elek (Member, Author) commented Jul 12, 2019

Thanks @arp7 for the review. I am merging it to trunk right now.
The remaining unit test failures are not related (AssertionErrors + timeout); the original problem was fixed (44a8b9f).

elek closed this in 9119ed0 Jul 12, 2019
asfgit pushed a commit that referenced this pull request Jul 15, 2019
amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019
3 participants