Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stress test failures in Java client #1812

Closed
sreecha opened this issue May 10, 2016 · 8 comments
Closed

Stress test failures in Java client #1812

sreecha opened this issue May 10, 2016 · 8 comments
Assignees
Labels

Comments

@sreecha
Copy link

sreecha commented May 10, 2016

Steps to repro:

(I repro'ed this using a docker image. This may repro without the docker image too - I am not sure)

  1. Built a Docker image for stress test:
    Get the grpc repo (you don't have to build it. Just need it for the docker scripts)
$ git clone https://github.com/grpc/grpc.git
$ cd grpc
$ export INTEROP_IMAGE=stress-test-java
$ export BASE_NAME=grpc_interop_stress_java
$ # Remember to be under the grpc directory
$ tools/run_tests/dockerize/build_interop_stress_image.sh

This will take a few minutes to build (if it fails in the middle, just retry one more time. Sometimes, it is flaky)

  1. Start a docker container
$ docker run -i -t stress-test-java bash
  1. Run java interop server in the background in the container:
root@b44016cf1c83:/# /var/local/git/grpc-java/interop-testing/build/install/grpc-interop-testing/bin/test-server --port=8080 --use_tls=false &
  1. Now run the interop client:
root@b44016cf1c83:/# /var/local/git/grpc-java/interop-testing/build/install/grpc-interop-testing/bin/stresstest-client --test_cases=empty_unary:1 --num_channels_per_server=5 --num_stubs_per_channel=5

You will see the following errors:

root@b44016cf1c83:/# /var/local/git/grpc-java/interop-testing/build/install/grpc-interop-testing/bin/stresstest-client --test_cases=empty_unary:1 --num_channels_per_server=5 --num_stubs_per_channel=5
May 10, 2016 2:04:14 AM io.grpc.internal.ManagedChannelImpl <init>
INFO: [ManagedChannelImpl@7494e528] Created with target localhost:8080
May 10, 2016 2:04:14 AM io.grpc.internal.ManagedChannelImpl <init>
INFO: [ManagedChannelImpl@5f282abb] Created with target localhost:8080
May 10, 2016 2:04:14 AM io.grpc.internal.ManagedChannelImpl <init>
INFO: [ManagedChannelImpl@1786dec2] Created with target localhost:8080
May 10, 2016 2:04:14 AM io.grpc.internal.ManagedChannelImpl <init>
INFO: [ManagedChannelImpl@4d41cee] Created with target localhost:8080
May 10, 2016 2:04:14 AM io.grpc.internal.ManagedChannelImpl <init>
INFO: [ManagedChannelImpl@1b68ddbd] Created with target localhost:8080
[GC (Allocation Failure)  129024K->17912K(493056K), 0.0100072 secs]
[GC (Allocation Failure)  146936K->12744K(622080K), 0.0084358 secs]
[GC (Allocation Failure)  270792K->12672K(622080K), 0.0058763 secs]
[GC (Allocation Failure)  270720K->12664K(880128K), 0.0046501 secs]
[GC (Allocation Failure)  528760K->12704K(880128K), 0.0076892 secs]
[GC (Allocation Failure)  528800K->12744K(1392128K), 0.0086561 secs]
[GC (Allocation Failure)  1044936K->14822K(1392640K), 0.0067372 secs]
[GC (Allocation Failure)  1047014K->14902K(1330176K), 0.0009984 secs]
[GC (Allocation Failure)  1001526K->14982K(1286656K), 0.0008510 secs]
@sreecha sreecha added the bug label May 10, 2016
@carl-mastrangelo
Copy link
Contributor

Those just mean a full GC started. The only thing that message means is that garbage is being created too fast (perhaps due to lots of small requests).

For the stress test we probably need to set Xms and Xmx as well as ParNewGen

@buchgr
Copy link
Collaborator

buchgr commented May 10, 2016

@sreecha @carl-mastrangelo it's actually a minor GC :-), but seems like nothing out of the ordinary? We explicitly turned on GC logging, which is why it's shown here. Pause times seem sufficiently low for us not to worry about them. I think this can be closed?

Only thing that seems strange is that the heap grows without a Full GC. I always thought resizing the heap requires a Full GC.

@buchgr buchgr self-assigned this May 10, 2016
@sreecha
Copy link
Author

sreecha commented May 11, 2016

Ok. I didn't realize these were just ParNew logs.
There are a lot of other exceptions being thrown by the stress clients. For example, I just saw that the most recent run failed with this error:

(Note that this happened when I ran about 10 stress clients each with the following settings:

"num_channels_per_server":5,
"num_stubs_per_channel":10,
"test_cases": "empty_unary:1,large_unary:1,client_streaming:1,server_streaming:1,empty_stream:1",

and 4 of the clients failed. I am pasting the error messages from one of the clients..)

May 11, 2016 3:18:50 PM com.google.common.util.concurrent.AggregateFuture$RunningState handleException
SEVERE: Input Future failed with Error
Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.
        at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
        at io.grpc.testing.integration.StressTestClient$Worker.runTestCase(StressTestClient.java:441)
        at io.grpc.testing.integration.StressTestClient$Worker.run(StressTestClient.java:384)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
May 11, 2016 3:18:50 PM io.grpc.testing.integration.StressTestClient main
WARNING: The stress test client encountered an error!
java.util.concurrent.ExecutionException: Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.
        at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:476)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:455)
        at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79)
        at io.grpc.testing.integration.StressTestClient.blockUntilStressTestComplete(StressTestClient.java:239)
        at io.grpc.testing.integration.StressTestClient.main(StressTestClient.java:99)
Caused by: Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.

        at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
        at io.grpc.testing.integration.StressTestClient$Worker.runTestCase(StressTestClient.java:441)
        at io.grpc.testing.integration.StressTestClient$Worker.run(StressTestClient.java:384)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

May 11, 2016 3:18:50 PM com.google.common.util.concurrent.AggregateFuture$RunningState handleException
SEVERE: Input Future failed with Error
Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.

        at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
        at io.grpc.testing.integration.StressTestClient$Worker.runTestCase(StressTestClient.java:441)
        at io.grpc.testing.integration.StressTestClient$Worker.run(StressTestClient.java:384)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

May 11, 2016 3:18:50 PM com.google.common.util.concurrent.AggregateFuture$RunningState handleException
SEVERE: Input Future failed with Error
Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.

        at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
        at io.grpc.testing.integration.StressTestClient$Worker.runTestCase(StressTestClient.java:441)
        at io.grpc.testing.integration.StressTestClient$Worker.run(StressTestClient.java:384)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

May 11, 2016 3:18:50 PM com.google.common.util.concurrent.AggregateFuture$RunningState handleException
SEVERE: Input Future failed with Error
Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.

        at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
        at io.grpc.testing.integration.StressTestClient$Worker.runTestCase(StressTestClient.java:441)
        at io.grpc.testing.integration.StressTestClient$Worker.run(StressTestClient.java:384)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

May 11, 2016 3:18:50 PM com.google.common.util.concurrent.AggregateFuture$RunningState handleException
SEVERE: Input Future failed with Error
Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.

        at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
        at io.grpc.testing.integration.StressTestClient$Worker.runTestCase(StressTestClient.java:441)
        at io.grpc.testing.integration.StressTestClient$Worker.run(StressTestClient.java:384)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

May 11, 2016 3:18:51 PM io.grpc.internal.ManagedChannelImpl maybeTerminateChannel
INFO: [ManagedChannelImpl@4232c52b] Terminated
May 11, 2016 3:18:51 PM io.grpc.internal.ManagedChannelImpl maybeTerminateChannel
INFO: [ManagedChannelImpl@2bbf4b8b] Terminated
May 11, 2016 3:18:51 PM io.grpc.internal.ManagedChannelImpl maybeTerminateChannel
INFO: [ManagedChannelImpl@d44fc21] Terminated
May 11, 2016 3:18:52 PM io.grpc.internal.ManagedChannelImpl maybeTerminateChannel
INFO: [ManagedChannelImpl@2f943d71] Terminated
May 11, 2016 3:18:52 PM io.grpc.internal.ManagedChannelImpl maybeTerminateChannel
INFO: [ManagedChannelImpl@467aecef] Terminated

@sreecha
Copy link
Author

sreecha commented May 11, 2016

I think it would be a lot easier if you can try to repro this by running the stress test framework yourself.

There is a one-time setup you need to do (to set up a google cloud platform account and install some python libraries on your machine) but I think it is totally worth it to debug any future stress-related issues

The instructions are here:
https://github.com/grpc/grpc/blob/master/tools/run_tests/stress_test/README.md

@sreecha
Copy link
Author

sreecha commented May 11, 2016

@buchgr, I just realized that you may not have a Google cloud platform account - so please ignore my previous comment for now. I'll chat with the java team here and get back to you.. thanks

@carl-mastrangelo
Copy link
Contributor

@buchgr Ideally the heap will never change size. All the advice internally is to lock NewSize and MaxNewSize to the same value, and also lock Xmx and Xms.

@buchgr
Copy link
Collaborator

buchgr commented May 13, 2016

@sreecha I believe the error messages are due to timeouts. See, the stress test client reuses the interop integration tests under hood, and these tests verify that e.g. a call ends within 5 seconds. That's where the error seems to come from.

Wanted but not invoked:
streamObserver.onCompleted();
-> at io.grpc.testing.integration.AbstractInteropTest.emptyStream(AbstractInteropTest.java:340)
Actually, there were zero interactions with this mock.

In your opinion, is it possible that for some calls the latency is > 5 seconds?

Also, yes I don't have a GCE account. I used to have one. Who should I talk to, to get one? I am happy to set up a test env.

In the meantime, I ll submit a PR to refactor the interop tests to exlude these timeouts when using the stress test client.

buchgr added a commit to buchgr/grpc-java that referenced this issue May 13, 2016
@buchgr buchgr closed this as completed in 46edcad May 13, 2016
@buchgr
Copy link
Collaborator

buchgr commented May 13, 2016

@sreecha can you retry with current master please?

@lock lock bot locked as resolved and limited conversation to collaborators Sep 22, 2018
fixmebot bot referenced this issue in aomsw13/develop_test Apr 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants