[thrift][bugfix] try to fix thrift request overflow crash. #12890
zuercher merged 7 commits into envoyproxy:master from pyrl247:master
PoolFailure will set the host to null, which causes the crash in this case. Signed-off-by: Guang Yang <pyrl247@gmail.com>
Thanks. You’ll need to fix the code formatting (if you click through to the details of the failed build you can eventually drill down to the log file that shows what’s formatted incorrectly, or you can install clang-format and use the tools/code_format/check_format.py directly). Also, let’s add a test to validate the fix. |
Signed-off-by: Guang Yang <pyrl247@gmail.com>
I couldn't find where to add the test case. Could you please tell me where to add it?
For the test case, I think you can probably use this existing one as an example: and add something like: at the end... (which should crash without your fix). I think something along those lines would work. |
-        AppExceptionType::InternalError,
-        fmt::format("too many connections to '{}'", upstream_host_->address()->asString())),
+        AppException(AppExceptionType::InternalError,
+                     fmt::format("thrift upstream request: too many connections")),
Two things:
- Since you've removed the string interpolation, you don't need to call fmt::format here.
- There is another case below that uses upstream_host_. That should be modified as well.
Do you mean the Timeout case? As far as I can see, when it hits that case upstream_host_ still holds the real host. Could you please check whether that's right?
I don't think we should depend on that behavior even if it's true.
I think it is better to report more detail about the exception. How about changing it like this:
`
parent_.callbacks_->sendLocalReply(
    AppException(
        AppExceptionType::InternalError,
        fmt::format("connection failure '{}'", (upstream_host_ != nullptr)
                                                   ? upstream_host_->address()->asString()
                                                   : "to upstream")),
    true);
`
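For readers following the thread, the null check proposed above can be tried in isolation. This is only a minimal sketch under stated assumptions: `FakeHost` and `connectionFailureMessage` are hypothetical names invented for illustration, not Envoy types or functions, and plain string concatenation stands in for `fmt::format` so the snippet builds with the standard library alone.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for an upstream host description; not Envoy's real type.
struct FakeHost {
  std::string address;
};

// Builds the local-reply text without assuming the host pointer is set.
// A pool failure such as overflow may leave the host null, so check first.
std::string connectionFailureMessage(const std::shared_ptr<FakeHost>& host) {
  if (host == nullptr) {
    return "connection failure to upstream";
  }
  return "connection failure '" + host->address + "'";
}
```

The point of the guard is that the message never dereferences the host unless it is known to be set, so the router does not have to rely on which failure reasons happen to populate it.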
@rgs1 What if I change the poolFailure function in test/mocks/tcp/mocks.cc like this? It will trigger the crash.
That probably works too. |
Signed-off-by: Guang Yang <pyrl247@gmail.com>
Signed-off-by: Guang Yang <pyrl247@gmail.com>
       }));
-  context_.cluster_manager_.tcp_conn_pool_.poolFailure(ConnectionPool::PoolFailureReason::Overflow);
+  context_.cluster_manager_.tcp_conn_pool_.poolFailureWithNullHost(
+      ConnectionPool::PoolFailureReason::Overflow);
verified that this crashes without your fix?
Yes, if I use the previous code, it crashes in my CI like this:
[ RUN ] ThriftRouterTest.PoolOverflowFailure
TestRandomGenerator running with seed -443243904
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 0/1.16.0-dev/test/RELEASE/BoringSSL
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #0: __restore_rt [0x7faf6ee3a8a0]
[2020-09-09 13:03:11.975][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: Envoy::Tcp::ConnectionPool::MockInstance::poolFailure() [0x9b3dd3]
[2020-09-09 13:03:11.981][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Extensions::NetworkFilters::ThriftProxy::Router::ThriftRouterTest_PoolOverflowFailure_Test::TestBody() [0x788b44]
[2020-09-09 13:03:11.987][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: testing::internal::HandleExceptionsInMethodIfSupported<>() [0x1293a78]
[2020-09-09 13:03:11.992][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: testing::Test::Run() [0x12939a5]
[2020-09-09 13:03:11.998][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: testing::TestInfo::Run() [0x12947f0]
[2020-09-09 13:03:12.003][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: testing::TestSuite::Run() [0x1295177]
[2020-09-09 13:03:12.009][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: testing::internal::UnitTestImpl::RunAllTests() [0x12a2047]
[2020-09-09 13:03:12.015][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: testing::internal::HandleExceptionsInMethodIfSupported<>() [0x12a1968]
[2020-09-09 13:03:12.020][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: testing::UnitTest::Run() [0x12a17ef]
[2020-09-09 13:03:12.026][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::TestRunner::RunTests() [0xbf1734]
[2020-09-09 13:03:12.031][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: main [0xbf092b]
[2020-09-09 13:03:12.032][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #12: __libc_start_main [0x7faf6ea58b97]
================================================================================
INFO: Elapsed time: 1017.162s, Critical Path: 329.15s
test/mocks/tcp/mocks.h
Outdated
@@ -59,6 +59,7 @@ class MockInstance : public Instance {
   Envoy::ConnectionPool::MockCancellable* newConnectionImpl(Callbacks& cb);
   void poolFailure(PoolFailureReason reason);
you can probably just add a bool host_null param and reuse poolFailure()
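A rough sketch of what that suggestion could look like, written as a standalone toy rather than against the real mock. Everything here is an assumption made for illustration: `FakePool`, `FakeHost`, and the callback signature are hypothetical and do not mirror the actual `MockInstance` in test/mocks/tcp/mocks.h.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical failure reasons; only the two discussed in this thread.
enum class PoolFailureReason { Overflow, Timeout };

// Hypothetical stand-in for a host description.
struct FakeHost {
  std::string address;
};

// Toy pool mock: one poolFailure() entry point, with an optional flag that
// makes it report the failure with a null host instead of the default one.
class FakePool {
public:
  using FailureCallback =
      std::function<void(PoolFailureReason, std::shared_ptr<FakeHost>)>;

  explicit FakePool(FailureCallback cb)
      : cb_(std::move(cb)),
        host_(std::make_shared<FakeHost>(FakeHost{"127.0.0.1:9090"})) {}

  // host_null defaults to false so existing callers keep their behavior.
  void poolFailure(PoolFailureReason reason, bool host_null = false) {
    std::shared_ptr<FakeHost> host =
        host_null ? std::shared_ptr<FakeHost>() : host_;
    cb_(reason, host);
  }

private:
  FailureCallback cb_;
  std::shared_ptr<FakeHost> host_;
};
```

With a defaulted parameter, a test for the overflow case would call `poolFailure(PoolFailureReason::Overflow, true)` while every other call site stays unchanged, which avoids a second near-duplicate method on the mock.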
rgs1 left a comment
we probably want a changelog fix, e.g.:
Signed-off-by: Guang Yang <pyrl247@gmail.com>
rgs1 left a comment
Looks great, thanks. Just one last nit.
Signed-off-by: Guang Yang <pyrl247@gmail.com>
zuercher left a comment
Thanks. I went ahead and fixed the merge conflict on the docs.
Try to fix the thrift request overflow crash. When requests overflow, OnPoolFailure sets the host to null, which causes the crash in this case.
Signed-off-by: Guang Yang pyrl247@gmail.com