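While running redis-cerberus in production, we found that cerberus occasionally crashes, with low probability. We investigated, and the cause is a segmentation fault.
Log from the segmentation fault: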
    2017-02-19 18:41:31,669 I 140625852184320 Slot map updated
    Segmentation fault:
    /opt/tiger/twemproxy_bin/bin/cerberus 45f76c trac::stacktrace() 29
    /opt/tiger/twemproxy_bin/bin/cerberus 45f997 trac::print_trace(std::ostream&) 19
    /opt/tiger/twemproxy_bin/bin/cerberus 462dc9 0
    /lib/x86_64-linux-gnu/libpthread.so.0 aa4368d0 f8d0
    /opt/tiger/twemproxy_bin/bin/cerberus 454355 cerb::Server::pop_client(cerb::Client*) 165
    /opt/tiger/twemproxy_bin/bin/cerberus 43a41c cerb::Client::~Client() 2c
    /opt/tiger/twemproxy_bin/bin/cerberus 43a58d cerb::Client::after_events(std::set<cerb::Connection*, std::less<cerb::Connection*>, std::allocator<cerb::Connection*> >&) 2d
    /opt/tiger/twemproxy_bin/bin/cerberus 44e3ff cerb::Proxy::handle_events(epoll_event*, int) 30f
    /opt/tiger/twemproxy_bin/bin/cerberus 446a62 0
    /opt/tiger/twemproxy_bin/bin/cerberus 446f5d 0
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6 aa1d2970 b6970
    /lib/x86_64-linux-gnu/libpthread.so.0 aa42f0a4 80a4
    /lib/x86_64-linux-gnu/libc.so.6 a994287d clone 6d
    terminate called without an active exception
    Cerberus version 0.7.9-2016-08-18 Copyright (c) HunanTV Platform developers
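The log shows that the crash happens in pop_client.
Because the crash is rare in production and we cannot build with the -g flag there, we added logging instead: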
    void Server::pop_client(Client* cli)
    {
        LOG(INFO) << "Server::pop_client, before erase_if";
        util::erase_if(
            this->_commands,
            [&](util::sref<DataCommand> cmd)
            {
                return cmd->group->client.is(cli);
            });
        LOG(INFO) << "Server::pop_client, after erase_if";
        for (util::sref<DataCommand>& cmd: this->_sent_commands) {
            LOG(INFO) << "Server::pop_client, in for";
            if (cmd.not_nul()) {
                LOG(INFO) << "Server::pop_client, cmd->group:" << cmd->group.nul();
                LOG(INFO) << "Server::pop_client, cmd->group->client:" << cmd->group->client.nul();
            }
            if (cmd.not_nul() && cmd->group->client.is(cli)) {
                LOG(INFO) << "Server::pop_client, in if";
                cmd.reset();
            }
        }
    }
The resulting output:
    2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, before erase_if
    2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, after erase_if
    2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, in for
    2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, cmd->group:0
    Segmentation fault:
    /opt/tiger/twemproxy_bin/bin/cerberus 4600dc trac::stacktrace() 29
    /opt/tiger/twemproxy_bin/bin/cerberus 460307 trac::print_trace(std::ostream&) 19
    /opt/tiger/twemproxy_bin/bin/cerberus 463739 0
    /lib/x86_64-linux-gnu/libpthread.so.0 eaeb28d0 f8d0
    /opt/tiger/twemproxy_bin/bin/cerberus 456069 cerb::Server::pop_client(cerb::Client*) 4c9
    /opt/tiger/twemproxy_bin/bin/cerberus 43a47c cerb::Client::~Client() 2c
    /opt/tiger/twemproxy_bin/bin/cerberus 43a5ed cerb::Client::after_events(std::set<cerb::Connection*, std::less<cerb::Connection*>, std::allocator<cerb::Connection*> >&) 2d
    /opt/tiger/twemproxy_bin/bin/cerberus 44e45f cerb::Proxy::handle_events(epoll_event*, int) 30f
    /opt/tiger/twemproxy_bin/bin/cerberus 446ac2 0
    /opt/tiger/twemproxy_bin/bin/cerberus 446fbd 0
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6 eac4e970 b6970
    /lib/x86_64-linux-gnu/libpthread.so.0 eaeab0a4 80a4
    /lib/x86_64-linux-gnu/libc.so.6 ea3be87d clone 6d
    terminate called without an active exception
    Cerberus version 0.7.9-2016-08-18 Copyright (c) HunanTV Platform developers
Next, we printed the address of cmd->group->client:
    if (cmd.not_nul()) {
        LOG(INFO) << "Server::pop_client, cmd->group:" << cmd->group.nul();
        try {
            LOG(INFO) << "Server::pop_client, cmd->group->client.address:" << &(cmd->group->client);
        } catch (std::exception& e) {
            LOG(INFO) << "Server::pop_client, exception:" << e.what();
        }
    }
    2017-02-24 05:48:22,789 I 140241907771136 Server::pop_client, in for
    2017-02-24 05:48:22,789 I 140241907771136 Server::pop_client, cmd->group:1
    2017-02-24 05:48:22,789 I 140241907771136 Server::pop_client, cmd->group->client.address:0x8
    Segmentation fault:
    /opt/tiger/twemproxy_bin/bin/cerberus 46012c trac::stacktrace() 29
    /opt/tiger/twemproxy_bin/bin/cerberus 460357 trac::print_trace(std::ostream&) 19
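A note on the 0x8 above: that value is what one would typically see if cmd->group holds a null pointer and client sits 8 bytes into the pointed-to struct, since on common implementations the address of a member reached through a null pointer is just the member's offset (formally this is undefined behavior). This matches cmd->group.nul() printing 1 in the same log. A minimal sketch of the arithmetic follows. The struct layout is a hypothetical stand-in, because the real cerb::CommandGroup definition is not shown in this issue.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; not the real cerberus types.
struct Client;

struct CommandGroup {
    std::uint64_t first_member;  // assumption: 8 bytes of data precede `client`
    Client* client;              // assumption: the member whose address was logged
};

// Offset of `client` inside the hypothetical group, computed with offsetof,
// so no pointer is ever dereferenced.
constexpr std::size_t client_offset() {
    return offsetof(CommandGroup, client);
}

// A null base address (0) plus this offset gives 0x8, the value in the log.
static_assert(client_offset() == 8, "client sits 8 bytes into the group");
```

Note also that a try/catch around the address expression cannot intercept the crash: a segmentation fault is a signal (SIGSEGV), not a C++ exception.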
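So cmd->group is sometimes null and sometimes not. In some runs cmd->group.nul() is false, yet the process still crashes on reaching cmd->group->client, which appears to be nul.
This suggests that sometimes the sref itself is NULL and sometimes the ptr inside the sref is NULL, and either case leads to the segmentation fault.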
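For the case where cmd->group is nul, one possible mitigation is to check every link in the chain before dereferencing it. The sketch below is only an illustration under stated assumptions: Ref is a hypothetical, simplified stand-in for util::sref (it mirrors only the nul(), not_nul(), is() and reset() calls visible in the code above), the three structs are placeholders, and belongs_to is a helper name invented here.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for util::sref: a non-owning pointer wrapper.
template <typename T>
class Ref {
    T* _ptr;
public:
    explicit Ref(T* p = nullptr) : _ptr(p) {}
    bool nul() const { return _ptr == nullptr; }
    bool not_nul() const { return _ptr != nullptr; }
    bool is(T const* other) const { return _ptr == other; }
    void reset() { _ptr = nullptr; }
    T* operator->() const { return _ptr; }
};

// Placeholder types; the real cerberus classes carry far more state.
struct Client {};
struct CommandGroup { Ref<Client> client; };
struct DataCommand { Ref<CommandGroup> group; };

// Hypothetical helper: true only if cmd and cmd->group are both non-null
// and the group's client is `cli`. Short-circuit evaluation guarantees that
// no null wrapper is dereferenced.
bool belongs_to(Ref<DataCommand> const& cmd, Client const* cli) {
    return cmd.not_nul()
        && cmd->group.not_nul()
        && cmd->group->client.is(cli);
}
```

A guard like this only covers a group that is actually nul, as in the log where cmd->group.nul() printed 1. In the other log, cmd->group.nul() printed 0 and the process still crashed, which may indicate a stale pointer to an already destroyed object. No null check can detect that, so the lifetime of the group would need to be investigated separately.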
Pull request #35 fixes only some of these segmentation faults.