segment fault when pop_client #34

Open
zkmn73 opened this issue Feb 21, 2017 · 1 comment

zkmn73 commented Feb 21, 2017

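While running redis-cerberus in production, we found that cerberus occasionally crashes, with low probability. We investigated, and the cause is a segmentation fault.

Log from the segmentation fault: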

2017-02-19 18:41:31,669 I 140625852184320 Slot map updated
Segmentation fault:
/opt/tiger/twemproxy_bin/bin/cerberus 45f76c trac::stacktrace() 29
/opt/tiger/twemproxy_bin/bin/cerberus 45f997 trac::print_trace(std::ostream&) 19
/opt/tiger/twemproxy_bin/bin/cerberus 462dc9  0
/lib/x86_64-linux-gnu/libpthread.so.0 aa4368d0  f8d0
/opt/tiger/twemproxy_bin/bin/cerberus 454355 cerb::Server::pop_client(cerb::Client*) 165
/opt/tiger/twemproxy_bin/bin/cerberus 43a41c cerb::Client::~Client() 2c
/opt/tiger/twemproxy_bin/bin/cerberus 43a58d cerb::Client::after_events(std::set<cerb::Connection*, std::less<cerb::Connection*>, std::allocator<cerb::Connection*> >&) 2d
/opt/tiger/twemproxy_bin/bin/cerberus 44e3ff cerb::Proxy::handle_events(epoll_event*, int) 30f
/opt/tiger/twemproxy_bin/bin/cerberus 446a62  0
/opt/tiger/twemproxy_bin/bin/cerberus 446f5d  0
/usr/lib/x86_64-linux-gnu/libstdc++.so.6 aa1d2970  b6970
/lib/x86_64-linux-gnu/libpthread.so.0 aa42f0a4  80a4
/lib/x86_64-linux-gnu/libc.so.6 a994287d clone 6d
terminate called without an active exception
Cerberus version 0.7.9-2016-08-18 Copyright (c) HunanTV Platform developers

The log shows that the crash happens in pop_client.
Because the crash is rare in production and we cannot build with -g there, we added logging:

void Server::pop_client(Client* cli)
{
    LOG(INFO) << "Server::pop_client, before erase_if";
    util::erase_if(
        this->_commands,
        [&](util::sref<DataCommand> cmd)
        {
            return cmd->group->client.is(cli);
        });
    LOG(INFO) << "Server::pop_client, after erase_if";
    for (util::sref<DataCommand>& cmd: this->_sent_commands) {
        LOG(INFO) << "Server::pop_client, in for";
        if(cmd.not_nul()){
            LOG(INFO) << "Server::pop_client, cmd->group:" << cmd->group.nul();
            LOG(INFO) << "Server::pop_client, cmd->group->client:" << cmd->group->client.nul();
        }
        if (cmd.not_nul() && cmd->group->client.is(cli)) {
            LOG(INFO) << "Server::pop_client, in if";
            cmd.reset();
        }
    }
}

The resulting log:

2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, before erase_if
2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, after erase_if
2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, in for
2017-02-21 05:39:22,306 I 140140096595712 Server::pop_client, cmd->group:0
Segmentation fault:
/opt/tiger/twemproxy_bin/bin/cerberus 4600dc trac::stacktrace() 29
/opt/tiger/twemproxy_bin/bin/cerberus 460307 trac::print_trace(std::ostream&) 19
/opt/tiger/twemproxy_bin/bin/cerberus 463739  0
/lib/x86_64-linux-gnu/libpthread.so.0 eaeb28d0  f8d0
/opt/tiger/twemproxy_bin/bin/cerberus 456069 cerb::Server::pop_client(cerb::Client*) 4c9
/opt/tiger/twemproxy_bin/bin/cerberus 43a47c cerb::Client::~Client() 2c
/opt/tiger/twemproxy_bin/bin/cerberus 43a5ed cerb::Client::after_events(std::set<cerb::Connection*, std::less<cerb::Connection*>, std::allocator<cerb::Connection*> >&) 2d
/opt/tiger/twemproxy_bin/bin/cerberus 44e45f cerb::Proxy::handle_events(epoll_event*, int) 30f
/opt/tiger/twemproxy_bin/bin/cerberus 446ac2  0
/opt/tiger/twemproxy_bin/bin/cerberus 446fbd  0
/usr/lib/x86_64-linux-gnu/libstdc++.so.6 eac4e970  b6970
/lib/x86_64-linux-gnu/libpthread.so.0 eaeab0a4  80a4
/lib/x86_64-linux-gnu/libc.so.6 ea3be87d clone 6d
terminate called without an active exception
Cerberus version 0.7.9-2016-08-18 Copyright (c) HunanTV Platform developers

Printing the address of cmd->group->client:

if(cmd.not_nul()){
    LOG(INFO) << "Server::pop_client, cmd->group:" << cmd->group.nul();
    try {
        LOG(INFO) << "Server::pop_client, cmd->group->client.address:" << &(cmd->group->client);
    } catch (std::exception& e) {
        LOG(INFO) << "Server::pop_client, exception:" << e.what();
    }
}

2017-02-24 05:48:22,789 I 140241907771136 Server::pop_client, in for
2017-02-24 05:48:22,789 I 140241907771136 Server::pop_client, cmd->group:1
2017-02-24 05:48:22,789 I 140241907771136 Server::pop_client, cmd->group->client.address:0x8
Segmentation fault:
/opt/tiger/twemproxy_bin/bin/cerberus 46012c trac::stacktrace() 29
/opt/tiger/twemproxy_bin/bin/cerberus 460357 trac::print_trace(std::ostream&) 19
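
The `0x8` in the log above is what one would expect if `cmd->group` holds a null pointer and `client` sits 8 bytes into the group object: `&(cmd->group->client)` only adds the member offset and reads no memory, so the crash presumably comes on the first real read afterwards. A segmentation fault is also a signal rather than a C++ exception, so the `try`/`catch` cannot intercept it. The sketch below illustrates the offset arithmetic only; `GroupLike`, its layout, and `client_member_address` are assumptions made for illustration, not the actual cerberus types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Client;  // opaque here; only a pointer to it is stored

// Hypothetical stand-in for the command group. The real layout is not
// shown in this issue; we assume one 8-byte member precedes `client`.
struct GroupLike {
    std::uint64_t header;  // assumed 8-byte member (could be a vptr, a counter, ...)
    Client* client;        // assumed to land at offset 8
};

// Returns the address that `&(group->client)` would denote for a given
// group pointer value. Nothing is dereferenced; it is plain arithmetic.
std::uintptr_t client_member_address(std::uintptr_t group_address)
{
    return group_address + offsetof(GroupLike, client);
}
```

Under these assumptions, `client_member_address(0)` yields 8, which matches the logged `0x8` and the `cmd->group:1` line (that is, `nul()` returned true) printed just before it.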

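So, in summary:
cmd->group is sometimes null and sometimes not null.
cmd->group.nul() can be false at that point, yet cmd->group->client turns out to be nul when it is reached, and the process crashes.

This suggests that sometimes the sref itself is NULL and sometimes the ptr inside the sref is NULL, and either case leads to the segmentation fault.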
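
One way to guard against the null cases described above is to check every link in the chain before dereferencing it. The following is a minimal sketch, not the fix applied in cerberus: `Ref` is a hypothetical stand-in for `util::sref`, modelled only on the member functions that appear in this issue, and `Group`, `Command`, and `belongs_to` are illustrative names.

```cpp
#include <cassert>

// Hypothetical stand-in for util::sref, modelled only on the member
// functions seen in the issue (nul, not_nul, is, reset, operator->).
template <typename T>
class Ref {
    T* _ptr;
public:
    explicit Ref(T* p = nullptr) : _ptr(p) {}
    bool nul() const { return _ptr == nullptr; }
    bool not_nul() const { return _ptr != nullptr; }
    bool is(T const* other) const { return _ptr == other; }
    void reset() { _ptr = nullptr; }
    T* operator->() const { return _ptr; }
};

// Illustrative types; the real cerberus classes carry far more state.
struct Client {};
struct Group { Ref<Client> client; };
struct Command { Ref<Group> group; };

// Returns true only if cmd and cmd->group are both non-null and the
// group's client is `cli`. Each level is checked before it is used.
bool belongs_to(Ref<Command> const& cmd, Client const* cli)
{
    if (cmd.nul()) {
        return false;
    }
    if (cmd->group.nul()) {
        return false;
    }
    return cmd->group->client.is(cli);
}
```

A check like this only covers null references. If `cmd` or `cmd->group` points at an object that has already been destroyed, `nul()` still returns false and the dereference remains undefined behavior. That may be what happened in the first instrumented run, where `cmd->group:0` was logged immediately before the crash.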

zkmn73 commented Feb 24, 2017

#35
This pull request only fixes some of the segmentation faults.
