runtime: cgo stuck because of go signal handler went into dead loop #56649

Closed
cocktail828 opened this issue Nov 8, 2022 · 11 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@cocktail828

cocktail828 commented Nov 8, 2022

What version of Go are you using (go version)?

$ go version
go1.19

Does this issue reproduce with the latest release?

Yes. It can be reproduced with Go 1.19.

What operating system and processor architecture are you using (go env)?

CentOS 7, running in a container

go env Output
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go:/data/home/xqzhu11/go"
GOPRIVATE=""
GOPROXY="https://goproxy.cn,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1022167488=/tmp/go-build -gno-record-gcc-switches"

The following is what I see.
Backtrace with gdb:

Thread 271 (Thread 0x7f634dfff700 (LWP 223)):
#0  runtime.usleep () at /usr/local/go/src/runtime/sys_linux_amd64.s:140
#1  0x0000000000452c66 in runtime.raisebadsignal (sig=11, c=0x7f634df8e290) at /usr/local/go/src/runtime/signal_unix.go:939
#2  0x00000000004532ec in runtime.badsignal (sig=11, c=0x7f634df8e290) at /usr/local/go/src/runtime/signal_unix.go:1054
#3  0x0000000000451885 in runtime.sigtrampgo (sig=11, info=0x7f634df8e430, ctx=0x7f634df8e300) at /usr/local/go/src/runtime/signal_unix.go:461
#4  0x0000000000471e46 in runtime.sigtramp () at /usr/local/go/src/runtime/sys_linux_amd64.s:359
#5  <signal handler called>
#6  0x00007f695e284f78 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) () from /lib64/libstdc++.so.6
#7  0x00007f69152ae7bf in FANetBuilder::BuildWordLevelNetNew (this=this@entry=0x7f59e6295440, pWordsID=pWordsID@entry=0x7f59c40640cc, wordCount=6, pFANodeWordArr=0x7f59e62954a8, nWordNodeCount=nWordNodeCount@entry=@0x7f634df8e9a8: 0) at FANetBuilder.cpp:169
...

strace output: as you can see, SIGSEGV is raised again and again.

[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x51ebcdf0} ---
[pid 108096] rt_sigprocmask(SIG_SETMASK, NULL, ~[KILL STOP], 8) = 0
[pid 108096] rt_sigprocmask(SIG_SETMASK, ~[], NULL, 8) = 0
[pid 108096] sigaltstack(NULL, {ss_sp=0, ss_flags=SS_DISABLE, ss_size=0}) = 0
[pid 108096] sigaltstack({ss_sp=0xc0272c0000, ss_flags=0, ss_size=32768}, NULL) = 0
[pid 108096] rt_sigprocmask(SIG_SETMASK, ~[HUP INT QUIT ILL TRAP ABRT BUS FPE KILL SEGV TERM STKFLT CHLD STOP URG PROF SYS RTMIN RT_1 RT_2], NULL, 8) = 0
[pid 108096] gettid()                   = 223
[pid 108096] rt_sigprocmask(SIG_UNBLOCK, [SEGV], NULL, 8) = 0
[pid 108096] rt_sigaction(SIGSEGV, {SIG_DFL, ~[], SA_RESTORER|SA_STACK|SA_RESTART|SA_SIGINFO, 0x7f695ef1d630}, NULL, 8) = 0
[pid 108096] getpid()                   = 1
[pid 108096] gettid()                   = 223
[pid 108096] tgkill(1, 223, SIGSEGV)    = 0
[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=1, si_uid=0} ---
[pid 108096] nanosleep({0, 1000000}, NULL) = 0
[pid 108096] rt_sigaction(SIGSEGV, {0x471e80, ~[], SA_RESTORER|SA_STACK|SA_RESTART|SA_SIGINFO, 0x7f695ef1d630}, NULL, 8) = 0
[pid 108096] rt_sigprocmask(SIG_SETMASK, ~[], NULL, 8) = 0
[pid 108096] sigaltstack({ss_sp=0, ss_flags=SS_DISABLE, ss_size=0}, NULL) = 0
[pid 108096] rt_sigprocmask(SIG_SETMASK, ~[KILL STOP], NULL, 8) = 0
[pid 108096] rt_sigreturn()             = 0
[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x51ebcdf0} ---
[pid 108096] rt_sigprocmask(SIG_SETMASK, NULL, ~[KILL STOP], 8) = 0
[pid 108096] rt_sigprocmask(SIG_SETMASK, ~[], NULL, 8) = 0
[pid 108096] sigaltstack(NULL, {ss_sp=0, ss_flags=SS_DISABLE, ss_size=0}) = 0
[pid 108096] sigaltstack({ss_sp=0xc0272c0000, ss_flags=0, ss_size=32768}, NULL) = 0
[pid 108096] rt_sigprocmask(SIG_SETMASK, ~[HUP INT QUIT ILL TRAP ABRT BUS FPE KILL SEGV TERM STKFLT CHLD STOP URG PROF SYS RTMIN RT_1 RT_2], NULL, 8) = 0
[pid 108096] gettid()                   = 223
[pid 108096] rt_sigprocmask(SIG_UNBLOCK, [SEGV], NULL, 8) = 0
[pid 108096] rt_sigaction(SIGSEGV, {SIG_DFL, ~[], SA_RESTORER|SA_STACK|SA_RESTART|SA_SIGINFO, 0x7f695ef1d630}, NULL, 8) = 0
[pid 108096] getpid()                   = 1
[pid 108096] gettid()                   = 223
[pid 108096] tgkill(1, 223, SIGSEGV)    = 0
[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=1, si_uid=0} ---
...
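
The loop in the trace above can be summarized mechanically. The following is a minimal sketch, not part of the original report: it counts strace signal-delivery lines by si_code, so that a repeating pattern of kernel-generated faults (SEGV_MAPERR) alternating with self-sent signals (SI_TKILL) stands out in a long capture. The names countDeliveries, sigLine, and sample are hypothetical, and the regular expression assumes the strace line format shown above.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// sigLine matches strace signal-delivery lines such as
// "[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, ...} ---".
// Assumption: the capture uses the same line format as the trace in this issue.
var sigLine = regexp.MustCompile(`--- (SIG[A-Z0-9]+) \{si_signo=SIG[A-Z0-9]+, si_code=([A-Z_]+)`)

// countDeliveries returns how often each "signal/si_code" pair is delivered.
func countDeliveries(trace string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(trace))
	for sc.Scan() {
		if m := sigLine.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]+"/"+m[2]]++
		}
	}
	return counts
}

// sample is a shortened excerpt of the strace output quoted in this issue.
const sample = `[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x51ebcdf0} ---
[pid 108096] tgkill(1, 223, SIGSEGV)    = 0
[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=1, si_uid=0} ---
[pid 108096] rt_sigreturn()             = 0
[pid 108096] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x51ebcdf0} ---`

func main() {
	// Map iteration order is not deterministic, so the print order may vary.
	for key, n := range countDeliveries(sample) {
		fmt.Printf("%-24s %d\n", key, n)
	}
}
```

On a full capture, roughly equal counts for the two si_code values would be consistent with the handler re-raising the signal after every fault and then returning to the faulting instruction. The trace also shows getpid() = 1; whether running as PID 1 inside the container is the reason the re-raised signal does not terminate the process is an assumption that this thread does not confirm.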

What did you expect to see?

Panic.

What did you see instead?

The process hung; there was no crash or panic.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Nov 8, 2022
@cocktail828
Author

@ianlancetaylor could you help me with this? Thanks very much.

@tarikkilic

We faced the same issue. However, when we ran go test -msan, an error like this occurred:

gcc: error: unrecognized argument to -fsanitize= option: 'memory'

It happens sporadically.

@cocktail828
Author

@tarikkilic GCC does not support -fsanitize=memory. Try using clang instead.

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 8, 2022
@mknyszek mknyszek added this to the Backlog milestone Nov 8, 2022
@mknyszek
Contributor

mknyszek commented Nov 8, 2022

Am I understanding correctly that the issue is occurring in Go 1.10.3? That version has not been supported for a very long time. Can you reproduce the issue with the latest release of Go?

CC @golang/runtime

@cocktail828
Author

Yes. The issue occurs in our production environment. We are recompiling the Docker images, but it may take several days to update all service nodes and, if the issue is not fixed, to reproduce it.
Thanks for your reply.

@cocktail828 cocktail828 changed the title runtime: cgo stucked because of go signal handler went into dead loop runtime: cgo stuck because of go signal handler went into dead loop Nov 8, 2022
@joedian

joedian commented Nov 15, 2022

@cocktail828 can you confirm if the issue is still happening after recompiling?

@cocktail828
Author

@cocktail828 can you confirm if the issue is still happening after recompiling?

Sorry for the late response. It has not happened again so far. I will keep tracking the issue. If it does not happen for 2 weeks, I will close the issue myself.

@seankhliao seankhliao added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Dec 17, 2022
@cocktail828
Author

@joedian Sorry to bother you. The issue reproduced again (Go 1.19) under heavy traffic. I have updated this issue with the latest info.

@ianlancetaylor ianlancetaylor removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 21, 2022
@mknyszek
Contributor

Great! Thanks for reproducing. Do you have a way for us to reproduce this ourselves? That would be a big help in moving forward on this, since otherwise I don't think we see (in triage) the root cause here.

@mknyszek mknyszek added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 21, 2022
@cocktail828
Author

I will try to simplify the reproduction conditions. Until then, I have no quick way to reproduce it.

@seankhliao seankhliao added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Dec 26, 2022
@gopherbot
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)


7 participants