Bug Report
Describe the bug
Since we migrated to fluent-bit 1.4.2, a core dump is generated every time the fluent-bit pod is restarted or shut down.
To Reproduce
- Example log message if applicable:
{"log":"Generated core dump file: /var/lib/systemd/coredump/core.fluent-bit.1003060000.457ffee6fdf34f108294e8ab87adc013.74848.1587581403000000.xz","stream":"stdout","time":"2020-04-22T18:50:07.407147Z"}
- Steps to reproduce the problem:
- Set up core-dump monitoring so that a core dump produced when a pod is killed is captured. In our case, Linux is configured to invoke systemd-coredump whenever a container process dumps core, so that cores are not lost when a pod or container restarts.
- Use the official fluent-bit 1.4.2 Docker image as the base image, with in_tail as the input plugin and a custom output plugin written in Go using fluent-bit-go. Build a Docker image and deploy it to a Kubernetes/OpenShift cluster. After the pod is up, run:
kubectl delete pod <fluent-bit-pod>
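For anyone wiring up similar core-dump monitoring, here is a minimal Python sketch that splits the core file name from the log message above into its fields. The layout `core.<comm>.<uid>.<boot-id>.<pid>.<timestamp-in-microseconds>.<compression>` is an assumption inferred from that file name, and `parse_core_name` is a hypothetical helper, not part of fluent-bit or systemd.

```python
import re
from datetime import datetime, timezone

# Assumed layout (inferred from the file name in the log message):
#   core.<comm>.<uid>.<boot-id>.<pid>.<timestamp-us>[.<compression>]
CORE_NAME = re.compile(
    r"core\.(?P<comm>.+)\.(?P<uid>\d+)\.(?P<boot_id>[0-9a-f]{32})"
    r"\.(?P<pid>\d+)\.(?P<timestamp_us>\d+)(?:\.(?P<ext>\w+))?$"
)


def parse_core_name(path):
    """Return the fields encoded in a core file name, or None if it does not match."""
    name = path.rsplit("/", 1)[-1]  # drop the directory part
    match = CORE_NAME.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    fields["uid"] = int(fields["uid"])
    fields["pid"] = int(fields["pid"])
    fields["timestamp_us"] = int(fields["timestamp_us"])
    # Convert microseconds since the epoch to an aware UTC datetime.
    fields["time"] = datetime.fromtimestamp(
        fields["timestamp_us"] // 1_000_000, tz=timezone.utc
    )
    return fields


info = parse_core_name(
    "/var/lib/systemd/coredump/core.fluent-bit.1003060000."
    "457ffee6fdf34f108294e8ab87adc013.74848.1587581403000000.xz"
)
print(info["comm"], info["pid"], info["time"].isoformat())
```

With the file name from this report, the decoded time is 2020-04-22T18:50:03 UTC, a few seconds before the timestamp of the log line, which is consistent with the core being written during pod shutdown.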
Expected behavior
Pod restart/shutdown properly without core dump generated.
Screenshots
Your Environment
- Version used: fluent-bit 1.4.2
- Configuration:
[SERVICE]
Flush 1
Daemon off
Log_Level debug
[INPUT]
Name tail
Path /etc/fluent-bit/fdf/fluent-bit_fdf*
Refresh_Interval 2
[OUTPUT]
Name fdf_prom_plugin
Match *
where fdf_prom_plugin is written using fluent-bit-go
- Environment name and version (e.g. Kubernetes? What version?): OpenShift 3.7/3.11 (Kubernetes 1.7/1.11)
- Server type and version:
- Operating System and version: Red Hat Enterprise Linux 7.4 and above
- Filters and plugins: see above
Additional context
I have the stack trace from the core dump. It looks like there is a bug in the signal handling during shutdown. Here is the stack trace:
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /fluent-bit/bin/fluent-bit...done.
[New LWP 1]
[New LWP 10]
[New LWP 12]
[New LWP 7]
[New LWP 8]
[New LWP 9]
[New LWP 11]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/config.flb -e /etc/plugin/out_plu'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f6c75ef2a17 in abort () from /lib64/libc.so.6
warning: Missing auto-load scripts referenced in section .debug_gdb_scripts
of file /etc/plugin/out_plugin.so
Use `info auto-load python [REGEXP]' to list them.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7.x86_64 libgcc-4.8.5-11.el7.x86_64
(gdb) bt
#0 0x00007f6c75ef2a17 in abort () from /lib64/libc.so.6
#1 0x000000000042b374 in flb_signal_handler ()
#2 0x00007f6c752bd81d in runtime.sigfwd () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:286
#3 0x00007fffe7de3cd8 in ?? ()
#4 0x00007f6c752a47fb in runtime.sigfwdgo (sig=11, info=0x7fffe7de3f30, ctx=0x7fffe7de3e00, ~r3=false) at /usr/lib/golang/src/runtime/signal_unix.go:630
#5 0x00007f6c752a3b1b in runtime.sigtrampgo (sig=11, info=0x7fffe7de3f30, ctx=0x7fffe7de3e00) at /usr/lib/golang/src/runtime/signal_unix.go:272
#6 0x00007f6c752bd873 in runtime.sigtramp () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:306
#7 <signal handler called>
#8 0x0000000000000000 in ?? ()
#9 0x00000000004578af in flb_proxy_cb_exit ()
#10 0x000000000043cee4 in flb_output_exit ()
#11 0x00000000004476e0 in flb_engine_shutdown ()
#12 0x0000000000447532 in flb_engine_start ()
#13 0x000000000042c4ee in main ()
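One way to read the trace: frame #8 sits at address 0x0000000000000000 and is called from flb_proxy_cb_exit, which usually means a call was made through a null function pointer. This is an interpretation of the backtrace, not a confirmed diagnosis. The Python sketch below scans a `bt` listing for such frames and reports their callers; `find_null_callers` is a hypothetical helper and the input is an excerpt of the frames above.

```python
import re

# Matches GDB backtrace lines such as:
#   #9 0x00000000004578af in flb_proxy_cb_exit ()
FRAME = re.compile(
    r"^#(?P<num>\d+)\s*(?:(?P<addr>0x[0-9a-f]+) in )?(?P<func>\S+)?"
)

# Excerpt of the backtrace from this report.
BACKTRACE = """\
#1 0x000000000042b374 in flb_signal_handler ()
#8 0x0000000000000000 in ?? ()
#9 0x00000000004578af in flb_proxy_cb_exit ()
#10 0x000000000043cee4 in flb_output_exit ()
"""


def find_null_callers(backtrace):
    """Return the function names of frames that called into address 0."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((match["addr"], match["func"]))

    callers = []
    for index, (addr, _func) in enumerate(frames):
        # A frame whose program counter is 0 was entered through a null
        # pointer; the next frame in the listing is its caller.
        if addr is not None and int(addr, 16) == 0 and index + 1 < len(frames):
            callers.append(frames[index + 1][1])
    return callers


print(find_null_callers(BACKTRACE))
```

If this reading is right, the place to look is whichever callback flb_proxy_cb_exit invokes on the Go plugin at shutdown, and whether that pointer can be unset.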