Fluentbit TLS SIGSEGV error #1029

@dtwork2022

Bugs should be filed for issues encountered whilst operating logging-operator.
You should first attempt to resolve your issues through the community support
channels, e.g. the #logging-operator Slack channel, in order to rule out individual configuration errors.
Please provide as much detail as possible.

Describe the bug:
When TLS is enabled for Fluent Bit using the latest banzaicloud chart (3.17.6), the Fluent Bit pods go into CrashLoopBackOff and log the following errors:

[2022/05/31 17:40:06] [engine] caught signal (SIGSEGV)
[2022/05/31 17:40:06] [engine] caught signal (SIGSEGV)
#0  0x55d4a79abaa0      in  flb_tls_session_create() at src/tls/flb_tls.c:334
#1  0x55d4a79abaa0      in  flb_tls_session_create() at src/tls/flb_tls.c:334
#2  0x55d4a79b7699      in  flb_io_net_connect() at src/flb_io.c:109
#3  0x55d4a79b7699      in  flb_io_net_connect() at src/flb_io.c:109
#4  0x55d4a7995d13      in  create_conn() at src/flb_upstream.c:560
#5  0x55d4a799620f      in  flb_upstream_conn_get() at src/flb_upstream.c:705
#6  0x55d4a7995d13      in  create_conn() at src/flb_upstream.c:560
#7  0x55d4a799620f      in  flb_upstream_conn_get() at src/flb_upstream.c:705
#8  0x55d4a7a29c71      in  cb_forward_flush() at plugins/out_forward/forward.c:1182
#9  0x55d4a7a29c71      in  cb_forward_flush() at plugins/out_forward/forward.c:1182
#10 0x55d4a797fbe8      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:517
#10 0x55d4a797fbe8      in  output_pre_cb_flush() at include/fluent-bit/flb_output.h:517
#12 0x55d4a7ec0d66      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#13 0xffffffffffffffff  in  ???() at ???:0
#14 0x55d4a7ec0d66      in  co_init() at lib/monkey/deps/flb_libco/amd64.c:117
#15 0xffffffffffffffff  in  ???() at ???:0

Expected behaviour:
When TLS is enabled, Fluent Bit should work and should not crash.

Steps to reproduce the bug:
Deploy the banzaicloud logging-operator-logging chart, version 3.17.6, with TLS enabled. In this config I disabled TLS verification:

fluentd:
  fluentLogDestination: "forward\n  <server>\n  name localhost\n  host 127.0.0.1\n  port 24240\n</server>"
  scaling:
    replicas: 3
  resources:
    limits:
      cpu: 2000m
      memory: 2000Mi
    requests:
      cpu: 2000m
      memory: 2000Mi
  metrics:
    serviceMonitor: true
fluentbit:
  inputTail:
    Mem_Buf_Limit: "10MB"
  filterKubernetes:
    Cache_Use_Docker_Id: "On"
    Buffer_Size: "64k"
  enableUpstream: true
  positiondb:
    hostPath:
      path: /var/lib/fluent-bit
  resources:
    limits:
      cpu: 400m
      memory: 500Mi
    requests:
      cpu: 400m
      memory: 500Mi
  metrics:
    serviceMonitor: true
disablePvc: false
enableHostPath: false
tls:
  enabled: true
  verify: false

Environment details:

  • Kubernetes version: 1.18.6
  • Cloud-provider/provisioner: Rancher / RKE, on-premise
  • logging-operator version: 3.17.6
  • Install method: helm
  • Logs from the misbehaving component (and any other relevant logs): pasted above in the bug description
  • Resource definition that caused the issue, without sensitive data: pasted above in the reproduction steps

/kind bug
