
bug: S3 output configuration isn't getting passed into fluent-bit #1398

Open

mike12806 opened this issue Nov 5, 2024 · 4 comments

Comments

mike12806 commented Nov 5, 2024

Describe the issue

I'm using the fluent-bit operator helm chart and I have two outputs enabled, Elasticsearch and S3. When the fluent-bit pods start (with debug logging), I can see that es and stdout are configured as outputs, but not S3. Unfortunately, the helm chart deployment works fine, and I can't find any errors related to the S3 output.

Note: I'm using Minio as a locally hosted (but outside my Kubernetes cluster) S3 endpoint. I validated that I can reach Minio with an S3-compatible client from the same cluster and namespace.

To Reproduce

Deploy the fluent-bit operator helm chart with ES and S3 enabled as outputs.

Expected behavior

Both ES and S3 should work as outputs from fluent-bit

Your Environment

- Fluent Operator version: 3.2.0
- Container Runtime: containerd / kubernetes
- Operating system: Linux / Ubuntu
- Kernel version: 6.8.0-48-generic

How did you install fluent operator?

Helm Chart

Additional context

This is my values.yaml configuration:

Kubernetes: true
containerRuntime: containerd
fluentbit:
  additionalVolumes: []
  additionalVolumesMounts: []
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-role.kubernetes.io/edge
                operator: DoesNotExist
  annotations: {}
  args: []
  command: []
  crdsEnable: true
  disableLogVolumes: false
  enable: true
  envVars: []
  filter:
    containerd:
      enable: true
    kubernetes:
      annotations: false
      enable: true
      labels: false
    multiline:
      emitterMemBufLimit: 120
      enable: false
      keyContent: log
      parsers:
        - go
        - python
        - java
    systemd:
      enable: true
  hostNetwork: false
  image:
    repository: ghcr.io/fluent/fluent-operator/fluent-bit
    tag: v3.1.8
  imagePullSecrets: []
  initContainers: []
  input:
    fluentBitMetrics: {}
    nodeExporterMetrics: {}
    systemd:
      enable: true
      includeKubelet: true
      path: /var/log/journal
      pauseOnChunksOverlimit: 'off'
      storageType: filesystem
      stripUnderscores: 'off'
      systemdFilter:
        enable: true
        filters: []
      LimitNOFILE: '20000'
      bufferMaxSize: 4GB
    tail:
      bufferMaxSize: 4GB
      enable: true
      memBufLimit: 100MB
      path: /var/log/containers/*.log
      pauseOnChunksOverlimit: 'off'
      readFromHead: false
      refreshIntervalSeconds: 30
      skipLongLines: true
      storageType: filesystem
  kubeedge:
    enable: false
    prometheusRemoteWrite:
      host: <cloud-prometheus-service-host>
      port: <cloud-prometheus-service-port>
  labels: {}
  logLevel: ''
  namespaceFluentBitCfgSelector: {}
  nodeSelector: {}
  output:
    es:
      bufferSize: 4GB
      enable: true
      host: elasticsearch.logging.svc.cluster.local
      logstashPrefix: ks-logstash-log
      port: 9200
      traceError: true
      logstashFormat: true
      retry: true
      retryBackoff: 10s
      retryLimit: false
      retryWait: 5
      storageType: filesystem
      suppressTypeName: 'On'
    kafka:
      brokers: <kafka broker list like xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092>
      enable: false
      topics: ks-log
    loki:
      enable: false
      host: 127.0.0.1
      httpPassword: mypass
      httpUser: myuser
      port: 3100
      tenantID: ''
    opensearch: {}
    opentelemetry: {}
    prometheusMetricsExporter: {}
    stackdriver: {}
    stdout:
      enable: true
    s3:
      aws_key_id: 
      aws_secret_key: 
      bucket: logarchive
      compression: gzip
      enable: true
      endpoint: http://192.168.1.85:9000
      force_path_style: true
      path: logs/${TAG}/%Y/%m/%d/
      region: us-east-1
      s3_key_format: ${TAG}/%Y/%m/%d/%H/%M/%S
      tls_verify: false
      total_file_size: 5M
      upload_timeout: 60s
  parsers:
    javaMultiline:
      enable: false
  podSecurityContext: {}
  priorityClassName: ''
  rbacRules: {}
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 10m
      memory: 25Mi
  schedulerName: ''
  secrets: []
  securityContext: {}
  service:
    storage:
      checksum: 'off'
      deleteIrrecoverableChunks: 'on'
      maxChunksUp: 128
      metrics: 'on'
      path: /tmp/fluent/logging/
      sync: normal
  serviceAccountAnnotations: {}
  serviceMonitor:
    enable: false
    interval: 30s
    metricRelabelings: []
    path: /api/v2/metrics/prometheus
    relabelings: []
    scrapeTimeout: 10s
    secure: false
    tlsConfig: {}
    backlogMemLimit: 256MB
  tolerations:
    - operator: Exists
fluentd:
  crdsEnable: true
  enable: false
  envVars: []
  extras: {}
  forward:
    port: 24224
  image:
    repository: ghcr.io/fluent/fluent-operator/fluentd
    tag: v1.17.0
  imagePullSecrets: []
  logLevel: ''
  mode: collector
  name: fluentd
  output:
    es:
      buffer:
        enable: true
        path: /tmp/fluent/logging/es
        type: file
        maxBytes: 4GB
      enable: false
      host: elasticsearch-logging-data.kubesphere-logging-system.svc
      logstashPrefix: ks-logstash-log
      port: 9200
    kafka:
      brokers: >-
        my-cluster-kafka-bootstrap.default.svc:9091,my-cluster-kafka-bootstrap.default.svc:9092,my-cluster-kafka-bootstrap.default.svc:9093                                                        
      buffer:
        enable: true
        path: /tmp/fluent/logging/kafka
        type: file
        maxBytes: 4GB
      enable: false
      topicKey: kubernetes_ns
    opensearch: {}
  podSecurityContext: {}
  port: 24224
  priorityClassName: ''
  replicas: 1
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 100m
      memory: 128Mi
  schedulerName: ''
  securityContext: {}
  watchedNamespaces:
    - kube-system
    - default
fullnameOverride: ''
nameOverride: ''
namespaceOverride: ''
operator:
  annotations: {}
  container:
    repository: kubesphere/fluent-operator
    tag: v3.2.0
  disableComponentControllers: fluentd
  enable: true
  extraArgs: []
  imagePullSecrets: []
  initcontainer:
    repository: docker
    resources:
      limits:
        cpu: 500m
        memory: 1024Mi
      requests:
        cpu: 250m
        memory: 512Mi
    tag: '20.10'
  labels: {}
  logPath:
    containerd: /var/log
  nodeSelector: {}
  podSecurityContext: {}
  priorityClassName: ''
  resources:
    limits:
      cpu: 500m
      memory: 1024Mi
    requests:
      cpu: 100m
      memory: 512Mi
  securityContext: {}
  tolerations: []

Here are the logs I see on fluent-bit pod startup; it looks like S3 isn't being picked up as an output:

[2024/11/05 17:28:10] [ info] [output:es:es.0] worker #0 started
[2024/11/05 17:28:10] [ info] [output:es:es.0] worker #1 started
[2024/11/05 17:28:10] [ info] [output:stdout:stdout.1] worker #0 started
cw-Guo (Collaborator) commented Nov 5, 2024

Can you check the secret created by the operator? You can get the full fluent-bit configuration file from it.
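
It may also help to check whether a ClusterOutput resource for S3 exists at all (e.g. kubectl get clusteroutputs), since the operator only renders outputs that exist as ClusterOutput objects matching its selector; if none was created, the problem is likely in the chart templates rather than in the operator. The sketch below shows roughly what to look for — the spec.s3 field names, the resource name, and the selector label are illustrative guesses mirroring the helm values above, not the authoritative CRD schema:

# Sketch only: apiVersion/kind are the fluent-operator CRDs; the spec.s3 field
# names below are guesses derived from the helm values, not the exact schema.
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: s3
  labels:
    fluentbit.fluent.io/enabled: "true"   # assumed selector label used by the chart
spec:
  matchRegex: (?:kube|service)\.(.*)
  s3:
    bucket: logarchive
    region: us-east-1
    endpoint: http://192.168.1.85:9000
    totalFileSize: 5M
    uploadTimeout: 60s
    s3KeyFormat: ${TAG}/%Y/%m/%d/%H/%M/%S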

mike12806 (Author) commented Nov 6, 2024

@cw-Guo Here's the fluent-bit config secret (base64 decoded). Interestingly, it looks like the S3 output isn't getting passed in. I even tried deleting the secret and re-deploying the helm chart for the operator to see if it would re-generate properly, but no luck.

Guessing something is either busted in Rancher (which I'm using to interact with the helm chart via a Rancher "app"), or something in the helm template is borked?

[Service]
    Http_Server    true
    Parsers_File    /fluent-bit/etc/parsers.conf
    Parsers_File    /fluent-bit/config/parsers_multiline.conf
    storage.path    /tmp/fluent/logging/
    storage.sync    normal
    storage.checksum    off
    storage.metrics    on
    storage.max_chunks_up    128
    storage.delete_irrecoverable_chunks    on
[Input]
    Name    systemd
    Path    /var/log/journal
    DB    /fluent-bit/tail/systemd.db
    DB.Sync    Normal
    Tag    service.*
    Systemd_Filter    _SYSTEMD_UNIT=containerd.service
    Systemd_Filter    _SYSTEMD_UNIT=kubelet.service
    Strip_Underscores    off
    storage.type    filesystem
    storage.pause_on_chunks_overlimit    off
[Input]
    Name    tail
    Buffer_Max_Size    4GB
    Path    /var/log/containers/*.log
    Read_from_Head    false
    Refresh_Interval    30
    Skip_Long_Lines    true
    DB    /fluent-bit/tail/pos.db
    DB.Sync    Normal
    Mem_Buf_Limit    100MB
    Parser    cri
    Tag    kube.*
    storage.type    filesystem
    storage.pause_on_chunks_overlimit    off
[Filter]
    Name    lua
    Match    kube.*
    script    /fluent-bit/config/containerd.lua
    call    containerd
    time_as_table    true
[Filter]
    Name    kubernetes
    Match    kube.*
    Kube_URL    https://kubernetes.default.svc:443
    Kube_CA_File    /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File    /var/run/secrets/kubernetes.io/serviceaccount/token
    K8S-Logging.Exclude    true
    Labels    false
    Annotations    false
[Filter]
    Name    nest
    Match    kube.*
    Operation    lift
    Nested_under    kubernetes
    Add_prefix    kubernetes_
[Filter]
    Name    modify
    Match    kube.*
    Remove    stream
    Remove    kubernetes_pod_id
    Remove    kubernetes_host
    Remove    kubernetes_container_hash
[Filter]
    Name    nest
    Match    kube.*
    Operation    nest
    Wildcard    kubernetes_*
    Nest_under    kubernetes
    Remove_prefix    kubernetes_
[Filter]
    Name    lua
    Match    service.*
    script    /fluent-bit/config/systemd.lua
    call    add_time
    time_as_table    true
[Output]
    Name    es
    Match_Regex    (?:kube|service)\.(.*)
    Host    elasticsearch.logging.svc.cluster.local
    Port    9200
    Buffer_Size    4GB
    Logstash_Format    true
    Logstash_Prefix    ks-logstash-log
    Time_Key    @timestamp
    Generate_ID    true
    Write_Operation    create
    Replace_Dots    false
    Trace_Error    true
    Suppress_Type_Name    On
[Output]
    Name    stdout
    Match    *
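
For comparison, if the S3 output had been rendered, I'd expect the secret to also contain a section roughly like the sketch below. The parameter names follow fluent-bit's out_s3 plugin and mirror my helm values; the exact keys the operator emits may differ:

# Sketch of the missing section, using fluent-bit out_s3 option names
[Output]
    Name    s3
    Match_Regex    (?:kube|service)\.(.*)
    bucket    logarchive
    region    us-east-1
    endpoint    http://192.168.1.85:9000
    total_file_size    5M
    upload_timeout    60s
    s3_key_format    ${TAG}/%Y/%m/%d/%H/%M/%S
    compression    gzip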

benjaminhuo (Member) commented

cc @wenchajun

mike12806 (Author) commented

@wenchajun any insight here? Thanks!
