Jaeger is OOMKilled when use badger as storage #2987

Closed
Sallyan opened this issue May 11, 2021 · 7 comments
Labels
area/storage storage/badger Issues related to badger storage

Comments

Sallyan commented May 11, 2021

Version: all-in-one:1.18.1
Issue:
When I use Badger as storage, Jaeger uses much more memory than with in-memory storage, and the pod keeps getting OOMKilled.
With in-memory storage, Jaeger initially uses only on the order of a hundred MB of memory, but after switching to Badger it uses over 5Gi.
Jaeger YAML:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: tracing-jaeger
  namespace: kyma-system
spec:
  agent:
    config: {}
    options: {}
    resources: {}
  allInOne:
    config: {}
    image: eu.gcr.io/kyma-project/external/jaegertracing/all-in-one:1.18.1
    options:
      log-level: info
    resources: {}
  annotations:
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
  collector:
    config: {}
    options: {}
    resources: {}
  ingester:
    config: {}
    options: {}
    resources: {}
  ingress:
    enabled: false
    openshift: {}
    options: {}
    resources: {}
    security: none
  query:
    options: {}
    resources: {}
  resources:
    limits:
      cpu: 500m
      memory: 8Gi
    requests:
      cpu: 200m
      memory: 6Gi
  sampling:
    options: {}
  storage:
    cassandraCreateSchema: {}
    dependencies:
      resources: {}
      schedule: 55 23 * * *
    elasticsearch:
      nodeCount: 3
      redundancyPolicy: SingleRedundancy
      resources:
        limits:
          memory: 16Gi
        requests:
          cpu: "1"
          memory: 16Gi
      storage: {}
    esIndexCleaner:
      numberOfDays: 7
      resources: {}
      schedule: 55 23 * * *
    esRollover:
      resources: {}
      schedule: 0 0 * * *
    options:
      badger:
        directory-key: /badger/key
        directory-value: /badger/data
        ephemeral: false
        span-store-ttl: 24h
        truncate: true
      cassandra:
        keyspace: jaeger_v1_datacenter3
        servers: cassandra.default.svc
      es:
        server-urls: http://elasticsearch-client.default.svc:9200
      memory:
        max-traces: 10000
    type: badger
  strategy: allinone
  ui:
    options:
      dependencies:
        menuEnabled: true
      menu:
      - items:
        - label: Documentation
          url: https://www.jaegertracing.io/docs/latest
        label: About Jaeger
      - items:
        - label: Documentation
          url: https://kyma-project.io/docs/components/tracing/
        label: About Kyma
  volumeMounts:
  - mountPath: /badger
    name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: jaeger-pvc

Sallyan commented May 11, 2021

[Image: Screen Shot 2021-05-11 at 4 02 42 PM]

Memory usage increased from MB to Gi after changing from in-memory storage to Badger.

@jpkrohling
Contributor

I think this isn't related to the operator, but more general to Jaeger and its usage of Badger.

jpkrohling transferred this issue from jaegertracing/jaeger-operator on May 11, 2021
@jpkrohling
Contributor

We have a few changes in the queue for badger as storage for Jaeger, but it might take a while for us to work on it. If this is critical to you, I can point you to the places in the code for you to take a look at.

jpkrohling added the area/storage and storage/badger labels on May 11, 2021

Sallyan commented May 12, 2021

we really want to have persistent data for the tracing.
Do you have any suggestions or some kind of workaround for Badger?
Yes, please also point me to the code. Thanks!

@jpkrohling
Contributor

we really want to have persistent data for the tracing.

You can use Cassandra or Elasticsearch for that, which are actually recommended if you need to scale your deployment...

In any case, here's the badger code: https://github.com/jaegertracing/jaeger/tree/master/plugin/storage/badger

@jpkrohling
Contributor

This should have been fixed by #3096. If you are still experiencing this, feel free to reopen.

rmannibucau commented Oct 25, 2023

I still see this behavior with the all-in-one image v1.49. The weird thing is that the OOM happens at startup with no load at all (probably when badger is being reopened).

Edit: it seems entries under /key are not evicted, whereas /data is almost empty:

12G	/opt/jaeger-badger/key
16K	/opt/jaeger-badger/data
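
For anyone who wants to repeat this check from a script rather than with du, here is a minimal Python sketch. It assumes the same layout as above (a base directory with `key` and `data` subdirectories); the helper names `dir_size_bytes` and `badger_usage` are made up for illustration and are not part of Jaeger or Badger. It sums apparent file sizes, so the numbers can differ somewhat from du, which counts disk blocks.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def badger_usage(base):
    """Return {subdir: bytes} for the 'key' and 'data' subdirectories of base."""
    return {sub: dir_size_bytes(os.path.join(base, sub)) for sub in ("key", "data")}


if __name__ == "__main__":
    import sys

    # Default path is the one from the comment above; pass another as argv[1].
    base = sys.argv[1] if len(sys.argv) > 1 else "/opt/jaeger-badger"
    for sub, size in badger_usage(base).items():
        print(f"{size / 2**20:10.1f} MiB  {os.path.join(base, sub)}")
```

A `key` directory that is orders of magnitude larger than `data`, as in the output above, is the pattern to look for.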

Mystery solved: spring-boot with hikari+sleuth triggers spans even for connection.isValid(), which leads to filling badger too quickly if the timeout and pool size are high enough. Sorry for the noise.
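
To get a feel for how quickly validation spans can pile up, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names, the pool size, the number of validation calls per connection per minute, and the bytes-per-span figure are all hypothetical, and the 24h retention simply mirrors the span-store-ttl value from the config at the top of this issue. It only multiplies arrival rate by retention window and does not model how Badger actually stores or compacts data.

```python
def spans_retained(pool_size, checks_per_conn_per_min, ttl_hours):
    """Spans alive at steady state: arrival rate multiplied by the retention window."""
    spans_per_min = pool_size * checks_per_conn_per_min
    return spans_per_min * 60 * ttl_hours


def approx_bytes(span_count, bytes_per_span):
    """Very rough storage estimate; bytes_per_span is a guess, not a measured value."""
    return span_count * bytes_per_span


if __name__ == "__main__":
    # Hypothetical inputs: 50 pooled connections, 12 validations per connection
    # per minute, 24h TTL, and a guessed 1000 bytes stored per span.
    count = spans_retained(pool_size=50, checks_per_conn_per_min=12, ttl_hours=24)
    print(count, "spans retained,", approx_bytes(count, 1000), "bytes (rough)")
```

With these made-up inputs, validation calls alone account for 864,000 retained spans before any real application traffic is counted, so lowering the validation frequency or excluding those calls from tracing is worth checking first.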
