
Persistent Queue Size is being limited #30770

Closed · JustinMason opened this issue Jan 25, 2024 · 1 comment
Labels: bug (Something isn't working), needs triage (New item requiring triage)

JustinMason commented Jan 25, 2024

Component(s)

ExporterHelper
FileStorage
ClickHouseExporter

What happened?

Description

I am trying to test the file storage configuration for the ClickHouse exporter.
No matter what queue size I set and how much PVC storage I allocate, I get the following error at around ~1,500 batches:
error exporterhelper/queue_sender.go:213 Dropping data because sending_queue is full. Try increasing queue_size. {"kind": "exporter", "data_type": "metrics", "name": "clickhouse", "dropped_items": 10000}

Steps to Reproduce

1. Configure the ClickHouse exporter with file_storage as the sending_queue storage.
2. Configure the batch processor and set a large queue_size (100k+).
3. Get the pipeline flowing at 10,000 metrics per second.
4. Shut down ClickHouse.
5. Watch the send queue size grow:
   max(otelcol_exporter_queue_size{exporter=~".*",job="otel"}) by (exporter)
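
If Prometheus isn't scraping the collector yet, the same gauge can also be read straight off the collector's own telemetry endpoint (port 8888 in the configuration below). A minimal sketch, assuming kubectl access; the pod name is taken from the log prefix further down and may differ in your setup:

    # Forward the collector's internal metrics port to the local machine
    kubectl port-forward pod/prom-to-clickhouse-metrics-collector-0 8888:8888
    # In another shell: grab just the queue-size gauge for each exporter
    curl -s http://localhost:8888/metrics | grep otelcol_exporter_queue_size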

Expected Result

The queue size continues to grow until queue_size is reached.

Actual Result

At around 1.5k batches, this error is raised:
batch_processor.go:258 Sender failed {"kind": "processor", "name": "batch", "pipeline": "metrics", "error": "write /etc/otel-collector/buffer/exporter_clickhouse__metrics: no space left on device"}

I have tried increasing (doubling) the PVC and using different batch sizes, but it always hits this threshold and stops persisting.
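
A quick way to distinguish "queue_size reached" from "volume full" is to check free space on the buffer mount from inside the pod. A rough sketch, assuming kubectl access and that the container image ships a shell with df (the stock contrib image is minimal, so a debug container may be needed instead):

    # Show how much of the PVC-backed mount is actually used/free
    kubectl exec prom-to-clickhouse-metrics-collector-0 -- df -h /etc/otel-collector/buffer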

Collector version

v0.92.0

Environment information

Environment

GKE, OpenTelemetry Operator 0.46.0

OpenTelemetry Collector configuration

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: '0.0.0.0:4317'
            tls:
              cert_file: '/opt/certs/tls.crt'
              key_file: '/opt/certs/tls.key'
              ca_file: '/opt/certs/ca.crt'
    processors:
      batch:
        send_batch_size: 10000
        timeout: '500ms'
      resourcedetection:
        detectors:
          - 'gcp'
        timeout: '10s'
      k8sattributes:
        auth_type: 'serviceAccount'
        extract:
          metadata:
            - 'k8s.namespace.name'
            - 'k8s.pod.name'
            - 'k8s.pod.start_time'
            - 'k8s.pod.uid'
            - 'k8s.deployment.name'
            - 'k8s.node.name'
    exporters:
      clickhouse:
        endpoint: 'my-endpoints'
        database: 'otel'
        username: 'oteladmin'
        password: ''
        ttl: 0
        logs_table_name: 'otel_logs_no_replica'
        traces_table_name: 'otel_traces_no_replica'
        metrics_table_name: 'otel_metrics'
        timeout: '5s'
        sending_queue:
          storage: 'file_storage/otc'
          queue_size: 1000000
        retry_on_failure:
          enabled: true
          initial_interval: '5s'
          max_interval: '120s'
          max_elapsed_time: '0'
      debug:
        verbosity: 'basic'
    extensions:
      health_check: {}
      file_storage/otc:
        directory: '/etc/otel-collector/buffer'
        timeout: '1s'
        compaction:
          on_start: true
          on_rebound: true
          directory: '/tmp/'
        fsync: true
    service:
      extensions:
        - 'file_storage/otc'
      pipelines:
        metrics:
          receivers:
            - 'otlp'
          processors:
            - 'resourcedetection'
            - 'k8sattributes'
            - 'batch'
          exporters:
            - 'clickhouse'
      telemetry:
        metrics:
          address: '0.0.0.0:8888'
          level: 'normal'

Log output

2024-01-25T00:17:08.071Z    warn    [email protected]/batch_processor.go:258    Sender failed    {"kind": "processor", "name": "batch", "pipeline": "metrics", "error": "write /etc/otel-collector/buffer/exporter_clickhouse__metrics: no space left on device"}
prom-to-clickhouse-metrics-collector-0 2024-01-25T00:17:08.890Z    error    exporterhelper/queue_sender.go:213    Dropping data because sending_queue is full. Try increasing queue_size.    {"kind": "exporter", "data_type": "metrics", "name": "clickhouse", "dropped_items": 10000}

Additional context

I have tried increasing the PVC and various queue_size and send_batch_size values.
This error doesn't make sense; why would the queue be limited?

StatefulSet

  volumeMounts:
  - mountPath: /etc/otel-collector/buffer
    name: otel-collector-prom-to-clickhouse-metrics-pvc
  volumes:
  - name: otel-collector-prom-to-clickhouse-metrics-pvc
    persistentVolumeClaim:
      claimName: otel-collector-prom-to-clickhouse-metrics-pvc
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: otel-collector-prom-to-clickhouse-metrics-pvc
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      volumeMode: Filesystem
JustinMason (Author) commented

Never mind. My PVC wasn't actually getting increased even though I had updated the Collector's claim. 🤷
It seems to be working now; it could be useful to expose free storage bytes in addition to the send queue size.
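
For the free-storage suggestion above, one hedged option until such a metric exists is to verify the PVC's actual provisioned capacity and watch the kubelet volume stats, assuming those are scraped in the cluster:

    # Capacity the PVC was actually provisioned with (vs. the 10Gi requested in the claim template)
    kubectl get pvc otel-collector-prom-to-clickhouse-metrics-pvc -o jsonpath='{.status.capacity.storage}'
    # Remaining bytes on the bound volume, if kubelet metrics are available in Prometheus:
    #   kubelet_volume_stats_available_bytes{persistentvolumeclaim="otel-collector-prom-to-clickhouse-metrics-pvc"}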
