
Intermittent logstash pod pipeline stops with [DeadLetterQueueWriter] unable to finalize segment #15078

@ukho-cfreeman

Description


Logstash information:

  1. Logstash version: Logstash 8.7.1
  2. Logstash installation source: official Docker image
  3. How is Logstash being run: Kubernetes

Plugins installed:

logstash-codec-cef (6.2.6)
logstash-codec-collectd (3.1.0)
logstash-codec-dots (3.0.6)
logstash-codec-edn (3.1.0)
logstash-codec-edn_lines (3.1.0)
logstash-codec-es_bulk (3.1.0)
logstash-codec-fluent (3.4.2)
logstash-codec-graphite (3.0.6)
logstash-codec-json (3.1.1)
logstash-codec-json_lines (3.1.0)
logstash-codec-line (3.1.1)
logstash-codec-msgpack (3.1.0)
logstash-codec-multiline (3.1.1)
logstash-codec-netflow (4.3.0)
logstash-codec-plain (3.1.0)
logstash-codec-rubydebug (3.1.0)
logstash-filter-aggregate (2.10.0)
logstash-filter-anonymize (3.0.6)
logstash-filter-cidr (3.1.3)
logstash-filter-clone (4.2.0)
logstash-filter-csv (3.1.1)
logstash-filter-date (3.1.15)
logstash-filter-de_dot (1.0.4)
logstash-filter-dissect (1.2.5)
logstash-filter-dns (3.2.0)
logstash-filter-drop (3.0.5)
logstash-filter-elasticsearch (3.13.0)
logstash-filter-fingerprint (3.4.2)
logstash-filter-geoip (7.2.13)
logstash-filter-grok (4.4.3)
logstash-filter-http (1.4.3)
logstash-filter-json (3.2.0)
logstash-filter-json_encode (3.0.3)
logstash-filter-kv (4.7.0)
logstash-filter-memcached (1.1.0)
logstash-filter-metrics (4.0.7)
logstash-filter-mutate (3.5.6)
logstash-filter-prune (3.0.4)
logstash-filter-ruby (3.1.8)
logstash-filter-sleep (3.0.7)
logstash-filter-split (3.1.8)
logstash-filter-syslog_pri (3.2.0)
logstash-filter-throttle (4.0.4)
logstash-filter-translate (3.4.0)
logstash-filter-truncate (1.0.5)
logstash-filter-urldecode (3.0.6)
logstash-filter-useragent (3.3.4)
logstash-filter-uuid (3.0.5)
logstash-filter-xml (4.2.0)
logstash-input-azure_event_hubs (1.4.4)
logstash-input-beats (6.5.0)
└── logstash-input-elastic_agent (alias)
logstash-input-couchdb_changes (3.1.6)
logstash-input-dead_letter_queue (2.0.0)
logstash-input-elasticsearch (4.16.0)
logstash-input-exec (3.6.0)
logstash-input-file (4.4.4)
logstash-input-ganglia (3.1.4)
logstash-input-gelf (3.3.2)
logstash-input-generator (3.1.0)
logstash-input-graphite (3.0.6)
logstash-input-heartbeat (3.1.1)
logstash-input-http (3.6.1)
logstash-input-http_poller (5.4.0)
logstash-input-imap (3.2.0)
logstash-input-jms (3.2.2)
logstash-input-kinesis (2.2.1)
logstash-input-pipe (3.1.0)
logstash-input-redis (3.7.0)
logstash-input-snmp (1.3.1)
logstash-input-snmptrap (3.1.0)
logstash-input-stdin (3.4.0)
logstash-input-syslog (3.6.0)
logstash-input-tcp (6.3.2)
logstash-input-twitter (4.1.0)
logstash-input-udp (3.5.0)
logstash-input-unix (3.1.2)
logstash-integration-aws (7.1.1)
 ├── logstash-codec-cloudfront
 ├── logstash-codec-cloudtrail
 ├── logstash-input-cloudwatch
 ├── logstash-input-s3
 ├── logstash-input-sqs
 ├── logstash-output-cloudwatch
 ├── logstash-output-s3
 ├── logstash-output-sns
 └── logstash-output-sqs
logstash-integration-elastic_enterprise_search (2.2.1)
 ├── logstash-output-elastic_app_search
 └── logstash-output-elastic_workplace_search
logstash-integration-jdbc (5.4.1)
 ├── logstash-input-jdbc
 ├── logstash-filter-jdbc_streaming
 └── logstash-filter-jdbc_static
logstash-integration-kafka (10.12.0)
 ├── logstash-input-kafka
 └── logstash-output-kafka
logstash-integration-rabbitmq (7.3.1)
 ├── logstash-input-rabbitmq
 └── logstash-output-rabbitmq
logstash-output-csv (3.0.8)
logstash-output-elasticsearch (11.13.1)
logstash-output-email (4.1.1)
logstash-output-file (4.3.0)
logstash-output-graphite (3.1.6)
logstash-output-http (5.5.0)
logstash-output-lumberjack (3.1.9)
logstash-output-nagios (3.0.6)
logstash-output-null (3.0.5)
logstash-output-opensearch (2.0.0)
logstash-output-pipe (3.0.6)
logstash-output-redis (5.0.0)
logstash-output-stdout (3.1.4)
logstash-output-tcp (6.1.1)
logstash-output-udp (3.2.0)
logstash-output-webhdfs (3.0.6)
logstash-patterns-core (4.3.4)

JVM (e.g. java -version): none installed separately; Logstash uses the JDK bundled with the official Docker image

OS version : Linux 9c7bb12feea2 5.10.47-linuxkit #1 SMP Sat Jul 3 21:51:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

  • The Logstash pod stops processing but keeps running, repeating the log lines included below.
  • We have to kill the pod for processing to resume.
  • This is an intermittent problem that has occurred in all 5 of our environments since moving from 7.10.1 to 8.7.1.
  • It only occurs when the DLQ is enabled (a minimal sketch of the failing segment-finalization step follows this list).
  • It happens to one pod at a time.
  • The probability of it occurring appears to be positively correlated with the amount of load the Logstash pods are processing.
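To make the failure mode easier to follow: the DLQ writer finalizes a segment by renaming <segment>.log.tmp to <segment>.log, and if the .tmp file has disappeared in the meantime, that rename throws java.nio.file.NoSuchFileException. The sketch below is not the Logstash implementation, just a self-contained demonstration of the same java.nio call that appears in the stack trace later in this report (the paths and the ATOMIC_MOVE option are my assumptions):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SealSegmentSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for /usr/share/logstash/data/dead_letter_queue/main
        Path dlqDir = Path.of("/tmp/dlq-demo/main");
        Files.createDirectories(dlqDir);

        Path tmpSegment = dlqDir.resolve("10030.log.tmp"); // segment still being written
        Path sealed     = dlqDir.resolve("10030.log");     // name the segment gets once finalized

        // The .tmp file is deliberately NOT created, mimicking a segment that was
        // removed out from under the writer. The rename then fails with
        // java.nio.file.NoSuchFileException: .../10030.log.tmp -> .../10030.log,
        // which is the same shape as the WARN lines in the logs below.
        Files.move(tmpSegment, sealed, StandardCopyOption.ATOMIC_MOVE);
    }
}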

Steps to reproduce:

Environment variables:

    - name: LOG_LEVEL
      value: info
    - name: LS_JAVA_OPTS
      value: -Xmx1g -Xms1g -Dnetworkaddress.cache.ttl=0 -Dlog4j2.formatMsgNoLookups=true
    - name: QUEUE_TYPE
      value: persisted
    - name: PATH_QUEUE
      value: /logstash-data/persisted_queue
    - name: PIPELINE_BATCH_SIZE
      value: "500"
    - name: XPACK_MONITORING_ENABLED
      value: "false"
    - name: DEAD_LETTER_QUEUE_ENABLE
      value: "true"
    - name: PATH_DEAD_LETTER_QUEUE
      value: /usr/share/logstash/data/dead_letter_queue
    - name: DEAD_LETTER_QUEUE_MAX_BYTES
      value: "209715200"
input {
  kinesis {
    kinesis_stream_name => "kinesis-stream"
    initial_position_in_stream => "LATEST"
    application_name => "application_name"
    region => "eu-west-2"
    codec => json { ecs_compatibility => "disabled" }
    additional_settings => {"initial_lease_table_read_capacity" => 10 "initial_lease_table_write_capacity" => 50}
  }
}

input {
  dead_letter_queue {
    id => "kubernetes_dlq"
    path => "/usr/share/logstash/data/dead_letter_queue"
    sincedb_path => "/usr/share/logstash/data/sincedb_dlq"
    commit_offsets => true
    clean_consumed => true
    tags => ["dlq"]
  }
}

output {
    opensearch {
      id => "kubernetes_es"
      index => "%{[@metadata][index_field]}"
      hosts => [ "https://endpoint:443" ]
      manage_template => false
      ssl => true
      timeout => 200
      retry_initial_interval => 100
      retry_max_interval => 900
      user => "${LS_ELASTICSEARCH_USER}"
      password => "${LS_ELASTICSEARCH_PASSWORD}"
      validate_after_inactivity => 60
    }
}

I have redacted the filters and other pipeline details.

Provide logs (if relevant):
The following log lines are repeated several times a second:

[2023-06-01T14:35:33,894][WARN ][org.logstash.common.io.DeadLetterQueueWriter] unable to finalize segment
java.nio.file.NoSuchFileException: /usr/share/logstash/data/dead_letter_queue/main/10030.log.tmp -> /usr/share/logstash/data/dead_letter_queue/main/10030.log
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
        at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:266) ~[?:?]
        at java.nio.file.Files.move(Files.java:1432) ~[?:?]
        at org.logstash.common.io.DeadLetterQueueWriter.sealSegment(DeadLetterQueueWriter.java:500) ~[logstash-core.jar:?]
        at org.logstash.common.io.DeadLetterQueueWriter.finalizeSegment(DeadLetterQueueWriter.java:486) ~[logstash-core.jar:?]
        at org.logstash.common.io.DeadLetterQueueWriter.flushCheck(DeadLetterQueueWriter.java:462) ~[logstash-core.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]
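The ScheduledThreadPoolExecutor / FutureTask.runAndReset frames suggest the flush check runs as a recurring scheduled task, which would be consistent with the warning repeating several times a second: each tick retries the same rename, fails again, logs the WARN, and the next tick does the same. A stripped-down sketch of that pattern (again, not the Logstash code; the interval and names are made up for illustration):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FlushLoopSketch {
    public static void main(String[] args) {
        Path tmp    = Path.of("/tmp/dlq-demo/main/10030.log.tmp");
        Path sealed = Path.of("/tmp/dlq-demo/main/10030.log");

        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
        // Recurring flush check: every tick attempts to seal the segment. When the
        // .tmp file is missing, the attempt throws, the failure is only logged, and
        // the task is scheduled again -- so the same warning repeats indefinitely.
        flusher.scheduleAtFixedRate(() -> {
            try {
                Files.move(tmp, sealed, StandardCopyOption.ATOMIC_MOVE);
            } catch (Exception e) {
                System.err.println("unable to finalize segment: " + e);
            }
        }, 0, 500, TimeUnit.MILLISECONDS);
    }
}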
