
[BUG][CSV Output] Logstash CSV output not flushing to disk #14869

@lduvnjak

Description


Logstash information:

Please include the following information:

  • Logstash version (e.g. bin/logstash --version)
    logstash 8.5.0

  • Logstash installation source (e.g. built from source, with a package manager: DEB/RPM, expanded from tar or zip archive, docker)
    yum install logstash

  • How is Logstash being run (e.g. as a service/service manager: systemd, upstart, etc. Via command line, docker/kubernetes)
    CLI : nohup /usr/share/logstash/bin/logstash -f /root/logstash/exporter/exporter.conf > f.out 2> f.err < /dev/null &

Plugins installed:

logstash-codec-avro (3.4.0)
logstash-codec-cef (6.2.5)
logstash-codec-collectd (3.1.0)
logstash-codec-dots (3.0.6)
logstash-codec-edn (3.1.0)
logstash-codec-edn_lines (3.1.0)
logstash-codec-es_bulk (3.1.0)
logstash-codec-fluent (3.4.1)
logstash-codec-graphite (3.0.6)
logstash-codec-json (3.1.0)
logstash-codec-json_lines (3.1.0)
logstash-codec-line (3.1.1)
logstash-codec-msgpack (3.1.0)
logstash-codec-multiline (3.1.1)
logstash-codec-netflow (4.2.2)
logstash-codec-plain (3.1.0)
logstash-codec-rubydebug (3.1.0)
logstash-filter-aggregate (2.10.0)
logstash-filter-anonymize (3.0.6)
logstash-filter-cidr (3.1.3)
logstash-filter-clone (4.2.0)
logstash-filter-csv (3.1.1)
logstash-filter-date (3.1.15)
logstash-filter-de_dot (1.0.4)
logstash-filter-dissect (1.2.5)
logstash-filter-dns (3.1.5)
logstash-filter-drop (3.0.5)
logstash-filter-elasticsearch (3.12.0)
logstash-filter-fingerprint (3.4.1)
logstash-filter-geoip (7.2.12)
logstash-filter-grok (4.4.2)
logstash-filter-http (1.4.1)
logstash-filter-json (3.2.0)
logstash-filter-kv (4.7.0)
logstash-filter-memcached (1.1.0)
logstash-filter-metrics (4.0.7)
logstash-filter-mutate (3.5.6)
logstash-filter-prune (3.0.4)
logstash-filter-ruby (3.1.8)
logstash-filter-sleep (3.0.7)
logstash-filter-split (3.1.8)
logstash-filter-syslog_pri (3.1.1)
logstash-filter-throttle (4.0.4)
logstash-filter-translate (3.4.0)
logstash-filter-truncate (1.0.5)
logstash-filter-urldecode (3.0.6)
logstash-filter-useragent (3.3.3)
logstash-filter-uuid (3.0.5)
logstash-filter-xml (4.2.0)
logstash-input-azure_event_hubs (1.4.4)
logstash-input-beats (6.4.1)
└── logstash-input-elastic_agent (alias)
logstash-input-couchdb_changes (3.1.6)
logstash-input-dead_letter_queue (2.0.0)
logstash-input-elasticsearch (4.16.0)
logstash-input-exec (3.6.0)
logstash-input-file (4.4.4)
logstash-input-ganglia (3.1.4)
logstash-input-gelf (3.3.2)
logstash-input-generator (3.1.0)
logstash-input-graphite (3.0.6)
logstash-input-heartbeat (3.1.1)
logstash-input-http (3.6.0)
logstash-input-http_poller (5.4.0)
logstash-input-imap (3.2.0)
logstash-input-jms (3.2.2)
logstash-input-pipe (3.1.0)
logstash-input-redis (3.7.0)
logstash-input-snmp (1.3.1)
logstash-input-snmptrap (3.1.0)
logstash-input-stdin (3.4.0)
logstash-input-syslog (3.6.0)
logstash-input-tcp (6.3.0)
logstash-input-twitter (4.1.0)
logstash-input-udp (3.5.0)
logstash-input-unix (3.1.1)
logstash-integration-aws (7.0.0)
 ├── logstash-codec-cloudfront
 ├── logstash-codec-cloudtrail
 ├── logstash-input-cloudwatch
 ├── logstash-input-s3
 ├── logstash-input-sqs
 ├── logstash-output-cloudwatch
 ├── logstash-output-s3
 ├── logstash-output-sns
 └── logstash-output-sqs
logstash-integration-elastic_enterprise_search (2.2.1)
 ├── logstash-output-elastic_app_search
 └── logstash-output-elastic_workplace_search
logstash-integration-jdbc (5.3.0)
 ├── logstash-input-jdbc
 ├── logstash-filter-jdbc_streaming
 └── logstash-filter-jdbc_static
logstash-integration-kafka (10.12.0)
 ├── logstash-input-kafka
 └── logstash-output-kafka
logstash-integration-rabbitmq (7.3.0)
 ├── logstash-input-rabbitmq
 └── logstash-output-rabbitmq
logstash-output-csv (3.0.8)
logstash-output-elasticsearch (11.9.0)
logstash-output-email (4.1.1)
logstash-output-file (4.3.0)
logstash-output-graphite (3.1.6)
logstash-output-http (5.5.0)
logstash-output-lumberjack (3.1.9)
logstash-output-nagios (3.0.6)
logstash-output-null (3.0.5)
logstash-output-pipe (3.0.6)
logstash-output-redis (5.0.0)
logstash-output-stdout (3.1.4)
logstash-output-tcp (6.1.1)
logstash-output-udp (3.2.0)
logstash-output-webhdfs (3.0.6)
logstash-patterns-core (4.3.4)

OS Version:
Linux removed 4.18.0-425.3.1.el8.x86_64 #1 SMP Tue Nov 8 14:08:25 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

  • Expected
    Logstash should flush to disk according to the value of flush_interval
  • Actual
    Logstash never flushes to disk, and eventually runs out of memory

Steps to reproduce:

Please include a minimal but complete recreation of the problem,
including (e.g.) pipeline definition(s), settings, locale, etc. The easier
you make it for us to reproduce it, the more likely it is that somebody will
take the time to look at it.

  1. Logstash Elasticsearch input for any index
  2. Logstash CSV output with any fields
  3. Logstash crash due to java.lang.OutOfMemoryError: Java heap space
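
A minimal pipeline along these lines reproduces it (hosts, index, path, and fields below are placeholders, not the actual exporter.conf, which is linked at the end of this report):

```
input {
  elasticsearch {
    hosts  => ["http://localhost:9200"]  # placeholder host
    index  => "my-index-*"               # any index reproduces it
    slices => 4
  }
}

output {
  csv {
    path           => "/tmp/export.csv"          # placeholder path
    fields         => ["@timestamp", "message"]  # any fields
    flush_interval => 2                          # expected: flush every 2s; on 8.5+ nothing is ever written
  }
}
```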

Current Result:
Logstash crashes, and all the messages processed are gone.

Expected result:
Messages are processed and flushed to disk periodically

Provide logs (if relevant):

  • f.out (before 8.5)
Using bundled JDK: /usr/share/logstash/jdk
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2023-02-03 09:49:40.180 [main] runner - NOTICE: Running Logstash as superuser is not recommended and won't be allowed in the future. Set 'allow_superuser' to 'false' to avoid startup errors in future releases.
[INFO ] 2023-02-03 09:49:40.189 [main] runner - Starting Logstash {"logstash.version"=>"8.4.0", "jruby.version"=>"jruby 9.3.6.0 (2.6.8) 2022-06-27 7a2cbcd376 OpenJDK 64-Bit Server VM 17.0.4+8 on 17.0.4+8 +indy +jit [x86_64-linux]"}
[INFO ] 2023-02-03 09:49:40.191 [main] runner - JVM bootstrap flags: [-Xms4g, -Xmx4g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[WARN ] 2023-02-03 09:49:40.369 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2023-02-03 09:49:41.000 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2023-02-03 09:49:41.496 [Converge PipelineAction::Create<main>] Reflections - Reflections took 63 ms to scan 1 urls, producing 125 keys and 434 values
[INFO ] 2023-02-03 09:49:41.765 [Converge PipelineAction::Create<main>] javapipeline - Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[INFO ] 2023-02-03 09:49:41.839 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/root/logstash/exporter/exporter.conf"], :thread=>"#<Thread:0x77495840 run>"}
[INFO ] 2023-02-03 09:49:42.192 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.35}
[INFO ] 2023-02-03 09:49:42.688 [[main]-pipeline-manager] elasticsearch - ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[INFO ] 2023-02-03 09:49:42.691 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2023-02-03 09:49:42.717 [[main]|input|elasticsearch|slice_0] elasticsearch - Slice starting {:slice_id=>0, :slices=>4}
[INFO ] 2023-02-03 09:49:42.722 [[main]|input|elasticsearch|slice_2] elasticsearch - Slice starting {:slice_id=>2, :slices=>4}
[INFO ] 2023-02-03 09:49:42.729 [[main]|input|elasticsearch|slice_3] elasticsearch - Slice starting {:slice_id=>3, :slices=>4}
[INFO ] 2023-02-03 09:49:42.730 [[main]|input|elasticsearch|slice_1] elasticsearch - Slice starting {:slice_id=>1, :slices=>4}
[INFO ] 2023-02-03 09:49:42.772 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2023-02-03 09:49:45.489 [[main]>worker0] csv - Opening file {:path=>"/root/logstash/exporter/*removed*.csv"}
  • f.out (after 8.5)
Using bundled JDK: /usr/share/logstash/jdk
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2023-02-03 09:52:34.364 [main] runner - NOTICE: Running Logstash as superuser is not recommended and won't be allowed in the future. Set 'allow_superuser' to 'false' to avoid startup errors in future releases.
[INFO ] 2023-02-03 09:52:34.373 [main] runner - Starting Logstash {"logstash.version"=>"8.5.0", "jruby.version"=>"jruby 9.3.8.0 (2.6.8) 2022-09-13 98d69c9461 OpenJDK 64-Bit Server VM 17.0.4+8 on 17.0.4+8 +indy +jit [x86_64-linux]"}
[INFO ] 2023-02-03 09:52:34.375 [main] runner - JVM bootstrap flags: [-Xms4g, -Xmx4g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[WARN ] 2023-02-03 09:52:34.548 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2023-02-03 09:52:35.179 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2023-02-03 09:52:35.651 [Converge PipelineAction::Create<main>] Reflections - Reflections took 59 ms to scan 1 urls, producing 125 keys and 438 values
[INFO ] 2023-02-03 09:52:36.200 [Converge PipelineAction::Create<main>] javapipeline - Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[INFO ] 2023-02-03 09:52:36.277 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/root/logstash/exporter/exporter.conf"], :thread=>"#<Thread:0x46ac922e run>"}
[INFO ] 2023-02-03 09:52:36.624 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.35}
[INFO ] 2023-02-03 09:52:37.163 [[main]-pipeline-manager] elasticsearch - ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[INFO ] 2023-02-03 09:52:37.185 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2023-02-03 09:52:37.232 [[main]|input|elasticsearch|slice_1] elasticsearch - Slice starting {:slice_id=>1, :slices=>4}
[INFO ] 2023-02-03 09:52:37.235 [[main]|input|elasticsearch|slice_0] elasticsearch - Slice starting {:slice_id=>0, :slices=>4}
[INFO ] 2023-02-03 09:52:37.261 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2023-02-03 09:52:37.270 [[main]|input|elasticsearch|slice_2] elasticsearch - Slice starting {:slice_id=>2, :slices=>4}
[INFO ] 2023-02-03 09:52:37.279 [[main]|input|elasticsearch|slice_3] elasticsearch - Slice starting {:slice_id=>3, :slices=>4}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid12882.hprof ...
Heap dump file created [6021510285 bytes in 52.002 secs]
[FATAL] 2023-02-03 09:55:35.506 [Agent thread] Logstash - uncaught error (in thread Agent thread)
java.lang.OutOfMemoryError: Java heap space

Additional info:
This issue happens on 8.6 as well as 8.5. Every version from 7.17 up to (but not including) 8.5 works without issues.
You can find more info on exporter.conf and everything else here.
