
I need a final recipe from [warn]: dump an error event: error_class=Fluent::Plugin::ConcatFilter::TimeoutError error="Timeout flush: kubernetes.var.log #83

Open
DmitriyProkhorov opened this issue Nov 25, 2019 · 4 comments


DmitriyProkhorov commented Nov 25, 2019

#### Problem
I am setting up an additional fluentd filter that uses the concat plugin. After adding the new filter, I got a lot of errors: concat cannot process many of the messages, and I have begun to lose logs.

```
2019-11-25 17:25:58 +0000 [warn]: dump an error event: error_class=Fluent::Plugin::ConcatFilter::TimeoutError error="Timeout flush: kubernetes.var.log.containers.core-deployment-prod-8459fd75c7-x4vq2_core-prod_core-prod-279427d134fe033554565456345354564895667830d6.log:" location=nil tag="kubernetes.var.log.containers.core-deployment-prod-8459fd75c7-x4vq2_core-prod_core-prod-279427d134fe033554565456345354564895667830d6.log" time=2019-11-25 17:25:58.009520360 +0000 record={"log"=>"2019-11-25 17:25:47 [WRN] QuestionSalePointService: BatchCreateOrUpdateAsync: finish Memory usage:379.089324951172 <s:>\n", "stream"=>"stdout"}
```

I found workarounds online, but they do not help me:
#37
fluent/fluentd#2587
#4
https://stackoverflow.com/questions/37159521/flush-timeouterror-in-fluentd

#### Steps to replicate

The part of the config that is responsible for this filter is:

```
<filter kubernetes.var.log.containers.core-deployment-**>
  @type concat
  key log
  stream_identity_key tag
  multiline_start_regexp /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<level>[^\]\\]+)\] (?<message>.*)/
  flush_interval 10s
</filter>
```

#### Expected Behavior

After adding the additional filter to the original fluentd config (https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml), I start to lose logs with the error above.

#### Your environment

  • K8S 1.13.5
  • OS version: I use the quay.io/fluentd_elasticsearch/fluentd:v2.7.0 docker image (an EFK solution for K8S)
  • result of fluentd --version: fluentd 1.6.3
  • plugin versions (result of fluent-gem list):

*** LOCAL GEMS ***

activesupport (5.2.3)
addressable (2.6.0)
bigdecimal (1.2.8)
concurrent-ruby (1.1.5)
cool.io (1.5.4)
did_you_mean (1.0.0)
dig_rb (1.0.1)
domain_name (0.5.20190701)
elasticsearch (7.3.0)
elasticsearch-api (7.3.0)
elasticsearch-transport (7.3.0)
excon (0.65.0)
faraday (0.15.4)
ffi (1.11.1)
fluent-plugin-concat (2.4.0)
fluent-plugin-detect-exceptions (0.0.12)
fluent-plugin-elasticsearch (3.5.4)
fluent-plugin-kubernetes_metadata_filter (2.2.0)
fluent-plugin-multi-format-parser (1.0.0)
fluent-plugin-prometheus (1.4.0)
fluent-plugin-systemd (1.0.2)
fluentd (1.6.3)
http (0.9.8)
http-cookie (1.0.3)
http-form_data (1.0.3)
http_parser.rb (0.6.0)
i18n (1.6.0)
io-console (0.4.5)
json (1.8.3)
kubeclient (1.1.4)
lru_redux (1.1.0)
mime-types (3.2.2)
mime-types-data (3.2019.0331)
minitest (5.11.3, 5.9.0)
msgpack (1.3.0)
multi_json (1.13.1)
multipart-post (2.1.1)
net-telnet (0.1.1)
netrc (0.11.0)
oj (3.8.1)
power_assert (0.2.7)
prometheus-client (0.9.0)
psych (2.1.0)
public_suffix (3.1.1)
quantile (0.2.1)
rake (10.5.0)
rdoc (4.2.1)
recursive-open-struct (1.0.0)
rest-client (2.0.2)
serverengine (2.1.1)
sigdump (0.2.4)
strptime (0.2.3)
systemd-journal (1.3.3)
test-unit (3.1.7)
thread_safe (0.3.6)
tzinfo (1.2.5)
tzinfo-data (1.2019.2)
unf (0.1.4)
unf_ext (0.0.7.6)
yajl-ruby (1.4.1)

@letmepew

I am also stuck on the same issue. The multiline log parser is not working in K8S.

okkez (Member) commented Jan 15, 2020

This is because you use only multiline_start_regexp. With only multiline_start_regexp, the plugin keeps buffering lines until it sees the next line matching multiline_start_regexp, so the last buffered event can only be flushed by the timeout.
If you know the structure of your multiline logs completely, you can use multiline_end_regexp or continuous_line_regexp so the plugin knows exactly where each event ends.
Otherwise, you can use the timeout_label configuration to handle Fluent::Plugin::ConcatFilter::TimeoutError and still emit the flushed events.
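
For reference, a minimal sketch of the timeout_label recipe applied to the filter from this issue. The @NORMAL label name and the stdout output are placeholders (not from this thread); point the label at your real output, e.g. elasticsearch:

```
<filter kubernetes.var.log.containers.core-deployment-**>
  @type concat
  key log
  stream_identity_key tag
  multiline_start_regexp /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<level>[^\]\\]+)\] (?<message>.*)/
  flush_interval 10s
  # On timeout, emit the buffered event through this label instead of
  # raising Fluent::Plugin::ConcatFilter::TimeoutError.
  timeout_label @NORMAL
</filter>

<match kubernetes.var.log.containers.core-deployment-**>
  # Route normally flushed events into the same label as timeout-flushed ones.
  @type relabel
  @label @NORMAL
</match>

<label @NORMAL>
  <match **>
    # Placeholder output: both normal and timeout-flushed events arrive here.
    @type stdout
  </match>
</label>
```

With this routing, the last lines of a burst are still emitted after flush_interval instead of being dropped with a "dump an error event" warning.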


Prakashreddy134 commented Mar 18, 2020

Hi @okkez,
I am using multiline_end_regexp and I still see the error.

```
<filter tail.containers.var.log.containers.test.log>
  @type concat
  key log
  timeout_label @splunk
  stream_identity_key stream
  multiline_start_regexp /^\d{4}-\d{2}-\d{2}/
  multiline_end_regexp /\n$/
  flush_interval 5s
  separator ""
  use_first_timestamp true
</filter>
```

How can we fix this issue? I'm using Splunk Connect for Kubernetes.

@halr9000

@Prakashreddy134 If you haven't already, I suggest logging an issue over on https://github.com/splunk/splunk-connect-for-kubernetes
