Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fluentd is not able to start inside EKS pod #583

Closed
OleksandrMykytenko opened this issue Jul 6, 2021 · 16 comments · Fixed by #585 or fluent/fluentd#3451
Closed

fluentd is not able to start inside EKS pod #583

OleksandrMykytenko opened this issue Jul 6, 2021 · 16 comments · Fixed by #585 or fluent/fluentd#3451

Comments

@OleksandrMykytenko
Copy link

OleksandrMykytenko commented Jul 6, 2021

Today's morning couple of our EKS workers have been updated to the latest EKS Worker AMI and suddenly fluentd pods failed to start on those new nodes with the following error:

/fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:126:in `parse_element': private method `warn' called for nil:NilClass (NoMethodError)
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:95:in `parse_element'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:95:in `parse_element'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:95:in `parse_element'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:168:in `block in eval_include'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:162:in `each'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:162:in `eval_include'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:145:in `parse_include'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:104:in `parse_element'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:43:in `parse!'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:33:in `parse'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config.rb:58:in `parse'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config.rb:39:in `build'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/supervisor.rb:613:in `initialize'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/command/fluentd.rb:342:in `new'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/command/fluentd.rb:342:in `<top (required)>'
        from /usr/local/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
        from /usr/local/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
        from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/bin/fluentd:15:in `<top (required)>'
        from /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd:23:in `load'
        from /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd:23:in `<main>'

Image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch

UPD: it's not related to new EKS AMI, it came from new version on image... Let me try to pin image tag to some older versions

UPD 2: it works as expected on previous image version (fluent/fluentd-kubernetes-daemonset:v1.12-debian-elasticsearch7-1)

ashie added a commit that referenced this issue Jul 7, 2021
Follow up for #534
Fix #583

Signed-off-by: Takuro Ashie <[email protected]>
ashie added a commit that referenced this issue Jul 7, 2021
Follow up for #534
Fix #583

Signed-off-by: Takuro Ashie <[email protected]>
@ashie ashie closed this as completed in #585 Jul 7, 2021
@ashie ashie reopened this Jul 7, 2021
@ashie
Copy link
Member

ashie commented Jul 7, 2021

Sorry, I linked to this wrongly from #585.

@OleksandrMykytenko
Copy link
Author

yeah, just checked latest tag v1-debian-elasticsearch and it keeps failing...

@kenhys
Copy link
Contributor

kenhys commented Jul 8, 2021

hmm, does it occur even though the default daemonset image? (I mean apply -f without additional configMap )
Does kubectl logs (pod name) show something weird before above error?

@OleksandrMykytenko
Copy link
Author

hello, checked w/o ConfigMap and it was able to proceed with startup but failed on some K8S API calls which is expected as pod does not have any SA assigned.

regarding the second question - it doesn't, i can get only the error I pasted into issue

@ashie
Copy link
Member

ashie commented Jul 8, 2021

https://github.com/fluent/fluentd/blob/c62dc312eedb08bcd124c58042125e98385e1d6e/lib/fluent/config/v1_parser.rb#L121-L127

121                if k.start_with?('@')
122                  if root_element || ELEM_SYMBOLS.include?(elem_name)
123                    parse_error! "'@' is the system reserved prefix. Don't use '@' prefix parameter in the configuration: #{k}"
124                  else
125                    # TODO: This is for backward compatibility. It will throw an error in the future.                                     
126                    $log.warn "'@' is the system reserved prefix. It works in the nested configuration for now but it will be rejected: #{k}"
127                  end
128                end

Hmm, there are 2 problems.

  • Probably you have an invalid config parameter which starts with @. Please find it and remove @.
  • $log shouldn't be nil. It's a bug of fluentd as of 1.13.0. I confirmed that it doesn't reproduce with 1.12.4

I used the following config to confirm it:

<source>
  @type tail
  path /home/aho/Projects/Fluentd/fluentd/issues/3269/*.log
  tag foo.bar
  <parse>
    @type json
    @time_key nil  # <<<<==== This line shouldn't include "@" =====>>>>>>
  </parse>
  @label @OUTPUT
</source>

<label @OUTPUT>
  <match **>
    @type stdout
  </match>
</label>

@ashie
Copy link
Member

ashie commented Jul 8, 2021

* `$log` shouldn't be `nil`. It's a bug of fluentd as of 1.13.0. I confirmed that it doesn't reproduce with 1.12.4

It's caused by this commit: fluent/fluentd@b89c9f6

ashie added a commit to fluent/fluentd that referenced this issue Jul 8, 2021
By #3352, an early stage config parser is introduced to enable setting
log rotation by system config. While this stage, the global logger
doesn't exist yet, so that it causes crash when the config parser try to
output log. This commit avoid the crash by checking $log. Same logs will
be output at the later stage so that it's not needed in the early stage.

See also: fluent/fluentd-kubernetes-daemonset#583

Signed-off-by: Takuro Ashie <[email protected]>
@ashie
Copy link
Member

ashie commented Jul 8, 2021

It will be fixed by fluent/fluentd#3451

@OleksandrMykytenko
Copy link
Author

OleksandrMykytenko commented Jul 8, 2021

I reviewed config and found that I have only one suspicious statement:

time_key @timestamp

Full block:

<filter apps.** app-1.**>
        @type parser
        key_name log
        reserve_data true
        <parse>
          @type json
          time_key @timestamp
          keep_time_key true
        </parse>
      </filter>

I cannot change it right now as it's based on application logs which provide @timestamp field in each log message.

@ashie
Copy link
Member

ashie commented Jul 8, 2021

I reviewed config and found that I have only one suspicious statement:

time_key @timestamp

No problem @ in config value, it's not a cause of this issue.
Probably something other with prefixed @ exists in your config.

@ashie
Copy link
Member

ashie commented Jul 9, 2021

Fixed in fluentd but gem & fluentd-kubernetes-daemonset aren't released yet.

@OleksandrMykytenko
Copy link
Author

Found one more suspicious thing:

<filter app.**>
        @type record_transformer
        <record>
            logger_name ${record["loggerName"]}
            thread_name ${record["threadName"]}
            @timestamp ${record["timestamp"]}
        </record>
        remove_keys loggerName,threadName,timestamp
</filter>

All other fields seem to be valid config parameters such as type, log_level, id and so on.

@ashie
Copy link
Member

ashie commented Jul 9, 2021

Ah, I got it. It's not also replaceable for you, isn't it?
It's seems a fault of Fluentd's design, we need to add an attribute to ignore @ validation...

@OleksandrMykytenko
Copy link
Author

yeah, it comes from app logging and we need to use this transformation... anyway we're good to go with version 1.12 and wait for fixes :) thank you for investigation and response :)

@dawany
Copy link

dawany commented Jul 11, 2021

I have the same issue with fluent/fluentd-kubernetes-daemonset:v1.13.1-debian-elasticsearch7-3.0

$ kubectl logs -f fluentd-5kq6s -n fluentd-dylan 
/fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:102:in `parse_element': private method `warn' called for nil:NilClass (NoMethodError)
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:43:in `parse!'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:33:in `parse'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config.rb:58:in `parse'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config.rb:39:in `build'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/supervisor.rb:613:in `initialize'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/command/fluentd.rb:342:in `new'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/command/fluentd.rb:342:in `<top (required)>'
	from /usr/local/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
	from /usr/local/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
	from /fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/bin/fluentd:15:in `<top (required)>'
	from /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd:23:in `load'
	from /fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd:23:in `<main>'

here is my config

pods-fluent.conf: |-
    <source>
      @type tail
      read_from_head true
      tag kubernetes.*
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      exclude_path ["/var/log/containers/fluent*"]
      <parse>
        @type kubernetes
        @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

@ashie
Copy link
Member

ashie commented Jul 12, 2021

/fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.13.1/lib/fluent/config/v1_parser.rb:102:in parse_element': private method warn' called for nil:NilClass (NoMethodError)

It's caused by using include instead of @include, it will be also fixed in 1.13.2 (not released yet):

https://github.com/fluent/fluentd/blob/da162e15bc95b1263f92eb0622c714f8b9d303b9/lib/fluent/config/v1_parser.rb#L101-L103

            if !prev_match.start_with?('@')
              $log.warn "'include' is deprecated. Use '@include' instead"
            end

@ashie
Copy link
Member

ashie commented Jul 16, 2021

v1.13.2 images have been released.

@ashie ashie closed this as completed Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants