Is your feature request related to a problem? Please describe.
When in_syslog (with the default syslog parser) is unable to parse a line, the line is effectively thrown away (example at the end of this ticket).
It does generate an internal warning, of the form:
[warn]: #0 failed to parse message data="...."
but that does not go through the same processing path as the syslog messages themselves, and is not itself in a form which could be easily (re-)parsed.
I would like to be confident that all syslog messages are retained, even ones which are of an unexpected format.
Describe the solution you'd like
in_tail has an emit_unmatched_lines option, and it would be consistent to add something like this for in_syslog / in_udp (and possibly all other input modules?).
It might be nice to be able to set the key they are emitted under: e.g.
emit_unmatched_lines bad_message
And/or add an extra key such as "unparsed": true.
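To make the request concrete, here is a minimal Python sketch of the intended semantics. It is not Fluentd code: the regular expression is a simplified stand-in for the syslog parser, and `parse_or_passthrough`, `unmatched_key` and the `unparsed` flag are hypothetical names used only for illustration.

```python
import re

# Simplified stand-in for an RFC 3164 style parser (an assumption, not Fluentd's regex).
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<time>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)$",
    re.DOTALL,
)


def parse_or_passthrough(raw, unmatched_key="bad_message"):
    """Return the parsed fields, or the raw line under `unmatched_key` if parsing fails."""
    match = SYSLOG_RE.match(raw)
    if match:
        return match.groupdict()
    # Hypothetical behaviour: keep the line and flag it instead of dropping it.
    return {unmatched_key: raw, "unparsed": True}


good = parse_or_passthrough("<30>Jul 14 16:41:28 fluentd sshd[123]: session opened")
bad = parse_or_passthrough("this is not syslog")
```

With this shape, a downstream filter could route on the presence of the `unparsed` key rather than relying on the internal warning log.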
Describe alternatives you've considered
As a workaround, I can use in_udp with <parse>@type none</parse>. That gives me the raw syslog reliably, but of course completely unparsed (e.g. it includes the <pri> header).
In theory I could combine this with a parser filter and emit_invalid_record_to_error. However that's very difficult to set up (*), and in any case <label @ERROR> catches other errors (e.g. buffer full) that I don't want to catch.
If I could change the label used when parsing fails, e.g. <label @UNPARSED>, that could be a workable solution.
Another solution might be to allow a chain of parsers, with parsing stopping on the first match. You could end with parser none, or with parser regexp and expression (.*), as a catch-all. This is not currently supported - you get error "duplicated parsers configured" if you put more than one <parse> section in an input module.
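The "first match wins" chain could behave roughly like the following Python sketch. This only illustrates the proposed semantics, since Fluentd does not support multiple parsers in one input: `parse_with_chain`, `regexp_parser`, `none_parser` and the `STRICT` pattern are all hypothetical, and the pattern is a simplified stand-in for the real syslog parser.

```python
import re

# Simplified strict syslog pattern (an assumption for illustration only).
STRICT = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<time>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<message>.*)$",
    re.DOTALL,
)


def regexp_parser(pattern):
    """Build a parser that returns the named groups, or None when the pattern fails."""
    def parse(raw):
        match = pattern.match(raw)
        return match.groupdict() if match else None
    return parse


def none_parser(raw):
    """Catch-all: always succeeds and keeps the whole line under 'message'."""
    return {"message": raw}


def parse_with_chain(raw, parsers):
    """Try each (name, parser) pair in order and stop at the first one that matches."""
    for name, parser in parsers:
        record = parser(raw)
        if record is not None:
            return name, record
    raise ValueError("no parser matched")


CHAIN = [("syslog", regexp_parser(STRICT)), ("none", none_parser)]

matched = parse_with_chain("<30>Jul 14 16:41:28 fluentd hello", CHAIN)
fallback = parse_with_chain("<14> Jul 14 16:01:51 host x", CHAIN)
```

Returning the name of the parser that matched gives the downstream pipeline a way to tell fully parsed records from ones that fell through to the catch-all.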
(*) It's not obvious how to strip/relabel without getting into infinite loops. For example, the following gives an infinite loop:
This gives the following infinite loop of errors, which I can't see how to avoid:
...
Jul 14 16:41:28 fluentd fluentd[14652]: 2019-07-14 16:41:28 +0000 [warn]: #0 send an error event to @ERROR: error_class=Fluent::Plugin::Parser::ParserError error="parse failed invalid time format: value = <30>Jul 14 16:41:28, error_class = ArgumentError, error = string doesn't match" location="/var/lib/gems/2.3.0/gems/fluentd-1.6.2/lib/fluent/plugin/filter_parser.rb:110:in `rescue in filter_with_time'" tag="syslog" time=2019-07-14 16:41:28.366510476 +0000
Jul 14 16:41:28 fluentd fluentd[14652]: 2019-07-14 16:41:28 +0000 [warn]: #0 send an error event to @ERROR: error_class=Fluent::Plugin::Parser::ParserError error="parse failed invalid time format: value = <30>Jul 14 16:41:28, error_class = ArgumentError, error = string doesn't match" location="/var/lib/gems/2.3.0/gems/fluentd-1.6.2/lib/fluent/plugin/filter_parser.rb:110:in `rescue in filter_with_time'" tag="syslog" time=2019-07-14 16:41:28.368301612 +0000
Jul 14 16:41:28 fluentd fluentd[14652]: 2019-07-14 16:41:28 +0000 [warn]: #0 send an error event to @ERROR: error_class=Fluent::Plugin::Parser::ParserError error="parse failed invalid time format: value = <30>Jul 14 16:41:28, error_class = ArgumentError, error = string doesn't match" location="/var/lib/gems/2.3.0/gems/fluentd-1.6.2/lib/fluent/plugin/filter_parser.rb:110:in `rescue in filter_with_time'" tag="syslog" time=2019-07-14 16:41:28.370026582 +0000
...
Additional context
Here is a real raw syslog generated by a Netgear GS724Tv4 switch, captured by in_udp with parser none:
2019-07-14T16:01:51+00:00 syslog {"message":"<14> Jul 14 16:01:51 10.12.255.2-1 CLI_WEB[53145316]: login_sessions.c(172) 117489 %% Telnet Session 0 ended for user admin connected from 10.12.255.248\n\u0000","source_address":"10.12.255.2"}
Notice how there is a spurious space after the <pri> prefix, and a spurious \u0000 at the end. I'll agree that this device is broken, but broken things exist in the real world.
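For illustration, the two defects could be removed before parsing with something like the Python sketch below. It is one possible pre-processing step under stated assumptions, not something Fluentd does: `normalize` and `split_pri` are hypothetical helpers, and `RAW` is the message shown above.

```python
import re

# The raw datagram from the Netgear switch, as captured above.
RAW = (
    "<14> Jul 14 16:01:51 10.12.255.2-1 CLI_WEB[53145316]: "
    "login_sessions.c(172) 117489 %% Telnet Session 0 ended for user admin "
    "connected from 10.12.255.248\n\x00"
)

# Matches a <pri> header followed by stray whitespace.
PRI_WITH_SPACE = re.compile(r"^(<\d{1,3}>)\s+")


def normalize(raw):
    """Strip trailing NUL/newline characters and whitespace after the <pri> header."""
    cleaned = raw.rstrip("\x00\r\n")
    return PRI_WITH_SPACE.sub(r"\1", cleaned)


def split_pri(pri):
    """Split a syslog PRI value into (facility, severity); PRI = facility * 8 + severity."""
    return divmod(pri, 8)


cleaned = normalize(RAW)
facility, severity = split_pri(14)  # facility 1 (user-level), severity 6 (informational)
```

This kind of clean-up only covers the defects of this one device, which is why a generic way to keep unparsed messages would still be preferable.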
If I try to receive this using in_syslog with the default syslog parser, I get the following backtrace barf in local syslog (for every message!)
If I send something which is more obviously not in standard syslog format, e.g.
then I just get a single warning line logged:
In both cases, I would like the broken syslog message to propagate through the stack, albeit with some way to identify it as non-parsed.
Another example is Cisco's weird syslog format