Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement new parser for parser_syslog #2599
Implement new parser for parser_syslog #2599
Changes from 2 commits
68f3ec7
f8af924
302852e
94887d8
4154272
6fc5fb4
2bc4294
60c838f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pri
can be0
ifi < 2
ortext.slice(1, i - 1)
contains not number charactors. Is it acceptable?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://tools.ietf.org/html/rfc3164#section-4.1.1
Need it check
i
is less than 5?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because the purpose of this parser is for supporting more format, not strict parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of format do we want to support?
If this parser does not follow the rfc3164, it's probably better to use new
message_format
name (rfc3164_ext or like that).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is existing products uses
rfc3164
for it. rfc3164 describes the collection of existing message format and BSD spec but many existing tools doesn't follow rfc3164 strictly. This parser can support strict rfc3164 format, so it doesn't block user's usecase for me.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rfc3164 seems not to support subsecond time https://tools.ietf.org/html/rfc3164#section-4.1.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but some products send rfc3164 syslog message with subsecond time.
regexp version also supports subsecond time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then, why don't you change the method name? this is not
parse_rfc3164
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment it on above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the text is
${TIMESTAMP} ${HOST} test:val
, this parser won't work. and I thinktest:val
is valid syslog message.ident
should betest
andmsg
should beval
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very difficult problem because rfc3164 doesn't define how ident and message are separated. For the experience, here is an actual examples:
So to support both cases, adding option is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://tools.ietf.org/html/rfc3164#section-4.1.3
So , Need it check the size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need because this is not strict parser and strict parser for rfc3164 is useless on the production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you use String#bytesize? if
text
isテキスト
(multi bytes chars),text.bytesize
is 12.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This slice is for getting remaining data from
text
.We can't use
slice(i + 1, -1)
, so I usebytesize
for fast calculation.Other good method?
slice(i + 1..-1)
is slower...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, ident(TAG in rfc) does not exist, does message follow the rfc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure but some product sends such message. And this is good for capturing all MSG broken syslog messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
time_str was sliced with size(15), right? why do we need to call String#squeeze?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember the actual product name but users hit the problem with the number of ' '.
regexp version also calls
squeeze
for avoiding this problem.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value.gsub(/ +/, "")
is introduced by #68 ce1c410But we don't need using
String#squeeze
to parse time string including redundant spaces.Now we can remove
String#squeeze
here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Existing regex does not specify the size of chars. but here specifies the fixed size(15).
So if the character which needs to call
squeeze
is passed to this filter plugin, the parser will fail to parse it in any case, right?.