Add tag "multiline" to "log.flags" if event consists of multiple lines.#7997
Add tag "multiline" to "log.flags" if event consists of multiple lines.#7997kvch merged 13 commits intoelastic:masterfrom
Conversation
andrewvc
left a comment
There was a problem hiding this comment.
LGTM.
Tested manually as well for good measure
andrewvc
left a comment
There was a problem hiding this comment.
Spoke too soon. Looks like there are some test failures:
07:04:43 ======================================================================
07:04:43 FAIL: Should be able to do multiline on docker logs.
07:04:43 ----------------------------------------------------------------------
07:04:43 Traceback (most recent call last):
07:04:43 File "/private/var/lib/jenkins/workspace/elastic+beats+pull-request+multijob-darwin/beat/filebeat/label/macosx/src/github.com/elastic/beats/filebeat/tests/system/test_json.py", line 95, in test_docker_logs_multiline
07:04:43 assert all("log" in o for o in output)
07:04:43 AssertionError
07:04:43
There was a problem hiding this comment.
Can a multiline event contain only 1 line? Reason I'm asking is because I was thinking what if someone wants to see all mutiline events. So far I assumed it would be exists:log.multiline.
But your idea below kind of also answers this. If we call it log.lines we should probably add it to all events, not only mutiline events?
There was a problem hiding this comment.
The original request in the issue I am trying to close states "This is to mimic the behaviour the logstash multiline filter performs by adding 'multiline' to tags.". As logstash only tags events which contain more than one lines. See: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html#plugins-codecs-multiline-multiline_tag
|
Personally I like |
|
+1 on the idea from @urso to call it |
|
I also prefer I personally prefer the consistency of always having the field present. |
|
Many people never touch multi-line events, so it's hard to imagine why we would start always setting this value. What's the use case? Every field we add increases the cognitive overhead of using the product. The only use case for this value I've encountered is indicating the operation of the multiline option. If there are more use cases, that would be great to hear about. Without those however, keeping the naming Using this value beyond the specific use of the multiline feature raises a number of questions. You could have a JSON input source creating 'multiline' values for the message field, yet this value would remain. What happens with a JSON value with no message field, but other fields with multiple newline characters? If we really want to start counting the number of lines in every message, then an enumeration of the use cases for this new feature would be useful to this discussion. WRT whether it should be only values Ideally this field wouldn't even exist in the mapping unless a multiline filter was enabled. This is beyond the scope of this ticket, but is relevant to #7972 |
The way the reader works, the count field would only be added if multiline is configured. No multiline, not extra-field. |
|
@urso Ah, my misunderstanding, I had thought we were considering expanding the scope. Glad to hear that's not on the table. |
|
@ruflin For the record I propose |
log.multiline if event consists of multiple linesevent.multiline if event consists of multiple lines
event.multiline if event consists of multiple linesevent.lines if event consists of multiple lines
|
As for the docker case: The docker input "should" already solve the issue by renaming it to message. And if @kvch Sorry about the |
|
I think I might be made a mistake with my previous explanation. Let me check again. But still my statement that if I change message_key option in that test, nothing is parsed. |
|
No, my description is correct. |
|
Did you use the Which test are you referring to? In general I think that docker write to |
f75ccad to
f806161
Compare
event.lines if event consists of multiple lineslog.lines if event consists of multiple lines
|
The failing test is not relevant anymore, because for Docker JSON logs, I still have one more question. Do we want to mimic Logstash's behaviour or add |
|
Both options work for me. @exekias For the docker prospector if we have this partial lines, should we also use the multiline count or is this a different kind of multiline? |
|
Docker partials are just something internal to the format, it's still a single line in source and should be a single line in output. So would say it should not include this field in that case. If you do common multiline on top it should contain this field, but I think that's covered by this change already. |
f806161 to
35b8e44
Compare
log.lines if event consists of multiple lines35b8e44 to
d59613d
Compare
ruflin
left a comment
There was a problem hiding this comment.
Let's get the other PR and then rebase this on top which will be very little changes left like it seems.
There was a problem hiding this comment.
See comment about naming in the other thread.
There was a problem hiding this comment.
Would be nice to have a test for a multiline event that also gets truncated and then check for both flags.
d2189b2 to
1c48f06
Compare
1c48f06 to
1df5455
Compare
|
Rebased. |
| """ | ||
| Should be able to do multiline on docker logs. | ||
| """ | ||
| self.render_config_template( |
There was a problem hiding this comment.
- I think this test is still valuable to make sure we support multiline logs in docker. What we should change is to set
keys_under_root=Falseand then check for the fields accordingly - If we remove the test, we should also remove the test file as it's not used anywhere else.
There was a problem hiding this comment.
For docker we have a dedicated input type in Filebeat. Do users prefer to read Docker files using log and use json.* options? What is the "official" way read Docker logs?
There was a problem hiding this comment.
The "official" was has become the docker input I would say. This doesn't mean users do not still use the old behaviour and I'm not even sure if we have such a multiline test for the docker input.
…s. (elastic#7997) Add "multiline" tag to "log.status" if the event contains multiple lines. This way users can filter for multiline messages using "multiline" in [log.status]. Example event { "@timestamp": "2018-08-17T11:35:21.813Z", "@metadata": { "beat": "filebeat", "type": "doc", "version": "7.0.0-alpha1" }, "source": "/home/n/test.log", "offset": 0, "log": { "status": [ "multiline" ], }, "message": "[test line\ntest line]", "prospector": { "type": "log" }, "input": { "type": "log" }, "beat": { "hostname": "sleipnir", "version": "7.0.0-alpha1", "name": "sleipnir" }, "host": { "name": "sleipnir" } } Closes elastic#957 (cherry picked from commit 6da83e8)
…s. (#7997) (#8207) Add "multiline" tag to "log.status" if the event contains multiple lines. This way users can filter for multiline messages using "multiline" in [log.status]. Example event { "@timestamp": "2018-08-17T11:35:21.813Z", "@metadata": { "beat": "filebeat", "type": "doc", "version": "7.0.0-alpha1" }, "source": "/home/n/test.log", "offset": 0, "log": { "status": [ "multiline" ], }, "message": "[test line\ntest line]", "prospector": { "type": "log" }, "input": { "type": "log" }, "beat": { "hostname": "sleipnir", "version": "7.0.0-alpha1", "name": "sleipnir" }, "host": { "name": "sleipnir" } } Closes #957 (cherry picked from commit 6da83e8)
Add "multiline" tag to "log.status" if the event contains multiple lines. This way users can filter for multiline messages using
"multiline" in [log.status].Example event
{ "@timestamp": "2018-08-17T11:35:21.813Z", "@metadata": { "beat": "filebeat", "type": "doc", "version": "7.0.0-alpha1" }, "source": "/home/n/test.log", "offset": 0, "log": { "status": ["multiline"], }, "message": "[test line\ntest line]", "prospector": { "type": "log" }, "input": { "type": "log" }, "beat": { "hostname": "sleipnir", "version": "7.0.0-alpha1", "name": "sleipnir" }, "host": { "name": "sleipnir" } }Depends on #7991
Closes #957