
in_tail: Support multiline functionality (not filter) in dockermode #2043

Merged
edsiper merged 2 commits into fluent:master from SumoLogic:drosiek-docker-mulitline-parser
Jul 10, 2020
Conversation

@sumo-drosiek
Contributor

@sumo-drosiek sumo-drosiek commented Mar 24, 2020

Use a parser to verify whether a line is a valid first line while using dockermode.

For example input:

{"log":"2020-03-24 Test line of first log\n"}
{"log":"           Another line of first log\n"}
{"log":"2020-03-24 Test line of second log\n"}

And regex: Regex (?<log>^{"log":"\d{4}-\d{2}-\d{2}.*)

Will output:

[0] containers.var.log.containers.test.log: [1585071971.732015500,
    {"log"=>"{"log":"2020-03-24 Test line of first log
    \n           Another line of first log\n"}"}]
[1] containers.var.log.containers.test.log: [1585071975.000917200,
    {"log"=>"{"log":"2020-03-24 Test line of second log\n"}"}]

Fixes #1115

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • Documentation required for this feature

Documentation should be updated with information on how the parser is used to catch multiline Docker logs.


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@jujugrrr

Should we not have:

[0] containers.var.log.containers.test.log: [1585071971.732015500,
    {"log"=>"2020-03-24 Test line of first log
    \n           Another line of first log\n"}]
[1] containers.var.log.containers.test.log: [1585071975.000917200,
    {"log"=>"2020-03-24 Test line of second log\n"}]

instead of :

[0] containers.var.log.containers.test.log: [1585071971.732015500,
    {"log"=>"{"log":"2020-03-24 Test line of first log
    \n           Another line of first log\n"}"}]
[1] containers.var.log.containers.test.log: [1585071975.000917200,
    {"log"=>"{"log":"2020-03-24 Test line of second log\n"}"}]

I might miss something :-)

@sumo-drosiek
Contributor Author

@jujugrrr I think this is the correct behavior, at least for this parser and regex. It should return the full Docker log as "log".
Please keep in mind that the input is a Docker log file.
I skipped the timestamp and source (stdout/stderr) in the example because they are not the point of this change.
I hope that answers your doubts :)

This is the output for the code without the changes:

[0] containers.var.log.containers.test.log: [1585153768.404923303, {"log"=>"{"log":"2020-03-24 Test line of first log\n"}"}]
[1] containers.var.log.containers.test.log: [1585153768.404930633, {"log"=>"{"log":"           Another line of first log\n"}"}]
[2] containers.var.log.containers.test.log: [1585153768.404932685, {"log"=>"{"log":"2020-03-24 Test line of second log\n"}"}]

@jujugrrr

@jujugrrr I think this is the correct behavior, at least for this parser and regex. It should return the full Docker log as "log".

My bad, I think it's just the way you print it to stdout. By the way, it's a great missing feature; we would love to see this implemented.

@sumo-drosiek
Contributor Author

sumo-drosiek commented Mar 26, 2020

Example configuration:
fluent-bit.conf

[SERVICE]
    Parsers_File parsers.conf
    Log_Level debug
[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    Tag              containers.*
    Refresh_Interval 1
    Rotate_Wait      60
    Mem_Buf_Limit    5MB
    Skip_Long_Lines  On
    DB               /tail-db/tail-containers-state.db
    DB.Sync          Normal
    Docker_Mode      On
    Docker_Mode_Parser multi_line

[Output]
    Name  stdout
    Match *

parsers.conf

[PARSER]
    Name multi_line
    Format regex
    Regex (?<log>^{"log":"\d{4}-\d{2}-\d{2}.*)

@sumo-drosiek
Contributor Author

Debug log

Fluent Bit v1.3.11
Copyright (C) Treasure Data

[2020/03/26 09:51:58] [ info] Configuration:
[2020/03/26 09:51:58] [ info]  flush time     | 5.000000 seconds
[2020/03/26 09:51:58] [ info]  grace          | 5 seconds
[2020/03/26 09:51:58] [ info]  daemon         | 0
[2020/03/26 09:51:58] [ info] ___________
[2020/03/26 09:51:58] [ info]  inputs:
[2020/03/26 09:51:58] [ info]      tail
[2020/03/26 09:51:58] [ info] ___________
[2020/03/26 09:51:58] [ info]  filters:
[2020/03/26 09:51:58] [ info] ___________
[2020/03/26 09:51:58] [ info]  outputs:
[2020/03/26 09:51:58] [ info]      stdout.0
[2020/03/26 09:51:58] [ info] ___________
[2020/03/26 09:51:58] [ info]  collectors:
[2020/03/26 09:51:58] [debug] [storage] [cio stream] new stream registered: tail.0
[2020/03/26 09:51:58] [ info] [storage] version=1.0.3, initializing...
[2020/03/26 09:51:58] [ info] [storage] in-memory
[2020/03/26 09:51:58] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/26 09:51:58] [ info] [engine] started (pid=1)
[2020/03/26 09:51:58] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2020/03/26 09:51:58] [debug] [in_tail] inotify watch fd=20
[2020/03/26 09:51:58] [debug] [in_tail] scanning path /var/log/containers/*.log
[2020/03/26 09:51:58] [debug] [in_tail] add to scan queue /var/log/containers/test.log, offset=0
[2020/03/26 09:51:58] [debug] [router] match rule tail.0:stdout.0
[2020/03/26 09:51:58] [ info] [sp] stream processor started
[2020/03/26 09:51:58] [debug] [in_tail] file=/var/log/containers/test.log read=142 lines=3
[2020/03/26 09:51:58] [debug] [in_tail] file=/var/log/containers/test.log promote to TAIL_EVENT
[0] containers.var.log.containers.test.log: [1585216318.860048900, {"log"=>"{"log":"2020-03-24 Test line of first log\n           Another line of first log\n"}"}]
[1] containers.var.log.containers.test.log: [1585216322.001871600, {"log"=>"{"log":"2020-03-24 Test line of second log\n"}"}]
[2020/03/26 09:52:03] [debug] [task] created task=0x7f8c1164c820 id=0 OK
[2020/03/26 09:52:03] [debug] [task] destroy task=0x7f8c1164c820 (task_id=0)

@sumo-drosiek
Contributor Author

Valgrind

root@b155d80f0a41:/tmp/src/build# rm /tail-db/tail-containers-state.db; valgrind --leak-check=full /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf
rm: cannot remove '/tail-db/tail-containers-state.db': No such file or directory
==119== Memcheck, a memory error detector
==119== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==119== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==119== Command: /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf
==119== 
Fluent Bit v1.4.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/03/26 13:09:14] [ info] Configuration:
[2020/03/26 13:09:14] [ info]  flush time     | 5.000000 seconds
[2020/03/26 13:09:14] [ info]  grace          | 5 seconds
[2020/03/26 13:09:14] [ info]  daemon         | 0
[2020/03/26 13:09:14] [ info] ___________
[2020/03/26 13:09:14] [ info]  inputs:
[2020/03/26 13:09:14] [ info]      tail
[2020/03/26 13:09:14] [ info] ___________
[2020/03/26 13:09:14] [ info]  filters:
[2020/03/26 13:09:14] [ info] ___________
[2020/03/26 13:09:14] [ info]  outputs:
[2020/03/26 13:09:14] [ info]      stdout.0
[2020/03/26 13:09:14] [ info] ___________
[2020/03/26 13:09:14] [ info]  collectors:
[2020/03/26 13:09:14] [debug] [storage] [cio stream] new stream registered: tail.0
[2020/03/26 13:09:14] [ info] [storage] version=1.0.3, initializing...
[2020/03/26 13:09:14] [ info] [storage] in-memory
[2020/03/26 13:09:14] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/26 13:09:14] [ info] [engine] started (pid=119)
[2020/03/26 13:09:15] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2020/03/26 13:09:15] [debug] [input:tail:tail.0] inotify watch fd=20
[2020/03/26 13:09:15] [debug] [input:tail:tail.0] scanning path /var/log/containers/*.log
[2020/03/26 13:09:15] [debug] [input:tail:tail.0] add to scan queue /var/log/containers/test.log, offset=0
[2020/03/26 13:09:15] [debug] [router] match rule tail.0:stdout.0
[2020/03/26 13:09:15] [ info] [sp] stream processor started
[2020/03/26 13:09:15] [debug] [input:tail:tail.0] file=/var/log/containers/test.log read=142 lines=3
[2020/03/26 13:09:15] [debug] [input:tail:tail.0] file=/var/log/containers/test.log promote to TAIL_EVENT
==119== Warning: client switching stacks?  SP change: 0x1fff0007c8 --> 0x5ee52d0
==119==          to suppress, use: --max-stackframe=137322673400 or greater
==119== Warning: client switching stacks?  SP change: 0x5ee5248 --> 0x1fff0007c8
==119==          to suppress, use: --max-stackframe=137322673536 or greater
==119== Warning: client switching stacks?  SP change: 0x1fff0007c8 --> 0x5ee5248
==119==          to suppress, use: --max-stackframe=137322673536 or greater
==119==          further instances of this message will not be shown.
[0] containers.var.log.containers.test.log: [1585228155.514162400, {"log"=>"{"log":"2020-03-24 Test line of first log\n           Another line of first log\n"}"}]
[1] containers.var.log.containers.test.log: [1585228159.012459900, {"log"=>"{"log":"2020-03-24 Test line of second log\n"}"}]
[2020/03/26 13:09:20] [debug] [task] created task=0x5edee90 id=0 OK
[2020/03/26 13:09:20] [debug] [task] destroy task=0x5edee90 (task_id=0)

Second report (with increased max-stackframe):

root@b155d80f0a41:/tmp/src/build# rm /tail-db/tail-containers-state.db; valgrind --leak-check=full --max-stackframe=137322731304 /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf
==128== Memcheck, a memory error detector
==128== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==128== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==128== Command: /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf
==128== 
Fluent Bit v1.4.0
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/03/26 13:12:16] [ info] Configuration:
[2020/03/26 13:12:16] [ info]  flush time     | 5.000000 seconds
[2020/03/26 13:12:16] [ info]  grace          | 5 seconds
[2020/03/26 13:12:16] [ info]  daemon         | 0
[2020/03/26 13:12:16] [ info] ___________
[2020/03/26 13:12:16] [ info]  inputs:
[2020/03/26 13:12:16] [ info]      tail
[2020/03/26 13:12:16] [ info] ___________
[2020/03/26 13:12:16] [ info]  filters:
[2020/03/26 13:12:16] [ info] ___________
[2020/03/26 13:12:16] [ info]  outputs:
[2020/03/26 13:12:16] [ info]      stdout.0
[2020/03/26 13:12:16] [ info] ___________
[2020/03/26 13:12:16] [ info]  collectors:
[2020/03/26 13:12:16] [debug] [storage] [cio stream] new stream registered: tail.0
[2020/03/26 13:12:16] [ info] [storage] version=1.0.3, initializing...
[2020/03/26 13:12:16] [ info] [storage] in-memory
[2020/03/26 13:12:16] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/26 13:12:16] [ info] [engine] started (pid=128)
[2020/03/26 13:12:16] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2020/03/26 13:12:17] [debug] [input:tail:tail.0] inotify watch fd=20
[2020/03/26 13:12:17] [debug] [input:tail:tail.0] scanning path /var/log/containers/*.log
[2020/03/26 13:12:17] [debug] [input:tail:tail.0] add to scan queue /var/log/containers/test.log, offset=0
[2020/03/26 13:12:17] [debug] [router] match rule tail.0:stdout.0
[2020/03/26 13:12:17] [ info] [sp] stream processor started
[2020/03/26 13:12:17] [debug] [input:tail:tail.0] file=/var/log/containers/test.log read=142 lines=3
[2020/03/26 13:12:17] [debug] [input:tail:tail.0] file=/var/log/containers/test.log promote to TAIL_EVENT
==128== Warning: set address range perms: large range [0x5edf190, 0x1fff000748) (undefined)
==128== Use of uninitialised value of size 8
==128==    at 0x596D13: co_swap_function (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Use of uninitialised value of size 8
==128==    at 0x202699: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Warning: set address range perms: large range [0x5edf108, 0x1fff000748) (noaccess)
==128== Warning: set address range perms: large range [0x5edf108, 0x1fff000748) (undefined)
==128== Use of uninitialised value of size 8
==128==    at 0x596D13: co_swap_function (in /fluent-bit/bin/fluent-bit)
==128==    by 0x2026A3: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Use of uninitialised value of size 8
==128==    at 0x597ED5: co_switch (amd64.c:159)
==128==    by 0x5ED8F7F: ???
==128==    by 0x5DD255F: ???
==128==    by 0x5E3F69F: ???
==128==    by 0x5DD98EF: ???
==128==    by 0x5DE50AF: ???
==128==    by 0x25FFFFFFFF: ???
==128==    by 0x5ED8EAF: ???
==128==    by 0xA4: ???
==128==    by 0x5E505DF: ???
==128==    by 0x5EDF25F: ???
==128== 
==128== Use of uninitialised value of size 8
==128==    at 0x2026A4: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Use of uninitialised value of size 8
==128==    at 0x2026A8: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Use of uninitialised value of size 8
==128==    at 0x26CD00: cb_stdout_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x2026CB: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Conditional jump or move depends on uninitialised value(s)
==128==    at 0x26C700: flb_malloc (in /fluent-bit/bin/fluent-bit)
==128==    by 0x26CDA0: cb_stdout_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x2026CB: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 
==128== Conditional jump or move depends on uninitialised value(s)
==128==    at 0x483572C: malloc (vg_replace_malloc.c:299)
==128==    by 0x26C714: flb_malloc (in /fluent-bit/bin/fluent-bit)
==128==    by 0x26CDA0: cb_stdout_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x2026CB: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
==128== 

Memcheck: the 'impossible' happened:
   create_MC_Chunk: shadow area is accessible

host stacktrace:
==128==    at 0x580480A4: show_sched_status_wrk (m_libcassert.c:369)
==128==    by 0x580481B7: report_and_quit (m_libcassert.c:440)
==128==    by 0x5804846B: panic (m_libcassert.c:516)
==128==    by 0x5804846B: vgPlain_tool_panic (m_libcassert.c:531)
==128==    by 0x58005057: create_MC_Chunk (mc_malloc_wrappers.c:219)
==128==    by 0x58005B1F: vgMemCheck_new_block (mc_malloc_wrappers.c:388)
==128==    by 0x58005B1F: vgMemCheck_malloc (mc_malloc_wrappers.c:407)
==128==    by 0x580A7883: do_client_request (scheduler.c:1925)
==128==    by 0x580A7883: vgPlain_scheduler (scheduler.c:1488)
==128==    by 0x580F5D80: thread_wrapper (syswrap-linux.c:103)
==128==    by 0x580F5D80: run_a_thread_NORETURN (syswrap-linux.c:156)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 128)
==128==    at 0x483577F: malloc (vg_replace_malloc.c:299)
==128==    by 0x26C714: flb_malloc (in /fluent-bit/bin/fluent-bit)
==128==    by 0x26CDA0: cb_stdout_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x2026CB: output_pre_cb_flush (in /fluent-bit/bin/fluent-bit)
==128==    by 0x597CE6: co_init (amd64.c:117)
client stack range: ??????? client SP: 0x5EDF080
valgrind stack range: [0x10032BF000 0x10033BEFFF] top usage: 18568 of 1048576

Thread 2: status = VgTs_WaitSys syscall 232 (lwpid 129)
==128==    at 0x51B57EF: epoll_wait (epoll_wait.c:30)
==128==    by 0x59573B: _mk_event_wait (mk_event_epoll.c:345)
==128==    by 0x595A2B: mk_event_wait (mk_event.c:163)
==128==    by 0x1F096F: log_worker_collector (in /fluent-bit/bin/fluent-bit)
==128==    by 0x2078CC: step_callback (in /fluent-bit/bin/fluent-bit)
==128==    by 0x4850FA2: start_thread (pthread_create.c:486)
==128==    by 0x51B54CE: clone (clone.S:95)
client stack range: [0x6A01000 0x71F8FFF] client SP: 0x71F84F0
valgrind stack range: [0x1005D5E000 0x1005E5DFFF] top usage: 2904 of 1048576


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.

I didn't notice anything more than before my changes.

@sumo-drosiek
Contributor Author

@edsiper Could you take a look at this PR, please?

I added a new parser option to avoid confusion and to keep backward compatibility as much as possible.
One thing I don't like is that the last line of a log can be flushed with some delay (under 5 s). I have no idea how to fix it, but I believe this is not an issue with regard to the expectations of the community (#1115).

@sumo-drosiek
Contributor Author

We noticed mem-leak. Please do not merge

@sumo-drosiek sumo-drosiek changed the title in_tail: Support multiline functionality (not filter) in dockermode [WIP] in_tail: Support multiline functionality (not filter) in dockermode Apr 10, 2020
@sumo-drosiek sumo-drosiek changed the title [WIP] in_tail: Support multiline functionality (not filter) in dockermode in_tail: Support multiline functionality (not filter) in dockermode Apr 16, 2020
@AXington

AXington commented May 6, 2020

It'd be really amazing to get this merged and released. I have developers yelling at me almost daily for this and they think I'm blowing them off when I tell them this isn't supported yet.

It's getting to the point of them asking us to abandon Fluent Bit.

@sumo-drosiek
Contributor Author

The memory leak is fixed, so the code is ready to review and merge.

@AXington I hope it will be reviewed soon. I encourage you to test whether the fix meets the expectations :)

Member

@edsiper edsiper left a comment

Thanks for the patch. I did a quick review and added some comments.

In addition to the changes, would you please include some runtime tests for this feature? (tests/runtime)

struct flb_time out_time = {0};

if (flb_sds_len(file->dmode_lastline) > 0 && file->dmode_complete) {
#ifdef FLB_HAVE_REGEX
Member

this macro conditional should cover the if statement, otherwise the if does nothing.

flb_sds_len_set(file->dmode_lastline, 0);

/* concatenate current log line with buffered one */
file->dmode_buf = flb_sds_cat(file->dmode_buf, val, val_len);
Member

please use a flb_sds_t tmp variable to check the return value of the flb_sds_*() calls. If one of them returns NULL due to a memory allocation problem, the function needs to return right away to avoid unexpected behavior in the lines below.

@edsiper
Member

edsiper commented May 19, 2020

reviewed, comments above.

@edsiper edsiper self-assigned this May 19, 2020
@edsiper edsiper added the waiting-for-user Waiting for more information, tests or requested changes label May 19, 2020
@sumo-drosiek
Contributor Author

@edsiper Thanks for the review. I already fixed that, but I got a SIGSEGV in a few CI tests (FLB_OPT="-DFLB_SMALL=On" twice and FLB_OPT="-DSANITIZE_ADDRESS=On" once) and I couldn't spot the reason.

I would be grateful for a quick look and a comment on whether I should be worried about that. I can get back to the investigation early next week.

@sumo-drosiek
Contributor Author

I fixed the runtime tests for in_tail and I believe the PR is ready to merge.

@sumo-drosiek sumo-drosiek requested a review from edsiper May 29, 2020 13:20
@frankreno

@edsiper - Any other items to fix in this PR before merging? Any update on when we can expect this fix?

@git-pchauhan

git-pchauhan commented Jul 8, 2020

As many others mentioned on this thread, we are also currently blocked on this. Anything you can do to expedite this would be highly appreciated!
@edsiper
@sumo-drosiek

@cmedley2

cmedley2 commented Jul 8, 2020

Thanks everyone for your efforts on this. We are really looking forward to this functionality with our sumo collector chart.

@cp-jennifer

We are waiting for this fix as well.

@mmshaikh88

Kindly expedite this change. It would be much appreciated!
@edsiper
@sumo-drosiek
We and many others are blocked due to this.

@jeremyhermann

jeremyhermann commented Jul 8, 2020 via email

@edsiper
Member

edsiper commented Jul 9, 2020

thanks, this PR looks good for v1.5 release.

Before merging it, would you please squash the commits like this:

  • in_tail: ...all about in_tail changes.... # only changes applicable to plugins/in_tail
  • tests: runtime: in_tail: add tail tests for docker mode # only changes applicable to tests/runtime/in_tail.c

to make it part of the release, pls do the adjustment ASAP

@PettitWesley
Contributor

(Adding in a comment since I want this feature to get out in 1.5... have been waiting for it for a while...)

pls do the adjustment ASAP

@sumo-drosiek Just to be clear- he means like in the next 24 - 48 hours.

Dominik Rosiek added 2 commits July 10, 2020 08:40
Add Docker_Mode_Parser and use it to verify whether a line is a valid
first line while using dockermode.

For example input:
```
{"log":"2020-03-24 Test line of first log\n"}
{"log":"           Another line of first log\n"}
{"log":"2020-03-24 Test line of second log\n"}

```

And regex: `Regex (?<log>^{"log":"\d{4}-\d{2}-\d{2}.*)`

Will output:
```
[0] containers.var.log.containers.test.log: [1585071971.732015500,
    {"log"=>"{"log":"2020-03-24 Test line of first log
    \n           Another line of first log\n"}"}]
[1] containers.var.log.containers.test.log: [1585071975.000917200,
    {"log"=>"{"log":"2020-03-24 Test line of second log\n"}"}]
```

Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
@sumo-drosiek
Contributor Author

@edsiper @PettitWesley Done 🚀

@edsiper edsiper merged commit d4c0afb into fluent:master Jul 10, 2020
@edsiper
Member

edsiper commented Jul 10, 2020

thanks!

@sibidass

sibidass commented Jul 13, 2020

@sumo-drosiek I was testing out the new Docker_Mode_Parser. It is not detecting the Time_Key field that is captured in the regex. Is this expected behaviour or am I missing something?

PS: I already checked my regex with sample data in Rubular, and it is capturing the 2 fields properly.

My parser config:

[INPUT]
    Name               tail
    Path               /var/log/containers/load*.log
    Exclude_Path       /var/log/containers/fluent*.log
    Buffer_Max_Size    5M
    Tag                kubernetes.*
    Refresh_Interval   10
    Mem_Buf_Limit      64M
    DB                 /var/log/flb_kube.dbs
    DB.Sync            NORMAL
    Ignore_Older       2d
    Docker_Mode_Parser docker_multiline
    Docker_Mode        On
    Rotate_Wait        60

[PARSER]
    Name        docker_multiline
    Format      regex
    Regex       (?<log>^{"log":"\d{4}-\d{2}-\d{2}.*"time":"(?<testtime>.*)".*)
    Time_Key    testtime
    Time_Format %Y-%m-%dT%H:%M:%S.%LZ

stdout:

[0] kubernetes.var.log.containers.loadtestfluent1-2667j_addon-oil-ns_loggingapp-526b62744ad55bbb46cf0ece1b636397311b9393dfd1763179b9a22b64ebb1ef.log: [1594664190.567404747, {"log"=>"{"log":"2020-07-13 18:15:53 +0000:loadtest:INFO:load_run_uuid=flbrun-pr-test1.1.3_302 zblnoucglj\nzblnoucglj\nzblnoucglj line=2924s\n","stream":"stdout","time":"2020-07-13T18:15:53.39885287Z"}"}]

@sumo-drosiek
Contributor Author

@sibidass This is expected. Docker_Mode_Parser is used only to check whether the current line is the beginning of a Docker multiline log.

You can try the Parser option in addition to Docker_Mode_Parser.
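A minimal sketch of that combination (names and paths are illustrative, and the [PARSER] entries belong in the file referenced by Parsers_File): Docker_Mode_Parser only marks which lines start a record, while Parser then decodes each merged record.

```
[INPUT]
    Name               tail
    Path               /var/log/containers/*.log
    Tag                containers.*
    Docker_Mode        On
    Docker_Mode_Parser docker_multiline
    Parser             docker

[PARSER]
    Name   docker_multiline
    Format regex
    Regex  (?<log>^{"log":"\d{4}-\d{2}-\d{2}.*)

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
```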

@sateesh3550

sateesh3550 commented Jul 14, 2020

@sumo-drosiek After testing Docker_Mode_Parser and updating the image to 1.5.0, I see it supports multiline, but the Kubernetes labels are missing from the log output. We are using the same configuration as with 1.3.1 and older (there the Kubernetes labels are part of the output).

[INPUT]
    Name              tail
    Tag               test
    Path              /var/log/containers/test.log
    Multiline_Flush   5
    Multiline         On
    Parser_Firstline  multi_line_logs
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[FILTER]
    Name                kubernetes
    Match               *
    Kube_URL            https://kubernetes.default.svc.cluster.local:443
    Kube_Tag_Prefix     kube.var.log.containers.
    Annotations         On
    Merge_Log           On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

[PARSER]
    Name   multi_line_logs
    Format regex
    Regex  ^(?[^\s]+:) (?.+)$

@sibidass

sibidass commented Jul 14, 2020

My bad, I thought that, like with the Multiline option, I would not be able to use Parser alongside Docker_Mode_Parser. However, the solution is working great. Kudos for all the great work @sumo-drosiek!! Much appreciated. Finally I will be able to get rid of my Lua code base, which is processing multilines in our log pipeline right now and was kind of slowing us down. I am sure this will give a big performance boost. 🥳 🚀

@sibidass

@sumo-drosiek After testing Docker_Mode_Parser and updating the image to 1.5.0, I see it supports multiline, but the Kubernetes labels are missing from the log output. We are using the same configuration as with 1.3.1 and older (there the Kubernetes labels are part of the output).
Kube_Tag_Prefix kube.var.log.containers.

@sateesh3550 Kubernetes metadata enrichment is working fine for me. Are you sure you have not tagged it wrongly in tail? You have Tag test in your tail configuration but Kube_Tag_Prefix kube.var.log.containers. in the kubernetes filter config.

@sateesh3550

@sibidass I'm sure I'm tagging it correctly. The exact same configuration works fine when I'm using 1.3.1.

@sumo-drosiek
Contributor Author

@sateesh3550

@sumo-drosiek After testing Docker_Mode_Parser and updating the image to 1.5.0, I see it supports multiline, but the Kubernetes labels are missing from the log output. We are using the same configuration as with 1.3.1 and older (there the Kubernetes labels are part of the output).

Does the issue occur both with and without Docker_Mode_Parser?
Could you check whether the configuration works correctly with 1.4.6?

@sateesh3550

@sumo-drosiek
Using the latest image (1.5.0), with or without Docker_Mode_Parser, I see the Kubernetes labels are missing.
When I'm using 1.4.6 with Multiline On, I still see the Kubernetes labels are missing. And 1.4.6 doesn't support Docker_Mode_Parser (error below):
tail: unknown configuration property 'Docker_Mode_Parser'

@sumo-drosiek
Contributor Author

@sateesh3550
I wanted to verify whether this PR introduced the regression, but since it is not working with the previous version either, I suppose it's not related. Am I right?

@sateesh3550

sateesh3550 commented Jul 14, 2020

@sumo-drosiek You're right, and Docker_Mode_Parser fixed the multiline functionality so that it is parsed the right way. But upgrading the image from 1.3.1 to 1.4.6 or 1.5.0 to use Docker_Mode_Parser broke the Kubernetes labels functionality.

@sumo-drosiek
Contributor Author

@sateesh3550 So, I believe you should create a separate issue to track it. You can try to figure out which is the first non-working image; that will help a lot in finding the breaking change, but in any case it shouldn't be tracked here.

@logicflakes

I am trying to use Docker_Mode_Parser to parse multiline JSON logs, but two or more logs are being concatenated in the es output.
Am I missing something?
Rubular link with the logs and the regex used in Docker_Mode_Parser: https://rubular.com/r/Xl7RwshJWtdRhp

input-kubernetes.conf
    [INPUT]
    Name                tail
    Tag                 kube.*
    Parser              docker
    Path                /var/log/containers/*.log
    DB                  /var/log/flb_kube.db
    Rotate_Wait         60
    Mem_Buf_Limit       5MB
    Refresh_Interval    1
    Docker_Mode         On
    Docker_Mode_Parser  docker_multiline
parsers.conf
[PARSER]
    Name        docker_multiline
    Format      regex
    Regex       (?<log>^{"datetime":"\d{4}-\d{2}-\d{2}.*)

[PARSER]
    Name        docker
    Format      json
    Time_Key    datetime
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep   On

@sumo-drosiek
Contributor Author

@logicflakes It looks like you are trying to match the log from es and not from Docker itself. Is that the content of /var/log/containers/*.log?

@logicflakes

logicflakes commented Aug 14, 2020

@sumo-drosiek

Looks like you are trying to match log from the es and not from the docker itself

didn't get you

Is it the content of the /var/log/containers/*.log?

yes

more details on issue #2471

@sumo-drosiek
Contributor Author

sumo-drosiek commented Aug 14, 2020

@logicflakes

Looks like you are trying to match log from the es and not from the docker itself

Sorry, I thought the logs were from es. My point was that the output of kubectl logs/docker logs can differ from /var/log/containers/*.log. The default Docker log format is JSON with 3 keys: time, log and stream. For example:

{"log":"Hello from Docker!\n","stream":"stdout","time":"2020-01-09T22:51:31.549390877Z"}
{"log":"This message shows that your installation appears to be working correctly.\n","stream":"stdout","time":"2020-01-09T22:51:31.549396749Z"}

but docker logs/kubectl logs will show:

Hello from Docker!
This message shows that your installation appears to be working correctly.

@logicflakes

@sumo-drosiek
Ah! Thanks. So I had copied the output from kubectl logs; updating the regex to (?<log>^{"log":"{\\"datetime\\":\\"\d{4}-\d{2}-\d{2}.*) to match the correct format worked.

@sumo-drosiek
Contributor Author

You're welcome. I suppose I need to add documentation for that feature 😅



Development

Successfully merging this pull request may close these issues.

Docker_mode to recombine multiline records in json-log from docker