diff --git a/administration/configuring-fluent-bit/multiline-parsing.md b/administration/configuring-fluent-bit/multiline-parsing.md index 1a4ccdf96..41de454b4 100644 --- a/administration/configuring-fluent-bit/multiline-parsing.md +++ b/administration/configuring-fluent-bit/multiline-parsing.md @@ -19,11 +19,12 @@ Fluent Bit exposes certain pre-configured parsers (built-in) to solve specific m | Parser | Description | | ------ | ----------- | -| `docker` | Process a log entry generated by a Docker container engine. This parser supports the concatenation of large log entries split by Docker. If you use this parser, and you also want to concatenate loglines like stacktraces, you can add the [multiline filter](../../pipeline/filters/multiline-stacktrace.md) to specify additional parsers. | | `cri` | Process a log entry generated by CRI-O container engine. Like the `docker` parser, it supports concatenation of log entries. | -| `go` | Process log entries generated by a Go based language application and perform concatenation if multiline messages are detected. | -| `python` | Process log entries generated by a Python based language application and perform concatenation if multiline messages are detected. | +| `docker` | Process a log entry generated by a Docker container engine. This parser supports the concatenation of large log entries split by Docker. If you use this parser, and you also want to concatenate log lines like stack traces, you can add the [multiline filter](../../pipeline/filters/multiline-stacktrace.md) to specify additional parsers. | +| `go` | Process log entries generated by a Go-based language application and perform concatenation if multiline messages are detected. | | `java` | Process log entries generated by a Google Cloud Java language application and perform concatenation if multiline messages are detected. 
| +| `python` | Process log entries generated by a Python-based language application and perform concatenation if multiline messages are detected. | +| `ruby` | Process log entries generated by a Ruby-based language application and perform concatenation if multiline messages are detected. | ### Configurable multiline parsers @@ -35,12 +36,14 @@ To understand which multiline parser type is required for your use case you have | Property | Description | Default | | -------- | ----------- | ------- | +| `flush_timeout` | Timeout in milliseconds to flush a non-terminated multiline buffer. | `4s` | +| `key_content` | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated. | _none_ | +| `match_string` | String to match against for `endswith` or `equal` types. Not used for `regex` type. | _none_ | | `name` | Specify a unique name for the multiline parser definition. A good practice is to prefix the name with the word `multiline_` to avoid confusion with normal parser definitions. | _none_ | -| `type` | Set the multiline mode. Fluent Bit supports the type `regex`.| _none_ | +| `negate` | Negate the pattern matching result. When set to `true`, a non-matching line is treated as matching. | `false` | | `parser` | Name of a pre-defined parser that must be applied to the incoming content before applying the regular expression rule. If no parser is defined, it's assumed that's a raw text and not a structured message. When a parser is applied to a raw text, the regular expression is applied against a specific key of the structured message by using the `key_content` configuration property. | _none_ | -| `key_content` | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated. | _none_ | -| `flush_timeout` | Timeout in milliseconds to flush a non-terminated multiline buffer. 
| `5s` | -| `rule` | Configure a rule to match a multiline pattern. The rule has a [specific format](#rules-definition). Multiple rules can be defined. | _none_| +| `rule` | Configure a rule to match a multiline pattern. The rule has a [specific format](#rules-definition). Multiple rules can be defined. Only used with `regex` type. | _none_| +| `type` | Set the multiline mode. Fluent Bit supports `regex`, `endswith`, and `equal` (or `eq`). | _none_ | #### Lines and states @@ -261,11 +264,11 @@ Fluent Bit now supports a configuration key to **limit the memory used during mu | Property | Description | Default | | ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | -| `multiline_buffer_limit` | Sets the maximum size of the in-memory buffer used while assembling a multiline message. When the accumulated size exceeds this limit, the message is truncated and a `multiline_truncated: true` metadata field is attached to the emitted record. A value of `0` disables the limit. Accepts unit suffixes like `KiB`, `MiB`, or `GiB`. | `2MiB` | +| `multiline_buffer_limit` | Sets the maximum size of the in-memory buffer used while assembling a multiline message. When the accumulated size exceeds this limit, the message is truncated and a `multiline_truncated: true` metadata field is attached to the emitted record. A value of `0` disables the limit. Accepts unit suffixes such as `KB`, `MB`, `GB`, `KiB`, `MiB`, or `GiB` (all interpreted as binary units). 
| `2MB` | -```text +```yaml service: - multiline_buffer_limit 2MiB # default; limit concatenated multiline message size + multiline_buffer_limit: 2MB # default; limit concatenated multiline message size ``` If the limit is reached, Fluent Bit flushes the partial record with `multiline_truncated: true` metadata immediately to prevent Out-Of-Memory (OOM) conditions. @@ -442,4 +445,395 @@ $ ./fluent-bit --config fluent-bit.conf "}] [2] tail.0: [[1750333602.460998000, {}], {"log"=>"another line... "}] -``` \ No newline at end of file +``` + +## Built-in parser examples + +The following examples show how to use each built-in multiline parser. + +### Cri + +The `cri` parser handles logs from CRI-O container runtime. It uses the `_p` field to determine if a line is complete (`F`) or partial (`P`). + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: tail + path: /var/log/containers/*.log + read_from_head: true + multiline.parser: cri + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Flush 1 + Log_Level info + +[INPUT] + Name tail + Path /var/log/containers/*.log + Read_From_Head true + Multiline.Parser cri + +[OUTPUT] + Name stdout + Match * +``` + +{% endtab %} +{% tab title="Input log" %} + +```text +2024-01-15T10:30:45.123456789Z stdout F Complete log message +2024-01-15T10:30:46.123456789Z stderr P This is a partial +2024-01-15T10:30:46.123456790Z stderr P message that spans +2024-01-15T10:30:46.123456791Z stderr F multiple lines +``` + +{% endtab %} +{% tab title="Output" %} + +```text +[0] tail.0: {"log"=>"Complete log message", "stream"=>"stdout"} +[1] tail.0: {"log"=>"This is a partial message that spans multiple lines", "stream"=>"stderr"} +``` + +{% endtab %} +{% endtabs %} + +### Docker + +The `docker` parser handles Docker JSON logs. 
Lines ending with `\n` are complete; lines without are partial and concatenated. + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: tail + path: /var/lib/docker/containers/*/*.log + read_from_head: true + multiline.parser: docker + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Flush 1 + Log_Level info + +[INPUT] + Name tail + Path /var/lib/docker/containers/*/*.log + Read_From_Head true + Multiline.Parser docker + +[OUTPUT] + Name stdout + Match * +``` + +{% endtab %} +{% tab title="Input log" %} + +```json +{"log":"This is a complete message\n","stream":"stdout","time":"2024-01-15T10:30:45.123456789Z"} +{"log":"This is a partial ","stream":"stdout","time":"2024-01-15T10:30:46.123456789Z"} +{"log":"message that spans ","stream":"stdout","time":"2024-01-15T10:30:46.123456790Z"} +{"log":"multiple lines\n","stream":"stdout","time":"2024-01-15T10:30:46.123456791Z"} +``` + +{% endtab %} +{% tab title="Output" %} + +```text +[0] tail.0: {"log"=>"This is a complete message\n", "stream"=>"stdout"} +[1] tail.0: {"log"=>"This is a partial message that spans multiple lines\n", "stream"=>"stdout"} +``` + +{% endtab %} +{% endtabs %} + +### Go + +The `go` parser handles Go panic stack traces. It detects `panic:` messages and captures the full goroutine stack. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: tail + path: /var/log/myapp/*.log + read_from_head: true + multiline.parser: go + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Flush 1 + Log_Level info + +[INPUT] + Name tail + Path /var/log/myapp/*.log + Read_From_Head true + Multiline.Parser go + +[OUTPUT] + Name stdout + Match * +``` + +{% endtab %} +{% tab title="Input log" %} + +```text +panic: runtime error: invalid memory address or nil pointer dereference +[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x123456] + +goroutine 1 [running]: +main.main() + /app/main.go:15 +0x26 +``` + +{% endtab %} +{% tab title="Output" %} + +```text +[0] tail.0: {"log"=>"panic: runtime error: invalid memory address or nil pointer dereference\n[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x123456]\n\ngoroutine 1 [running]:\nmain.main()\n\t/app/main.go:15 +0x26\n"} +``` + +{% endtab %} +{% endtabs %} + +### Java + +The `java` parser handles Java exception stack traces. It detects `Exception`, `Error`, and `Throwable` patterns with their stack frames. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: tail + path: /var/log/myapp/*.log + read_from_head: true + multiline.parser: java + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Flush 1 + Log_Level info + +[INPUT] + Name tail + Path /var/log/myapp/*.log + Read_From_Head true + Multiline.Parser java + +[OUTPUT] + Name stdout + Match * +``` + +{% endtab %} +{% tab title="Input log" %} + +```text +java.lang.NullPointerException: Something went wrong + at com.example.MyClass.myMethod(MyClass.java:42) + at com.example.MyClass.anotherMethod(MyClass.java:30) + at com.example.Main.main(Main.java:10) +Caused by: java.lang.IllegalArgumentException: Invalid input + at com.example.Helper.validate(Helper.java:15) + ... 3 more +``` + +{% endtab %} +{% tab title="Output" %} + +```text +[0] tail.0: {"log"=>"java.lang.NullPointerException: Something went wrong\n\tat com.example.MyClass.myMethod(MyClass.java:42)\n\tat com.example.MyClass.anotherMethod(MyClass.java:30)\n\tat com.example.Main.main(Main.java:10)\nCaused by: java.lang.IllegalArgumentException: Invalid input\n\tat com.example.Helper.validate(Helper.java:15)\n\t... 3 more\n"} +``` + +{% endtab %} +{% endtabs %} + +### Python + +The `python` parser handles Python tracebacks. It detects `Traceback (most recent call last):` and captures the full stack. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: tail + path: /var/log/myapp/*.log + read_from_head: true + multiline.parser: python + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Flush 1 + Log_Level info + +[INPUT] + Name tail + Path /var/log/myapp/*.log + Read_From_Head true + Multiline.Parser python + +[OUTPUT] + Name stdout + Match * +``` + +{% endtab %} +{% tab title="Input log" %} + +```text +Traceback (most recent call last): + File "/app/main.py", line 10, in <module> + result = process_data(None) + File "/app/utils.py", line 25, in process_data + return data.strip() +AttributeError: 'NoneType' object has no attribute 'strip' +``` + +{% endtab %} +{% tab title="Output" %} + +```text +[0] tail.0: {"log"=>"Traceback (most recent call last):\n File \"/app/main.py\", line 10, in <module>\n result = process_data(None)\n File \"/app/utils.py\", line 25, in process_data\n return data.strip()\nAttributeError: 'NoneType' object has no attribute 'strip'\n"} +``` + +{% endtab %} +{% endtabs %} + +### Ruby + +The `ruby` parser handles Ruby exception back traces. It detects patterns like `file.rb:line:in 'method'` and continuation lines starting with `from`. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: tail + path: /var/log/myapp/*.log + read_from_head: true + multiline.parser: ruby + + outputs: + - name: stdout + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Flush 1 + Log_Level info + +[INPUT] + Name tail + Path /var/log/myapp/*.log + Read_From_Head true + Multiline.Parser ruby + +[OUTPUT] + Name stdout + Match * +``` + +{% endtab %} +{% tab title="Input log" %} + +```text +app/models/user.rb:42:in `validate_email' + from app/models/user.rb:30:in `save' + from app/controllers/users_controller.rb:15:in `create' + from config/routes.rb:5:in `block in <main>' +``` + +{% endtab %} +{% tab title="Output" %} + +```text +[0] tail.0: {"log"=>"app/models/user.rb:42:in `validate_email'\n\tfrom app/models/user.rb:30:in `save'\n\tfrom app/controllers/users_controller.rb:15:in `create'\n\tfrom config/routes.rb:5:in `block in <main>'\n"} +``` + +{% endtab %} +{% endtabs %} \ No newline at end of file
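The concatenation rule the diff states for the `docker` parser (a `log` value ending in `\n` completes a record; anything else is a partial fragment) can be illustrated with a small Python sketch. This is not Fluent Bit's implementation: the function name `join_docker_lines` is hypothetical, and the sketch assumes fragments are joined without a separator and that the `stream` of the first fragment is kept. The sample lines are the ones from the Docker "Input log" tab.

```python
import json


def join_docker_lines(raw_lines):
    """Concatenate Docker JSON log fragments until one ends with a newline.

    Illustrative only: a hypothetical helper, not part of Fluent Bit.
    """
    records = []
    pending = None  # record currently being assembled from fragments
    for raw in raw_lines:
        entry = json.loads(raw)
        if pending is None:
            # Assumption: keep the stream of the first fragment.
            pending = {"log": "", "stream": entry["stream"]}
        pending["log"] += entry["log"]
        if entry["log"].endswith("\n"):
            # A trailing newline marks the record as complete.
            records.append(pending)
            pending = None
    if pending is not None:
        # Emit whatever is left, similar in spirit to a flush timeout.
        records.append(pending)
    return records


# Raw strings keep the JSON escape \n intact for json.loads to decode.
lines = [
    r'{"log":"This is a complete message\n","stream":"stdout","time":"2024-01-15T10:30:45.123456789Z"}',
    r'{"log":"This is a partial ","stream":"stdout","time":"2024-01-15T10:30:46.123456789Z"}',
    r'{"log":"message that spans ","stream":"stdout","time":"2024-01-15T10:30:46.123456790Z"}',
    r'{"log":"multiple lines\n","stream":"stdout","time":"2024-01-15T10:30:46.123456791Z"}',
]

for record in join_docker_lines(lines):
    print(record)
```

Run against the four sample lines, the sketch yields two records whose `log` values match the Docker "Output" tab, which is a quick way to sanity-check the documented example.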