Description
Bug Report
Describe the bug
In its simplest form: using more than one space to separate the "Exclude" key from the record key and the regex value appears to prevent the filter from matching.
Specifically, using the filter:
[FILTER]
Name grep
Match test_input
With either of the following exclude lines:
Exclude    log    .*?\s+DEBUG\s+[\s\S]+
Exclude    log    /.*?\s+DEBUG\s+[\s\S]+/
The filter does not exclude a log line that it should match. However, simply removing the additional spaces between the "Exclude" key, the record key, and the regex:
Exclude log .*?\s+DEBUG\s+[\s\S]+
Exclude log /.*?\s+DEBUG\s+[\s\S]+/
Works as expected.
The documentation page shows more than one space used as a separator in multiple locations, which suggests that this is either a documentation issue or a bug.
To Reproduce
- Input File: testInput.log
- Example log message:
2024-01-01 13:14:15,161 DEBUG [NOTTHREAD:testing[SessionID:TESTING:20240101131415167:-1]] FLUENTBIT TEST 1
The combined version of the configuration file that was used to verify the issue (after removing all extra spaces between Keys and values):
[SERVICE]
Log_Level trace
Log_File /tmp/fluentbit_debug.log
[INPUT]
Name tail
Path /tmp/testInput.log
Path_Key filePath
Tag test_input
Read_from_Head true
Skip_Empty_Lines On
Skip_Long_Lines On
Buffer_Chunk_Size 256KB
Buffer_Max_Size 5MB
Mem_Buf_Limit 25MB
[FILTER]
Name multiline
Match test_input
multiline.key_content log
multiline.parser multiline_test
# Exclude Debug Level
[FILTER]
Name grep
Match test_input
# 1. Works...
Exclude log .*?\s+DEBUG\s+[\s\S]+
# 2. Does not work... Nothing Excluded
#Exclude log .*?\s+DEBUG\s+[\s\S]+
# 3. Works...
#Exclude log /.*?\s+DEBUG\s+[\s\S]+/
# 4. Does not work... Nothing Excluded
#Exclude log /.*?\s+DEBUG\s+[\s\S]+/
# Additional Tests quoting the value just in case...
# 5. Does not work... Nothing Excluded
#Exclude log ".*?\s+DEBUG\s+[\s\S]+"
# 6. Does not work... Nothing Excluded
#Exclude log ".*?\s+DEBUG\s+[\s\S]+"
# 7. Does not work... Nothing Excluded
#Exclude log "/.*?\s+DEBUG\s+[\s\S]+/"
# 8. Does not work... Nothing Excluded
#Exclude log "/.*?\s+DEBUG\s+[\s\S]+/"
# Parse Out Structured Data
[FILTER]
Name parser
Match test_input
key_name log
Parser structured_data_parser
Reserve_Data true
# Add Attributes
[FILTER]
Name record_modifier
Match test_input
Record logtype test
Record env QA
Record platform test_but_interact
Record purpose test
Record role none
[OUTPUT]
Name file
Match *
File /tmp/testOutput.log
With a parser configuration file defined as:
[MULTILINE_PARSER]
name multiline_test
type regex
flush_timeout 1000
# rules | state name | regex pattern | next state
# ------|---------------|---------------------------------------------|------------
rule "start_state" "/(^\s*(?:(?:[\d-]+) (?:[\d:,]+))[\s\S]*)/" "cont"
rule "cont" "/(^\s*(?!(?:[\d-]+) (?:[\d:,]+))[\s\S]*)/" "cont"
[PARSER]
Name structured_data_parser
Format regex
Regex /^\s*(?<logtimestamp>(?:[\d-]+) (?:[\d:,]+))\s+(?<level>\S+)\s+(?:\[(?<thread>[^\[\]]+)(?:\[SessionID:(?<sessionid>(?:N\/A|(?<customerNumber>[^:]+))[^\[\]]+)\])?\]\s+)?(?:(?<classname>(?:com|org|net)\.\S+)\s+-\s+)?(?<message>[\s\S]+)/m
Time_Key logtimestamp
# 2023-10-25 10:01:09,722
Time_Format %Y-%m-%d %H:%M:%S,%L
- Steps to reproduce the problem:
- Comment / Uncomment the appropriate Exclude line
- Run using testInput.log as the input.
- Failing versions of the Exclude line will result in all 20 lines ending up in the output. See: testOutput_fail.log
- Successful versions will result in only the two non-DEBUG lines in the output file. See: testOutput_success.log
Expected behavior
Based on the documentation, regardless of the number of spaces between the "Exclude" key, the record key, and the regex, any line containing "DEBUG" with at least one whitespace character on either side should be excluded.
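As a quick sanity check of the expected behavior, the pattern from the Exclude lines can be tried against the example log message outside of Fluent Bit. This is only a rough sketch: it uses Python's `re` module, whereas Fluent Bit uses the Onigmo regex engine, so it only illustrates what the pattern is meant to match. The `should_exclude` helper and the INFO line are made up for illustration and are not taken from testInput.log.

```python
import re

# Pattern copied from the Exclude lines in this report.
EXCLUDE_PATTERN = re.compile(r".*?\s+DEBUG\s+[\s\S]+")


def should_exclude(log_line: str) -> bool:
    """Return True if the grep filter would be expected to drop this record."""
    # An unanchored search is an assumption about how the filter applies the regex.
    return EXCLUDE_PATTERN.search(log_line) is not None


# The example log message from the "To Reproduce" section.
debug_line = (
    "2024-01-01 13:14:15,161 DEBUG "
    "[NOTTHREAD:testing[SessionID:TESTING:20240101131415167:-1]] FLUENTBIT TEST 1"
)
# Hypothetical non-DEBUG line, invented for this sketch.
info_line = "2024-01-01 13:14:16,200 INFO [NOTTHREAD:testing] FLUENTBIT TEST 2"

print(should_exclude(debug_line))  # True: the DEBUG line should be excluded
print(should_exclude(info_line))   # False: the INFO line should be kept
```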
Screenshots
N/A
Your Environment
- Version used: 2.0.8
- Configuration: See Above
- Environment name and version (e.g. Kubernetes? What version?): New Relic Infrastructure Agent - 1.47.2
- Operating System and version: Linux - RHEL 7
- Filters and plugins: Grep Filter Plugin
Additional context
To make a very long story short, this issue was discovered while getting multiline parsing working with the New Relic Infrastructure Agent. This is my first time working with Fluent Bit in any form, so I assumed it was my fault until I discovered that removing the additional spaces fixed the issue. Again, I'm not sure whether this is just a documentation issue (because whitespace in the regex has to stay significant) or a bug (because the additional separator spaces should be trimmed). I'm hoping this gets fixed and/or this report helps others with the same issue.
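To illustrate the two possibilities, here is a minimal sketch of how the extra separator spaces could end up inside the regex. This is purely a guess at the mechanism and is not Fluent Bit's actual parsing code: `split_naive` and `split_trimmed` are hypothetical helpers, and Python's `re` stands in for the Onigmo engine that Fluent Bit uses.

```python
import re


def split_naive(value: str):
    """Split on the first single space only; extra spaces stay in the pattern."""
    key, pattern = value.split(" ", 1)
    return key, pattern


def split_trimmed(value: str):
    """Split on the first run of whitespace; separator spaces are discarded."""
    key, pattern = value.split(None, 1)
    return key, pattern


# Value of an Exclude entry written with several spaces between key and regex.
value = r"log    .*?\s+DEBUG\s+[\s\S]+"

line = (
    "2024-01-01 13:14:15,161 DEBUG "
    "[NOTTHREAD:testing[SessionID:TESTING:20240101131415167:-1]] FLUENTBIT TEST 1"
)

_, naive_pattern = split_naive(value)
_, trimmed_pattern = split_trimmed(value)

# The naive pattern starts with literal spaces, so it needs consecutive
# spaces somewhere in the log line, which the example message does not have.
print(repr(naive_pattern))
print(re.search(naive_pattern, line) is not None)    # no match: nothing excluded
print(re.search(trimmed_pattern, line) is not None)  # match: line excluded
```

If something along these lines is what happens, trimming the separator before compiling the regex would make both spellings behave the same without affecting whitespace inside the pattern itself.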