Log that files are too small to ingest at warn level #44751
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
leehinman left a comment
Would it be possible to add the input id to the log message (maybe it is a field added to the logger?). If we had that, the user would know which input needs to be changed.
Right now, you have to switch to debug mode, and then you just get the file name, which means you have to work backwards from the globs to determine which input needs to be modified.
Thanks Lee, that was a very good point. I added it in 76ccd18, but I don't like having to use the global logger in so many tests 😞. All the "testing loggers" I found in I'll put this PR in draft and write a noop logger for
leehinman left a comment
LGTM
One optional suggestion.
@Mergifyio backport 8.17 8.18 8.19 9.0
✅ Backports have been created
(cherry picked from commit b91d891) # Conflicts: # NOTICE.txt # go.mod # go.sum
(cherry picked from commit b91d891) # Conflicts: # NOTICE.txt # filebeat/input/filestream/prospector_creator.go # go.mod # go.sum
(cherry picked from commit b91d891)
(cherry picked from commit b91d891) # Conflicts: # NOTICE.txt # go.mod # go.sum
… level (#44811) (cherry picked from commit b91d891) # Conflicts: # NOTICE.txt # go.mod # go.sum --------- Co-authored-by: Tiago Queiroz <[email protected]>
(cherry picked from commit b91d891) Co-authored-by: Tiago Queiroz <[email protected]>
…n level (#44808) # Conflicts: # NOTICE.txt # filebeat/input/filestream/prospector_creator.go # go.mod # go.sum --------- Co-authored-by: Tiago Queiroz <[email protected]>
…n level (#44809) # Conflicts: # NOTICE.txt # go.mod # go.sum --------- Co-authored-by: Tiago Queiroz <[email protected]>
Proposed commit message
Filestream now logs one line at warn level per scan with the number of files that are too small to be ingested.
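The behaviour described in the commit message can be illustrated with a minimal Go sketch. This is not the Filebeat implementation: `summarizeTooSmall` and `minFingerprintSize` are hypothetical names, the 1024-byte threshold is taken from the log output in this PR, and messages are returned as strings rather than written through a real logger. It only shows the idea of one debug message per skipped file plus a single warn-level summary per scan.

```go
package main

import "fmt"

// minFingerprintSize is an assumed threshold, taken from the 1024 bytes
// mentioned in the log output of this PR.
const minFingerprintSize = 1024

// summarizeTooSmall is a hypothetical helper, not the Filebeat code.
// It returns one debug message per file smaller than minSize and a single
// warn message with the total count (empty when no file is too small).
func summarizeTooSmall(sizes map[string]int64, minSize int64) (debugMsgs []string, warnMsg string) {
	for name, size := range sizes {
		if size < minSize {
			debugMsgs = append(debugMsgs,
				fmt.Sprintf("file %q is %d bytes, expected at least %d bytes", name, size, minSize))
		}
	}
	if n := len(debugMsgs); n > 0 {
		warnMsg = fmt.Sprintf("%d files are too small to be ingested", n)
	}
	return debugMsgs, warnMsg
}

func main() {
	// One simulated scan: three small files and one large enough to ingest.
	sizes := map[string]int64{
		"/tmp/small-1.log": 4,
		"/tmp/small-2.log": 4,
		"/tmp/small-3.log": 8,
		"/tmp/big.log":     2048,
	}
	debugMsgs, warnMsg := summarizeTooSmall(sizes, minFingerprintSize)
	for _, m := range debugMsgs {
		fmt.Println("debug:", m)
	}
	if warnMsg != "" {
		fmt.Println("warn:", warnMsg)
	}
}
```

Keeping the per-file detail at debug level and the count at warn level means a user running with default logging still sees that files are being skipped, without one warning per file on every scan.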
Checklist
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
## Disruptive User Impact
## Author's Checklist
How to test this PR locally
Create a few small files
Build & start Filebeat with the following configuration:
Look for the following logs
{ "log.level": "debug", "@timestamp": "2025-06-12T13:41:17.034-0400", "log.logger": "scanner", "log.origin": { "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*fileScanner).GetFiles", "file.name": "filestream/fswatch.go", "file.line": 398 }, "message": "cannot start ingesting from file \"/tmp/small-1.log\": filesize of \"/tmp/small-1.log\" is 4 bytes, expected at least 1024 bytes for fingerprinting: file size is too small for ingestion", "service.name": "filebeat", "filestream_id": "log-small-files-cannot-be-ingested", "ecs.version": "1.6.0" }
{ "log.level": "debug", "@timestamp": "2025-06-12T13:41:17.034-0400", "log.logger": "scanner", "log.origin": { "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*fileScanner).GetFiles", "file.name": "filestream/fswatch.go", "file.line": 398 }, "message": "cannot start ingesting from file \"/tmp/small-2.log\": filesize of \"/tmp/small-2.log\" is 4 bytes, expected at least 1024 bytes for fingerprinting: file size is too small for ingestion", "service.name": "filebeat", "filestream_id": "log-small-files-cannot-be-ingested", "ecs.version": "1.6.0" }
{ "log.level": "debug", "@timestamp": "2025-06-12T13:41:17.034-0400", "log.logger": "scanner", "log.origin": { "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*fileScanner).GetFiles", "file.name": "filestream/fswatch.go", "file.line": 398 }, "message": "cannot start ingesting from file \"/tmp/small-3.log\": filesize of \"/tmp/small-3.log\" is 8 bytes, expected at least 1024 bytes for fingerprinting: file size is too small for ingestion", "service.name": "filebeat", "filestream_id": "log-small-files-cannot-be-ingested", "ecs.version": "1.6.0" }
{ "log.level": "warn", "@timestamp": "2025-06-12T13:41:17.034-0400", "log.logger": "scanner", "log.origin": { "function": "github.com/elastic/beats/v7/filebeat/input/filestream.(*fileScanner).GetFiles", "file.name": "filestream/fswatch.go", "file.line": 421 }, "message": "3 files are too small to be ingested, files need to be at least 1024 in size for ingestion to start. To change this behaviour set 'prospector.scanner.fingerprint.length' and 'prospector.scanner.fingerprint.offset'. Enable debug logging to see all file names.", "service.name": "filebeat", "filestream_id": "log-small-files-cannot-be-ingested", "ecs.version": "1.6.0" }
They will repeat every scan of the file system (default is 10s).
Related issues
## Use cases
## Screenshots
## Logs