Describe the bug
Our environment runs in a private Kubernetes cluster, and fluentd is used to monitor the logs. The fluentd configuration reads the log files generated by Kubernetes in /var/log/containers/. There are several microservices, and a different regex is applied to each one to extract the information into fields; the Kubernetes metadata is then added with the @type kubernetes_metadata filter. Finally, these logs are written to two ELKs and to a local file. In case of failure, there is a secondary output into which the failed records are inserted.
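The pipeline described above can be sketched roughly as follows (paths, tags, hostnames, and the per-service regex are placeholders, not our actual configuration):

```
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type regexp
    # placeholder pattern; each microservice uses its own regex
    expression /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<flags>[^ ]) (?<log>.*)$/
    time_format %Y-%m-%dT%H:%M:%S.%N%z
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

<match kubernetes.**>
  @type copy
  <store>
    @type elasticsearch
    host elk-primary.example.internal    # placeholder
    port 9200
    <secondary>
      @type secondary_file
      directory /var/log/fluentd/failed  # placeholder path for failed records
    </secondary>
  </store>
  <store>
    @type elasticsearch
    host elk-secondary.example.internal  # placeholder
    port 9200
  </store>
  <store>
    @type file
    path /var/log/fluentd/output         # placeholder path
  </store>
</match>
```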
One of the microservices has high traffic and generates a large volume of logs. Every so often, traces are observed to be lost both in the ELKs and in the files. It is suspected that this is related to the fast rotation of the log files: at some point a rotated file is finished and, instead of continuing with the next rotated file, fluentd continues with the current log file.
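As a possible mitigation (an assumption on our side, not a confirmed fix), the in_tail source exposes parameters that affect how rotation and inode reuse are handled, e.g. follow_inodes and rotate_wait:

```
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  # Track files by inode rather than by path; recommended for wildcard
  # paths with rotation (requires fluentd >= 1.12).
  follow_inodes true
  # Keep watching a rotated file for a while so trailing lines are read.
  rotate_wait 5
  # Re-scan the path pattern more often so new rotated files are found quickly.
  refresh_interval 10
  <parse>
    @type none
  </parse>
</source>
```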
To Reproduce
Fast rotation of the log files.
Expected behavior
Log processing keeps up with the rotation of the log files, so that this offset does not occur.
- If the offset does occur temporarily, all traces are still inserted, even if they are inserted late.
Your Environment
fluent/fluentd-kubernetes-daemonset:v1.16.2-debian-elasticsearch7-1.0 as the main image.
The following images have also been tested:
- v1.16.2-debian-elasticsearch7-1.0
- v1.16.2-debian-elasticsearch7-1.1
- v1.16.2-debian-elasticsearch8-1.0
- v1.16.2-debian-elasticsearch8-1.1
- v1.16.3-debian-elasticsearch8-1.0
- v1.16.3-debian-elasticsearch8-2.0
- v1.16.3-debian-elasticsearch8-2.1
- v1.16.5-debian-elasticsearch7-1.3
- v1.16.5-debian-elasticsearch8-1.3
- v1.17-debian-elasticsearch8-1
Tests performed, without success:
- Inserting only into the local file
- Inserting only into the ELKs
- Managing the log file and its rotation manually
- Modifying the fluentd configuration (source, and the buffer in the output)
Observations:
- When there is a large volume of logs, the insertion into ELK is observed to fall behind, until it stops writing and then resumes with logs from a few minutes later.
- The files rotate every few minutes (3-4 minutes).
- In one of the tests performed, the position file was found to contain two entries with the same file identifier: one of the newly generated files had been created with the same identifier as a file that had been deleted (but was still kept in the position file).
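For illustration, fluentd's pos_file stores one line per watched path, with the byte offset and the inode in hexadecimal (the paths and values below are made up):

```
/var/log/containers/service-a-abc123.log	000000000007a120	0000000000042a1f
/var/log/containers/service-b-def456.log	0000000000001f40	0000000000042a1f
```

Two different paths sharing the same inode value, as above, matches what was observed when an inode was reused for a new file while the old entry remained in the position file.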
@slopezxrd I've experienced a similar issue that you can read about in fluent/fluentd#4693. I have temporarily worked around it by increasing the maximum log file size on the kubelet (the default is 10MB); increasing the size reduces the frequency at which logs are rotated. I suspect the issue is related to inode tracking when logs are rotated quickly and often. I've noticed that when logs go missing there is usually a warning in the fluentd logs that says Skip update_watcher because watcher has been already updated by other inotify event. I've tried using different types of buffers (e.g. file, hybrid, and memory), increasing the memory and CPU, using multiple workers, etc. None of them seem to have an impact. At significantly high volumes (>1000 log lines per second per container), fluentd has trouble reading and tracking logs when they are rotated (I have had up to 10 of these containers running on one instance).
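The kubelet workaround mentioned above can be expressed in the KubeletConfiguration (the field names are from the upstream Kubernetes API; the values shown are illustrative):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate container logs when they reach this size (default "10Mi").
# Raising it reduces rotation frequency, which eases pressure on fluentd.
containerLogMaxSize: "100Mi"
# Number of rotated log files to keep per container (default 5).
containerLogMaxFiles: 5
```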