Cleanup states from registrar when the files are removed #41747
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
Lint failed on a previously existing issue, unrelated to the change.
Pushed a commit to fix the previously existing linting failure.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
@belimawr could you please have a look here?
This PR doesn't clean the backlog of existing registry entries, but it does prevent them from piling up, as they are now removed when the input is closed.
I'll take a look at it today.
This pull request is now in conflicts. Could you fix it? 🙏
@rsafonseca, thanks for the PR. The log input is deprecated. Its replacement is the Filestream input. Have you tried it? Is there any reason why it wouldn't work for you?
Hi @belimawr, I'm actually using the container input, which is not deprecated. It is the default and recommended input for containers, and it is a wrapper around the log input (see beats/filebeat/input/container/input.go, line 74 in 24d7cf0).
The container input does have all the clean_* options listed in the docs, but they don't work because they are not all implemented in the log input: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-container.html Oddly, it does have a config property for stream, but since it's using the log input that property is probably doing nothing, so maybe someone just forgot to finish this last bit? 😹 So either we merge this to fix the problem with the log input, regardless of it being deprecated, or we get the container input actually migrated to the Filestream input.
Thanks for the quick reply, @rsafonseca! Indeed, the container input being built on the log input has been a challenge for us. We're trying to migrate it to Filestream, but ensuring safe and correct state migration in a highly dynamic environment like Kubernetes is difficult. What we've been recommending is for users to migrate to the Filestream input with the container parser and fingerprint as the file identity. If you're using the input directly, you can configure it like this:

filebeat.inputs:
  - type: filestream
    id: kubernetes-container-logs
    paths:
      - /var/log/containers/*.log
    parsers:
      - container: ~
    prospector:
      scanner:
        fingerprint.enabled: true
        symlinks: true
    file_identity.fingerprint: ~
    processors:
      - add_kubernetes_metadata:
          host: '${NODE_NAME}'
          matchers:
            - logs_path:
                logs_path: /var/log/containers/

Or, if you want to use autodiscover:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: '${NODE_NAME}'
      hints.enabled: true
      hints.default_config:
        type: filestream
        id: >-
          kubernetes-container-logs-${data.kubernetes.pod.name}-${data.kubernetes.container.id}
        paths:
          - '/var/log/containers/*-${data.kubernetes.container.id}.log'
        parsers:
          - container: ~
        prospector:
          scanner:
            fingerprint.enabled: true
            symlinks: true
        file_identity.fingerprint: ~

Those two examples are from our default Kubernetes manifest. The only downside of this migration is that at first there will be some duplication, because the Filestream input will ingest any already-existing file from the beginning. However, Kubernetes is a highly dynamic environment and container log files usually rotate quickly, so the amount of duplicated data is usually small. You can also take a look at [...]. Would that work for you?
Proposed commit message
Clean up states from the registrar when the files are removed and no longer related to any input.
Fixes unbounded registry file growth and increased memory usage on long-running machines with a lot of input churn.
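The cleanup described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `State` struct, its `Finished` field, and the `cleanupRemoved` helper are hypothetical stand-ins for the registrar's real types, and the rule "drop a state once no harvester holds it and its file is gone" is my reading of the description.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// State is a hypothetical, simplified registry entry. The real Filebeat
// state carries more fields (offset, TTL, file identity, and so on).
type State struct {
	Source   string // path of the harvested file
	Finished bool   // true once no harvester is reading the file
}

// fileExists reports false only when the path is confirmed missing, so
// transient errors (for example permission problems) never trigger cleanup.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return !errors.Is(err, os.ErrNotExist)
}

// cleanupRemoved returns the states to keep and the number dropped.
// A state is dropped only if it is finished and its file no longer exists.
func cleanupRemoved(states []State, exists func(string) bool) ([]State, int) {
	kept := make([]State, 0, len(states))
	removed := 0
	for _, s := range states {
		if s.Finished && !exists(s.Source) {
			removed++
			continue
		}
		kept = append(kept, s)
	}
	return kept, removed
}

func main() {
	states := []State{
		{Source: "/nonexistent/pod-a.log", Finished: true}, // file already deleted
		{Source: os.Args[0], Finished: true},               // file still on disk
	}
	kept, removed := cleanupRemoved(states, fileExists)
	fmt.Printf("kept=%d removed=%d\n", len(kept), removed)
}
```

Passing the existence check in as a function keeps the pruning logic testable without touching the filesystem.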
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc
Related issues
Use cases
Ensures the registry file doesn't keep growing forever, which causes additional disk space usage as well as increased memory and CPU usage. The fact that the registry never gets cleaned up in a number of scenarios, such as running on a Kubernetes cluster, constitutes a slow memory leak: memory usage increases over time on long-running nodes, especially when there is a lot of pod/input churn.
Screenshots