New component: Kubernetes api logs receiver #24641
Comments
Hi @yotamloe, this sounds interesting but I am wondering if you have tried running the collector as a sidecar with your application container? Are there any specific challenges with the collector when run in sidecar mode other than the additional config a developer might need to add to the pod specs? |
Question: instead of specifying a reload interval, can you implement this using a watcher? |
@jinja2 Thank you for reading the proposal; this is great feedback 😁 I didn't find any receiver that is capable of collecting pod logs without reaching the log files in the underlying node (
|
@jpkrohling Yes, I think it's possible. |
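For illustration, the watcher idea could look roughly like the following. This is a minimal Python sketch, assuming simplified event dicts shaped like Kubernetes watch events (a `type` plus an `object` carrying `metadata` and `status.phase`). The names `apply_watch_event`, `pod_event` and `tracked` are hypothetical, and no real client library is involved.

```python
def apply_watch_event(tracked, event):
    """Update the set of (namespace, pod) keys whose logs should be followed."""
    if event["type"] not in ("ADDED", "MODIFIED", "DELETED"):
        # BOOKMARK and ERROR events are ignored in this sketch.
        return tracked
    meta = event["object"]["metadata"]
    key = (meta["namespace"], meta["name"])
    phase = event["object"].get("status", {}).get("phase")
    if event["type"] != "DELETED" and phase == "Running":
        tracked.add(key)       # start (or keep) following this pod's logs
    else:
        tracked.discard(key)   # pod is gone or not running: stop following
    return tracked


def pod_event(kind, namespace, name, phase):
    """Hypothetical helper that builds a simplified watch event for the demo."""
    return {
        "type": kind,
        "object": {
            "metadata": {"namespace": namespace, "name": name},
            "status": {"phase": phase},
        },
    }


tracked = set()
for ev in [
    pod_event("ADDED", "dev", "server-1", "Pending"),
    pod_event("MODIFIED", "dev", "server-1", "Running"),
    pod_event("ADDED", "dev", "server-2", "Running"),
    pod_event("DELETED", "dev", "server-2", "Running"),
]:
    apply_watch_event(tracked, ev)

print(sorted(tracked))
```

Compared with a reload interval, the tracked set changes only when an event arrives, so there is no periodic re-listing of pods.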
@yotamloe kube-apiserver can fall over and become unresponsive when the cluster is too large and too many requests are sent to it. I think we could add an option (in daemonset mode) to send the request to the kubelet /pods endpoint instead of kube-apiserver to retrieve the logs where possible. Since the kubelet runs locally on each node, the request would be answered faster and each node would only receive one request at a time. This would free kube-apiserver capacity for other requests, so the kube-apiserver bottleneck should be avoided in large clusters. |
Thanks, @JaredTan95. This is great feedback. Your proposal sounds interesting and could indeed reduce the pressure on the kube-apiserver, especially in larger clusters. My main concern is that, since we are dealing with serverless k8s environments, some vendors do not support daemonsets. For example, this is stated in the EKS Fargate docs:
And since the kubelet runs locally on the Fargate nodes, it could be hard to ensure we can communicate with all of the kubelets without using a daemonset (or a sidecar for all containers). I would be happy to hear your thoughts. |
@yotamloe You can use the filelog receiver, but the app needs to write to a file. The general practice is to set up an emptyDir volume mounted by both the app and the otel container. This might require application changes to get it to log to a file, handle rotation, etc. To reduce the config overhead, I recommend looking into auto-injecting the otel-collector sidecar with the OpenTelemetry Operator, if the distro allows installing mutating webhooks. Here's a simple example you can build on.
Example application pod spec which uses an annotation to indicate which collector config to inject.
Re: the proposed receiver, imho it might not be a sustainable solution for clusters running at any real production scale. But looks like a good addition for smaller development clusters. A few things to consider for the receiver -
|
I believe there's still a need to at least explore having a component that grabs logs from the API server, as it would remove the requirement to run the collector as a daemonset. Requiring daemonsets is a no-go in some setups and is especially problematic for multi-tenant settings. |
I would be curious if the k8sobjectsreceiver could be made to do it. |
@TylerHelmuth Are you asking if the object watching feature of the k8sobject receiver can be leveraged here? This receiver will list + watch for k8s pods but will use a different api to get/follow the logs. kubelet will be reading the logs from files local to the nodes and won't involve etcd outside of tracking the pods for which to collect the logs. |
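To make the "different api" concrete: pod logs are served by the Kubernetes "read pod log" endpoint, `GET /api/v1/namespaces/{namespace}/pods/{name}/log`, with query options such as `container`, `follow`, `timestamps` and `sinceSeconds`. The sketch below only builds that request path with the Python standard library. The function name `pod_log_path` and its defaults are hypothetical, and authentication and streaming are left out.

```python
from urllib.parse import quote, urlencode


def pod_log_path(namespace, pod, container=None, follow=True, since_seconds=None):
    """Build the request path for the Kubernetes 'read pod log' endpoint."""
    params = {"timestamps": "true"}  # ask the API to prefix each line with a timestamp
    if follow:
        params["follow"] = "true"    # keep the connection open and stream new lines
    if container:
        params["container"] = container
    if since_seconds is not None:
        params["sinceSeconds"] = str(since_seconds)
    base = (
        f"/api/v1/namespaces/{quote(namespace, safe='')}"
        f"/pods/{quote(pod, safe='')}/log"
    )
    return f"{base}?{urlencode(params)}"


print(pod_log_path("default", "server-abc", container="app", since_seconds=60))
# /api/v1/namespaces/default/pods/server-abc/log?timestamps=true&follow=true&container=app&sinceSeconds=60
```

A receiver built this way would issue one such request per container it follows, which is where the load concern raised earlier in the thread comes from.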
Thanks, everybody, for all of your feedback; it's super helpful! I understand that there is interest in the community to explore a component that collects logs from the Kubernetes API server.
Draft configuration (any suggestions from the community are welcome):

    receivers:
      kube_api_logs:
        namespaces:
          - default
          - dev
          - prod
        filters:
          - container_names:
              - server-*
          - pod_labels:
              app: my-app
          - pod_names:
              - server-*
        operators:
          - type: json_parser
            timestamp:
              parse_from: attributes.time
              layout: '%Y-%m-%d %H:%M:%S'
        daemonset_mode: true

I will start working on a PR for it (if there are no objections), any help from the community will be amazing! |
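The draft does not say how the `filters` entries combine, so the following is only one possible reading. It is a Python sketch that assumes a container is collected when its namespace is listed and at least one filter entry matches, with `server-*` treated as a shell-style glob. The names `CONFIG`, `filter_matches` and `should_collect` are hypothetical.

```python
from fnmatch import fnmatchcase

# Mirrors the draft configuration above as plain Python data.
CONFIG = {
    "namespaces": ["default", "dev", "prod"],
    "filters": [
        {"container_names": ["server-*"]},
        {"pod_labels": {"app": "my-app"}},
        {"pod_names": ["server-*"]},
    ],
}


def filter_matches(flt, pod_name, container_name, labels):
    """Return True if a single filter entry matches (assumed semantics)."""
    if "container_names" in flt:
        return any(fnmatchcase(container_name, p) for p in flt["container_names"])
    if "pod_names" in flt:
        return any(fnmatchcase(pod_name, p) for p in flt["pod_names"])
    if "pod_labels" in flt:
        # Every listed label must be present with the same value.
        return all(labels.get(k) == v for k, v in flt["pod_labels"].items())
    return False


def should_collect(cfg, namespace, pod_name, container_name, labels):
    """Decide whether logs of this container should be collected."""
    if namespace not in cfg["namespaces"]:
        return False
    # Assumption: entries under `filters` are OR-ed together.
    return any(
        filter_matches(f, pod_name, container_name, labels) for f in cfg["filters"]
    )


print(should_collect(CONFIG, "dev", "server-1", "sidecar", {}))
print(should_collect(CONFIG, "kube-system", "server-1", "server-x", {"app": "my-app"}))
```

If the intended semantics are AND rather than OR, replacing `any` with `all` in `should_collect` is the only change needed.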
There is another initiative to introduce a receiver with similar functionality #24439. I believe we should consolidate the efforts and have one receiver for both use cases |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping |
This issue has been closed as inactive because it has been stale for 120 days with no activity. |
The purpose and use-cases of the new component
The problem:
Logging is a challenge in serverless Kubernetes frameworks (like AWS EKS Fargate, AKS virtual nodes, GCP autopilot, etc...) due to the lack of direct access to the underlying nodes. This limitation means that traditional log collection methods, which often rely on reading log files directly from the nodes, are ineffective.
Example use cases:
Elasticsearch, Firehose, Kinesis Firehose, CloudWatch, and CloudWatch Logs. If the developer's backend system doesn't align with any of these outputs, they face challenges in log collection.
Describe the solution you'd like:
Proposing a new receiver to gather logs directly from the Kubernetes API, bypassing the need for node-level access. This receiver will help developers working with serverless Kubernetes environments who need access to workload logs for troubleshooting and monitoring.
Describe alternatives you've considered:
Today some vendors have specific solutions, like the EKS Fargate Fluent Bit log router, but they are limited and do not support a wide range of backends out of the box. AKS virtual nodes and GCP autopilot do not have a simple solution for log forwarding. I think the goal of the new component is to make serverless Kubernetes log collection vendor-agnostic.
Example configuration for the component
Draft configuration (any suggestions from the community are welcome):
The suggested draft supports filtering logs according to namespaces, pod labels and pod names, and also supports stanza operators for log parsing.
Telemetry data types supported
logs
Code Owner(s)
No response
Sponsor (optional)
No response
Additional context
I'm curious if any other community members face these challenges and what solutions they use to overcome them.
I will be happy for any feedback on this suggestion.
I would be happy to contribute to this feature and open a PR for it.