revert: revert collector version - CORE-352#3734
Merged
Conversation
blumamir
approved these changes
Nov 3, 2025
blumamir
pushed a commit
to blumamir/odigos
that referenced
this pull request
Nov 3, 2025
After upgrading the collector to v0.138.0, we started observing frequent OOM situations that did not occur before the version bump. We are continuing to investigate the root cause, but in the meantime, this change has been reverted.

- [ ] Added Unit Tests
- [ ] Updated e2e Tests
- [X] Manual Testing
- [X] Manual Load Test

- [ ] Changes how Odigos interacts with Kubernetes
- [ ] Introduces additional calls to the API Server (potential performance impact)
- [ ] New Query/feature supported in all the k8s versions supported by Odigos
- [ ] Modifies Odigos manifests (addressed in both CLI and Helm)
- [ ] Changes RBAC permissions

- [ ] Users need to take action before upgrading
- [ ] Automatic migration will modify existing objects (backward compatible)
- [ ] Changes UI, CLI, or K8s Manifests aspects in a way that users need to be aware of
- [ ] Documentation updated accordingly
blumamir
pushed a commit
to blumamir/odigos
that referenced
this pull request
Nov 3, 2025
## Description

After upgrading the collector to v0.138.0, we started observing frequent OOM situations that did not occur before the version bump. We are continuing to investigate the root cause, but in the meantime, this change has been reverted.

## How Has This Been Tested?

- [ ] Added Unit Tests
- [ ] Updated e2e Tests
- [X] Manual Testing
- [X] Manual Load Test

## Kubernetes Checklist

- [ ] Changes how Odigos interacts with Kubernetes
- [ ] Introduces additional calls to the API Server (potential performance impact)
- [ ] New Query/feature supported in all the k8s versions supported by Odigos
- [ ] Modifies Odigos manifests (addressed in both CLI and Helm)
- [ ] Changes RBAC permissions

## User Facing Changes

- [ ] Users need to take action before upgrading
- [ ] Automatic migration will modify existing objects (backward compatible)
- [ ] Changes UI, CLI, or K8s Manifests aspects in a way that users need to be aware of
- [ ] Documentation updated accordingly
13 tasks
damemi
added a commit
that referenced
this pull request
Feb 4, 2026
…or/otel to 141 + Remove deprecated components + Bump k8s min version to 1.21 (#4111)

The clickhouse exporter supports TLS settings similar to otlp: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/945e5a71ef31793ff3280b28c1425086ea5332b6/exporter/clickhouseexporter/README.md#tls

Some users need this to connect to clickhouse, so this adds them as options in the destination.

This adds:

* `insecure_skip_verify`
* `ca_file` (using the k8sconfig interface to mount the secret as a file, similar to how the GCP exporter supports application default credentials)

The direct string fields (such as CAPem, CertPem, KeyPem) aren't yet supported in the clickhouse exporter, so it has to be a mounted file. See open-telemetry/opentelemetry-collector-contrib#43911 (comment)

---

Doing this required bumping the collector/otel deps to 136, the version in which TLS config support was added to clickhouse. This required the following changes:

* This actually needs collector v0.136.0 for these settings from open-telemetry/opentelemetry-collector-contrib#42581 (open-telemetry/opentelemetry-collector-contrib@d9769f7)
* Also needs to remove the loki exporter (removed in 131) for 136 🙃 open-telemetry/opentelemetry-collector-contrib#41413, see https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.130.0/exporter/lokiexporter#deprecation-notice. It's replaced with just otlp. The only destination that actually looks like it's using the loki exporter is OpsVerse
* As well as the opencensus exporter, removed in 133 upstream by open-telemetry/opentelemetry-collector-contrib#42239
* Also the routing processor: open-telemetry/opentelemetry-collector-contrib#36616

See previous attempt #3669 (reverted in #3734)

---

Then, it turned out that 136 was bugged and did not have full support for TLS settings like `insecure_skip_verify`. This was fixed in 141, which required the following extra changes:

* Actually needs collector v141 due to this bug in clickhouse not handling all tls settings: open-telemetry/opentelemetry-collector-contrib#43911, fixed in open-telemetry/opentelemetry-collector-contrib#44093
* Remove deprecated carbon exporter support (unmaintained upstream): open-telemetry/opentelemetry-collector-contrib#44532
* Another upstream breaking change giving go mod trouble: open-telemetry/opentelemetry-collector#13948
* configgrpc update: open-telemetry/opentelemetry-collector#13996
* And now metadata.yaml metrics require stability levels: open-telemetry/opentelemetry-collector#13756

```
Error: failed loading /app/collector/receivers/odigosebpfreceiver/metadata.yaml: decoding failed due to the following error(s):
'telemetry.metrics[ebpf_memory_pressure_wait_time_total]' missing required field: `stability`
'telemetry.metrics[ebpf_total_bytes_read]' missing required field: `stability`
'telemetry.metrics[ebpf_lost_samples]' missing required field: `stability`
Error: failed loading /app/collector/receivers/odigosebpfreceiver/metadata.yaml: decoding failed due to the following error(s):
'telemetry.metrics[ebpf_memory_pressure_wait_time_total]' missing required field: `stability`
'telemetry.metrics[ebpf_total_bytes_read]' missing required field: `stability`
'telemetry.metrics[ebpf_lost_samples]' missing required field: `stability`
Error: metadata.yaml ordering check failed: [telemetry metrics] keys are not sorted: [odigos_log_data_size odigos_metric_data_size odigos_trace_data_size odigos_accepted_spans odigos_accepted_metric_points odigos_accepted_log_records]
Error: metadata.yaml ordering check failed: [telemetry metrics] keys are not sorted: [odigos_log_data_size odigos_metric_data_size odigos_trace_data_size odigos_accepted_spans odigos_accepted_metric_points odigos_accepted_log_records]
```

This bump also required adding the `endpointslices` permission to the odiglet service account for the data-collection collector

---

Finally, endpointslices was not GA in k8s 1.20. This PR bumps our minimum supported k8s version to 1.21.

Enterprise update in odigos-io/odigos-enterprise#2117
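For anyone hitting the same `keys are not sorted` failure quoted in the commit message, a minimal Go sketch of the ordering the check appears to want is shown here. It assumes the upstream check compares keys with plain lexicographic ordering, which is not confirmed from the upstream source; `sortedMetricKeys` is a hypothetical helper written for this illustration and is not part of Odigos or the collector tooling.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedMetricKeys is a hypothetical helper (not part of Odigos or mdatagen).
// It returns a lexicographically sorted copy of keys and reports whether the
// input was already in that order. Lexicographic ordering is an assumption
// about what the metadata.yaml ordering check expects.
func sortedMetricKeys(keys []string) ([]string, bool) {
	out := append([]string(nil), keys...) // copy so the caller's slice is untouched
	wasSorted := sort.StringsAreSorted(out)
	sort.Strings(out)
	return out, wasSorted
}

func main() {
	// Key order copied from the error message in the commit description.
	keys := []string{
		"odigos_log_data_size",
		"odigos_metric_data_size",
		"odigos_trace_data_size",
		"odigos_accepted_spans",
		"odigos_accepted_metric_points",
		"odigos_accepted_log_records",
	}

	sorted, ok := sortedMetricKeys(keys)
	fmt.Println("already sorted:", ok)
	for _, k := range sorted {
		fmt.Println(k)
	}
}
```

Under that assumption, the `odigos_accepted_*` metrics would need to move ahead of the `odigos_*_data_size` metrics in `metadata.yaml`.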
Description
After upgrading the collector to v0.138.0, we started observing frequent OOM situations that did not occur before the version bump. We are continuing to investigate the root cause, but in the meantime, this change has been reverted.
How Has This Been Tested?
Kubernetes Checklist
User Facing Changes