Set TA and OpAMP status version when upgrading #4378
Signed-off-by: Pavol Loffay <p.loffay@gmail.com>
swiatekm
left a comment
Is this right? It looks like you're unconditionally setting this version whenever we reconcile, not just during upgrades.
As far as I can see, there is no upgrade procedure for the TA. There is just a reconciliation, which upgrades the TA image (and other pieces). I cannot see why always setting the version would lead to issues.
Should we always set it to the default set at build time? I think we should only do that if that default wasn't changed via a flag, or the user hasn't used a custom image in the CR.
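For illustration, here is a minimal sketch of the two behaviours being compared in this thread. It assumes the earlier code only filled the status version when it was empty; `Status`, `setIfEmpty` and `setAlways` are hypothetical names, not the operator's types, and the version strings are only examples.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for the Status field of the TargetAllocator CR.
type Status struct {
	Version string
}

// setIfEmpty writes the version only when the status has none yet
// (assumed here to be the behaviour before this PR).
func setIfEmpty(s Status, buildVersion string) Status {
	if s.Version == "" {
		s.Version = buildVersion
	}
	return s
}

// setAlways overwrites the version on every reconcile.
func setAlways(s Status, buildVersion string) Status {
	s.Version = buildVersion
	return s
}

func main() {
	// A CR that was created by an older operator build.
	old := Status{Version: "0.136.0"}

	// After the operator is upgraded, the build-time version changes.
	fmt.Println(setIfEmpty(old, "0.137.0").Version) // keeps the stale value
	fmt.Println(setAlways(old, "0.137.0").Version)  // follows the new build
}
```

With set-if-empty the status never moves once it has been written, which matches the symptom described in the PR title. Setting it unconditionally fixes that, at the cost of ignoring custom images, which is the concern raised above.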
    changed := params.TargetAllocator.DeepCopy()
    changed.Status.Version = version.TargetAllocator()

    if changed.Status.Version == "" {
@swiatekm if a custom image is used, the operator sets the version from versions.txt
but after the upgrade the version was not updated
I think if a user uses a custom image they should make sure it is compatible with the version that the operator expects. The expectation is that an operator version works well with a specific component version.
Sure, but they could easily have their own build of it, with a different versioning scheme, that is nonetheless compatible. So I'd take the container image tag if the image isn't the default one. @jaronoff97 @frzifus WDYT?
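A rough sketch of the suggestion above (take the container image tag when the image is not the default one) could look like this. It is not the operator's code: `imageTag` and `statusVersion` are hypothetical helpers, and the tag parsing is deliberately simple (it drops digests and treats a colon before the last slash as a registry port).

```go
package main

import (
	"fmt"
	"strings"
)

// imageTag returns the tag of a container image reference, or "" if it has none.
func imageTag(image string) string {
	// Drop a digest suffix such as "@sha256:...".
	if i := strings.Index(image, "@"); i >= 0 {
		image = image[:i]
	}
	colon := strings.LastIndex(image, ":")
	slash := strings.LastIndex(image, "/")
	// No colon, or the colon belongs to a registry port ("host:5000/name").
	if colon < 0 || colon < slash {
		return ""
	}
	return image[colon+1:]
}

// statusVersion picks the version to report in the CR status:
// the build-time version for the default image, otherwise the custom image's tag.
func statusVersion(specImage, defaultImage, buildVersion string) string {
	if specImage == "" || specImage == defaultImage {
		return buildVersion
	}
	if tag := imageTag(specImage); tag != "" {
		return tag
	}
	// Untagged custom image: fall back to the build-time version.
	return buildVersion
}

func main() {
	// Hypothetical image names, used only for the example.
	def := "example.com/target-allocator:0.137.0"
	fmt.Println(statusVersion("", def, "0.137.0"))
	fmt.Println(statusVersion("registry.local:5000/my-ta:2024.1-custom", def, "0.137.0"))
}
```

The fallback for untagged or digest-only images is one possible choice; reporting an empty or "unknown" version would be equally defensible.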
By default, the OpenTelemetry Operator ensures consistent versioning between itself and the managed OpenTelemetryCollector resources. That is, if the OpenTelemetry Operator is based on version 0.40.0, it will create resources with an underlying OpenTelemetry Collector at version 0.40.0.
When a custom Spec.Image is used with an OpenTelemetryCollector resource, the OpenTelemetry Operator will not manage this versioning and upgrading. In this scenario, the best practice is for the OpenTelemetry Operator version to match the underlying core version. Given an OpenTelemetryCollector resource with a Spec.Image configured to a custom image based on the underlying OpenTelemetry Collector at version 0.40.0, it is recommended that the OpenTelemetry Operator is kept at version 0.40.0.
This PR upgrades the otelcollector to the latest version available for the opentelemetry-collector and opentelemetry-operator. It was automatically generated by the GitHub Actions workflow. The summary of the OSS changelog is below:

# Prometheusreceiver Changes

## v0.136.0 to v0.142.0

Generated on: 2026-01-11 07:06:49

---

### v0.142.0

- [**BREAKING**] `receiver/prometheus`: Promote the receiver.prometheusreceiver.RemoveStartTimeAdjustment feature gate to stable and remove in-receiver metric start time adjustment in favor of the metricstarttime processor, including disabling the created-metric feature gate. ([#44180](open-telemetry/opentelemetry-collector-contrib#44180))
  Previously, users could disable the RemoveStartTimeAdjustment feature gate to temporarily keep the legacy start time adjustment behavior in the Prometheus receiver. With this promotion to stable and bounded registration, that gate can no longer be disabled; the receiver will no longer set StartTime on metrics based on process_start_time_seconds, and users should migrate to the metricstarttime processor for equivalent functionality.
  This change also disables the receiver.prometheusreceiver.UseCreatedMetric feature gate, which previously used the `<metric>_created` series to derive start timestamps for counters, summaries, and histograms when scraping non-OpenMetrics protocols. However, this does not mean that the `_created` series is always ignored: when using the OpenMetrics 1.0 protocol, Prometheus itself continues to interpret the `_created` series as the start timestamp, so only the receiver-side handling for other scrape protocols has been removed.
- [**BREAKING**] `receiver/prometheus`: Native histogram scraping and ingestion is now controlled by the scrape configuration option `scrape_native_histograms`. ([#44861](open-telemetry/opentelemetry-collector-contrib#44861))
  The feature gate `receiver.prometheusreceiver.EnableNativeHistograms` is now stable and enabled by default. Native histograms scraped from Prometheus will automatically be converted to OpenTelemetry exponential histograms. To enable scraping of native histograms, you must configure `scrape_native_histograms: true` in your Prometheus scrape configuration (either globally or per-job). Additionally, the protobuf scrape protocol must be enabled by setting `scrape_protocols` to include `PrometheusProto`.
- [**BREAKING**] `receiver/prometheusremotewrite`: Updated to Remote Write 2.0 spec rc.4, requiring Prometheus 3.8.0 or later ([#44861](open-telemetry/opentelemetry-collector-contrib#44861))
  The upstream Prometheus library updated the Remote Write 2.0 protocol from rc.3 to rc.4 in prometheus/prometheus[#17411](open-telemetry/opentelemetry-collector-contrib#17411). This renamed `CreatedTimestamp` to `StartTimestamp` and moved it from the `TimeSeries` message to individual `Sample` and `Histogram` messages. This is a wire-protocol incompatibility, so Prometheus versions 3.7.x and earlier will no longer work correctly with this receiver. Please upgrade to Prometheus 3.8.0 or later.
- [**OTHER**] `receiver/prometheus`: Deprecate `use_start_time_metric` and `start_time_metric_regex` config in favor of the processor `metricstarttime` ([#44180](open-telemetry/opentelemetry-collector-contrib#44180))
- [**FEATURE**] `receiver/prometheusremotewrite`: Map.PutStr causes excessive memory allocations due to repeated slice expansions ([#44612](open-telemetry/opentelemetry-collector-contrib#44612))
- [**BUG FIX**] `receiver/prometheus`: Fix HTTP response body leak in target allocator when fetching scrape configs fails ([#44921](open-telemetry/opentelemetry-collector-contrib#44921))
  The getScrapeConfigsResponse function did not close resp.Body on error paths. If io.ReadAll or yaml.Unmarshal failed, the response body would leak, potentially causing HTTP connection exhaustion.
- [**BUG FIX**] `receiver/prometheus`: Fixes yaml marshaling of prometheus/common/config.Secret types ([#44445](open-telemetry/opentelemetry-collector-contrib#44445))

### v0.141.0

- [**FEATURE**] `receiver/prometheus`: Add feature gate for extra scrape metrics in Prometheus receiver ([#44181](open-telemetry/opentelemetry-collector-contrib#44181))
  deprecation of extra scrape metrics in Prometheus receiver will be removed eventually.
- [**FEATURE**] `receiver/prometheus`: Support JWT Profile for Authorization Grant (RFC 7523 3.1) ([#44381](open-telemetry/opentelemetry-collector-contrib#44381))

### v0.140.0

- [**BREAKING**] `receiver/prometheus`: The prometheus receiver no longer adjusts the start time of metrics by default. ([#43656](open-telemetry/opentelemetry-collector-contrib#43656))
  Disable the receiver.prometheusreceiver.RemoveStartTimeAdjustment feature gate to temporarily re-enable this functionality. Users that need this functionality should migrate to the metricstarttime processor, and use the true_reset strategy for equivalent behavior.
- [**FEATURE**] `receiver/prometheusremotewrite`: Skip emitting empty metrics. ([#44149](open-telemetry/opentelemetry-collector-contrib#44149))
- [**FEATURE**] `receiver/prometheusremotewrite`: prometheusremotewrite receiver now accepts metric type unspecified histograms. ([#41840](open-telemetry/opentelemetry-collector-contrib#41840))

### v0.139.0

- [**BUG FIX**] `receiver/prometheus`: Fix missing staleness tracking leading to missing no recorded value data points. ([#43893](open-telemetry/opentelemetry-collector-contrib#43893))
- [**BUG FIX**] `receiver/prometheusremotewrite`: Fixed a concurrency bug in the Prometheus remote write receiver where concurrent requests with identical job/instance labels would return empty responses after the first successful request. ([#42159](open-telemetry/opentelemetry-collector-contrib#42159))

### v0.138.0

- [**FEATURE**] `receiver/prometheus`: added NHCB (native histogram with custom buckets) to explicit histogram conversion ([#41131](open-telemetry/opentelemetry-collector-contrib#41131))

## Summary

| Category | Count |
|----------|-------|
| Breaking Changes | 4 |
| Features | 6 |
| Bug Fixes | 4 |
| Other Changes | 1 |
| **Total** | **15** |

# Target-allocator Changes

## v0.136.0 to v0.142.0

Generated on: 2026-01-11 07:07:05

---

### 0.142.0

- [**FEATURE**] `target allocator`: Add support for prometheus scrape classes ([#3600](open-telemetry/opentelemetry-operator#3600))
  Added support for configuring `scrapeClasses` when using the PrometheusCR-feature of the target allocator. The format of the `scrapeClasses` array is exactly the same as `spec.scrapeClasses` of the `Prometheus` CRD.
- [**BUG FIX**] `target allocator`: Fix CA certificate race condition with client cert renewals by extending its duration and renewal attempt. ([#4441](open-telemetry/opentelemetry-operator#4441))
  The CA certificate now has a 2-year duration (instead of the default 90 days) to prevent race conditions where client and server certificates could be signed by different CA versions during simultaneous renewal. This ensures the CA remains stable while dependent certificates renew regularly.

### 0.141.0

- [**FEATURE**] `target allocator`: make evaluation_interval configurable for Prometheus CR watcher ([#4520](open-telemetry/opentelemetry-operator#4520))

### 0.140.0

- [**BUG FIX**] `github action`: Remove unused VERSION and VERSION_DATE environment variables from publish workflows ([#4470](open-telemetry/opentelemetry-operator#4470))
  Removed the unused "Read version" step that set VERSION and VERSION_DATE environment variables in both publish-target-allocator.yaml and publish-operator-opamp-bridge.yaml workflows. These variables were never referenced anywhere in the workflows.

### 0.138.0

- [**BREAKING**] `target allocator`: Remove the operator.collector.targetallocatorcr feature flag ([#2422](open-telemetry/opentelemetry-operator#2422))
  This behavior has been enabled by default since version 0.127.0.
- [**BUG FIX**] `target allocator`: Add missing TA ownership watches to cert-manager Certificate and Issuer ([#4368](open-telemetry/opentelemetry-operator#4368))

### 0.137.0

- [**BREAKING**] `target allocator`: Promote the operator.collector.targetallocatorcr feature flag to Stable ([#2422](open-telemetry/opentelemetry-operator#2422))
  The flag can no longer be disabled. It will be completely removed in 0.138.0.
- [**BUG FIX**] `target allocator, opamp`: Fix version not being updated after version upgrade. ([#4378](open-telemetry/opentelemetry-operator#4378))
- [**BUG FIX**] `target-allocator`: Fixed potential duplicate scrape targets caused by Prometheus relabeling. ([#3617](open-telemetry/opentelemetry-operator#3617))

## Summary

| Category | Count |
|----------|-------|
| Breaking Changes | 2 |
| Features | 2 |
| Bug Fixes | 5 |
| Other Changes | 0 |
| **Total** | **9** |

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Rashmi Chandrashekar <rashmy@microsoft.com>