Skip to content

Dependency: Update to Prometheus 0.300.*#36873

Merged
songy23 merged 20 commits into
open-telemetry:mainfrom
ArthurSens:update-to-prom3
Feb 13, 2025
Merged

Dependency: Update to Prometheus 0.300.*#36873
songy23 merged 20 commits into
open-telemetry:mainfrom
ArthurSens:update-to-prom3

Conversation

@ArthurSens
Copy link
Copy Markdown
Member

@ArthurSens ArthurSens commented Dec 17, 2024

Description

Supersedes #36642

This PR updates the prometheus/prometheus library in our go.mods to 0.300.* (which represents Prometheus 3.0).

It touches many go.mod files, but the Prometheus Receiver is the only component heavily affected by breaking changes.

@dashpole
Copy link
Copy Markdown
Contributor

Some questions, but lets get this merged and fix them afterwards

@dashpole
Copy link
Copy Markdown
Contributor

Can you re-open the compliance PR?

@ArthurSens
Copy link
Copy Markdown
Member Author

Here you go prometheus/compliance#143

@dashpole dashpole added the ready to merge Code review completed; ready to merge by maintainers label Feb 13, 2025
@songy23 songy23 merged commit af6b17e into open-telemetry:main Feb 13, 2025
@github-actions github-actions Bot added this to the next release milestone Feb 13, 2025
@ArthurSens
Copy link
Copy Markdown
Member Author

That was some work, thank you everyone who helped here ❤️

@basti1302
Copy link
Copy Markdown
Contributor

@ArthurSens I think this breaks existing configs.

Note: I'm building a custom collector via https://github.com/open-telemetry/opentelemetry-collector/tree/main/cmd/builder, so YMMV. But as far as I can tell, this will also affect users that do not build their own image.

After updating to github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prometheusreceiver v0.120.0 and starting the collector (on K8s, as a daemonset) with a previously working config, the collector pods go into a CrashLoopBackOff, with this error in the logs:

Error: invalid configuration: receivers::prometheus::config::scrapeconfigs::0::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
receivers::prometheus::config::scrapeconfigs::1::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
2025/02/18 16:02:18 collector server run finished with error: invalid configuration: receivers::prometheus::config::scrapeconfigs::0::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
receivers::prometheus::config::scrapeconfigs::1::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
stream closed EOF for dash0-system/dash0-operator-opentelemetry-collector-agent-daemonset-jwpwm (opentelemetry-collector)

Setting fallback_scrape_protocol: "PrometheusText1.0.0" on all scrape jobs solves the issue. However, it seems the intention of this PR was to update Prometheus as a non-breaking change? It appears addDefaultFallbackScrapeProtocol is not effective? My guess is that the validation kicks in before the factory has a chance to set the default value.

@ArthurSens
Copy link
Copy Markdown
Member Author

@ArthurSens I think this breaks existing configs.

Note: I'm building a custom collector via https://github.com/open-telemetry/opentelemetry-collector/tree/main/cmd/builder, so YMMV. But as far as I can tell, this will also affect users that do not build their own image.

After updating to github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prometheusreceiver v0.120.0 and starting the collector (on K8s, as a daemonset) with a previously working config, the collector pods go into a CrashLoopBackOff, with this error in the logs:

Error: invalid configuration: receivers::prometheus::config::scrapeconfigs::0::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
receivers::prometheus::config::scrapeconfigs::1::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
2025/02/18 16:02:18 collector server run finished with error: invalid configuration: receivers::prometheus::config::scrapeconfigs::0::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
receivers::prometheus::config::scrapeconfigs::1::scrapefallbackprotocol: unknown scrape protocol , supported: [OpenMetricsText0.0.1 OpenMetricsText1.0.0 PrometheusProto PrometheusText0.0.4 PrometheusText1.0.0]
stream closed EOF for dash0-system/dash0-operator-opentelemetry-collector-agent-daemonset-jwpwm (opentelemetry-collector)

Setting fallback_scrape_protocol: "PrometheusText1.0.0" on all scrape jobs solves the issue. However, it seems the intention of this PR was to update Prometheus as a non-breaking change? It appears addDefaultFallbackScrapeProtocol is not effective? My guess is that the validation kicks in before the factory has a chance to set the default value.

You're correct! We're trying to fix the problem here: #38018

basti1302 added a commit to dash0hq/dash0-operator that referenced this pull request Feb 18, 2025
basti1302 added a commit to dash0hq/dash0-operator that referenced this pull request Feb 18, 2025
@basti1302
Copy link
Copy Markdown
Contributor

You're correct! We're trying to fix the problem here: #38018

Ah, there is already an issue for it. Sorry about the ping then. I searched the repo for existing issues, but I searched for scrapefallbackprotocol and unknown scrape protocol, that is, parts of the error message and didn't find that existing issue. Thanks for the link to the issue 👍

jpkrohling pushed a commit that referenced this pull request Feb 19, 2025
…dation (#38018)

#### Description
During the release process, we noticed a breaking change in the
Prometheus receiver caused by
#36873.

In that PR, I tried adding a fallback scrape protocol by default
everytime the PrometheusReceiver was built, but it turned out that
config validation happened even before the component was created. And
the collector fails startup with invalid configuration

This PR moves the addition of scrape protocol to the validation step.

<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes #37902

<!--Describe what testing was performed and which tests were added.-->
#### Testing

<!--Describe the documentation added.-->
#### Documentation

<!--Please delete paragraphs that you did not use before submitting.-->
ArthurSens added a commit to ArthurSens/opentelemetry-collector-contrib that referenced this pull request Feb 20, 2025
…dation (open-telemetry#38018)

During the release process, we noticed a breaking change in the
Prometheus receiver caused by
open-telemetry#36873.

In that PR, I tried adding a fallback scrape protocol by default
everytime the PrometheusReceiver was built, but it turned out that
config validation happened even before the component was created. And
the collector fails startup with invalid configuration

This PR moves the addition of scrape protocol to the validation step.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
Fixes open-telemetry#37902

<!--Describe what testing was performed and which tests were added.-->

<!--Describe the documentation added.-->

<!--Please delete paragraphs that you did not use before submitting.-->
songy23 added a commit that referenced this pull request Feb 21, 2025
#### Description

#36873
the prometheus receiver can now keep dots in metric names rather than
converting them to underscores. E.g. say there is a metric `my.metric`
scraped from prometheus receiver, its name is `my_metric` before 0.120.0
vs. `my.metric` now. This should have broken some datadog integration
tests, but those are skipped in race detector (which is always on in
CIs) so the failures did not show up in CIs.
songy23 added a commit that referenced this pull request Feb 21, 2025
#### Description
Add a changelog amending #36873. There are indeed several breaking
changes associated with the 3.0 version update:
https://prometheus.io/docs/prometheus/latest/migration/

#### Link to tracking issue
related to
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/38097
cmacknz added a commit to cmacknz/opentelemetry-collector-contrib that referenced this pull request Mar 13, 2025
cmacknz added a commit to elastic/opentelemetry-collector-contrib that referenced this pull request Mar 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants