Skip to content

[exporter/prometheusremotewrite] support disabling target_info metric #12724

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

hardproblems
Copy link
Contributor

Description:
Add the ability to turn off the target_info metric to address duplicate series / out of order samples issue below

Link to tracking Issue:
#12300

Testing:

  • added unit test in pkg/translator/prometheusremotewrite/helper.go

Documentation:

  • added a comment in config.go
  • added explanation for the param in README

@dashpole @Aneurysm9 @bogdandrutu @gouthamve @jmacd @jsuereth

@hardproblems hardproblems requested a review from a team July 25, 2022 23:46
@hardproblems hardproblems requested a review from Aneurysm9 as a code owner July 25, 2022 23:46
@jmacd
Copy link
Contributor

jmacd commented Jul 27, 2022

Speculating that target_info has a bug.

#12300 (comment)

I would probably add to the current specification that if the target_info has no fields, it should be automatically omitted.

@jmacd
Copy link
Contributor

jmacd commented Jul 27, 2022

Shouldn't you have the option remove this at the source, i.e., in the prometheus receiver? You've described a single-writer violation happening and I'd rather it be removed before it enters the OTLP pipeline. Otherwise, receivers other than PRW will face the same single-writer violations.

@dashpole
Copy link
Contributor

It is possible today to droptarget_info in the prometheus receiver using metric_relabel_configs:

metric_relabel_configs:
- source_labels: [__name__]
  regex: target_info
  action: drop

There is an added issue that we add "implicit" resource attributes (e.g. net.host.name) from instance (described in #9967) which produces an outgoing target_info metric even if you dropped the incoming target_info in the prometheus receiver. It is also common for users to use automatic resource detectors (e.g. the gcp detector) to add resource information. This would end up appearing in the target_info metric (which is useful!) and may cause the target_info collision if job and instance are not identifying.

@jmacd
Copy link
Contributor

jmacd commented Jul 27, 2022

which produces an outgoing target_info metric even if you dropped the incoming target_info in the prometheus receiver

This sounds like another reason to move target_info generation to the receiver. Subsequent resource detectors and processors can edit the resource, but the decision to have or not have a target_info would be up to the receiver.

@hardproblems
Copy link
Contributor Author

hardproblems commented Jul 27, 2022

Shouldn't you have the option remove this at the source, i.e., in the prometheus receiver? You've described a single-writer violation happening and I'd rather it be removed before it enters the OTLP pipeline. Otherwise, receivers other than PRW will face the same single-writer violations.

We're using the otlp receiver. I can try using resource attributes processor to drop otel sdk added resources attributes. iirc job/instance labels are likely added by the exporter so no way to drop those using processors.

@jmacd
Copy link
Contributor

jmacd commented Jul 27, 2022

job/instance labels are likely added by the exporter so no way to drop those using processors.

This sounds like a problem worth fixing. When target_info has only two attributes "job" and "instance" it is redundant with up as generated by the prometheus receiver. I would support a PRWexporter option to not generate target_info, therefore, when it contains only job and instance.

@dashpole
Copy link
Contributor

This sounds like a problem worth fixing. When process_info has only two attributes "job" and "instance" it is redundant with up as generated by the prometheus receiver. I would support a PRWexporter option to not generate process_info, therefore, when it contains only job and instance.

Do you mean target_info instead of process_info? If so, I agree. That was the initial intention, but it doesn't look like it was implemented that way. We should be checking if there are no attributes after the RemoveIf that filters out job, instance.

@jmacd
Copy link
Contributor

jmacd commented Jul 27, 2022

@dashpole yes, I've corrected the comment. Thanks.

@hardproblems hardproblems force-pushed the rong--prw-target-metric-label branch from ba1d546 to f539e7f Compare August 15, 2022 23:40
@hardproblems
Copy link
Contributor Author

@mx-psi updated target_info config to use a struct. PTAL 🙇

@hardproblems
Copy link
Contributor Author

@Aneurysm9 PTAL as well 🙇

@dashpole dashpole added exporter/prometheusremotewrite ready to merge Code review completed; ready to merge by maintainers and removed ready to merge Code review completed; ready to merge by maintainers labels Aug 17, 2022
@dashpole
Copy link
Contributor

config_test.go:98:16: cannot use (TargetInfo literal) (value of type TargetInfo) as *TargetInfo value in struct literal (typecheck)
			TargetInfo: TargetInfo{

@hardproblems
Copy link
Contributor Author

@dashpole @mx-psi @Aneurysm9 can I get some help merging this? I'm not authorized to merge and I have to keep merging main to this while waiting for a stamp to trigger CI checks.

@dashpole dashpole added the ready to merge Code review completed; ready to merge by maintainers label Aug 22, 2022
@bogdandrutu
Copy link
Member

@dashpole there are lint errors in this PR.

@dashpole dashpole removed the ready to merge Code review completed; ready to merge by maintainers label Aug 22, 2022
@dashpole
Copy link
Contributor

@bogdandrutu agh, sorry. I thought the author had fixed it, but I had to approve the tests.

@hardproblems hardproblems force-pushed the rong--prw-target-metric-label branch 2 times, most recently from 6f67baf to d07ac43 Compare August 22, 2022 18:18
@hardproblems hardproblems force-pushed the rong--prw-target-metric-label branch from d07ac43 to 8c56545 Compare August 22, 2022 18:54
@dashpole
Copy link
Contributor

@hardproblems you can hold off rebasing on main for a bit (unless there is a conflict). The maintainers can do that before they merge. Its hard to tell if your tests are passing or not...

@hardproblems
Copy link
Contributor Author

@dashpole @Aneurysm9 @mx-psi @bogdandrutu what's the process for getting this merged? I've looked through the docs https://github.com/open-telemetry/opentelemetry-collector/blob/main/CONTRIBUTING.md and https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/release.md and am still unclear what the process is after the code reviews are done and the CI checks appear to be passing.

@dashpole dashpole added the ready to merge Code review completed; ready to merge by maintainers label Aug 24, 2022
@bogdandrutu bogdandrutu merged commit 09883f6 into open-telemetry:main Aug 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exporter/prometheusremotewrite ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants