Fix flaky micrometer tests #5106

Merged

mateuszrzeszutek
Member

Fixes #5103

This PR presents yet another reason why we'd like to be able to use multiple callbacks in async instruments (open-telemetry/opentelemetry-specification#2249). It turns out the library instrumentation tests were randomly failing because each of them created a new OpenTelemetryMeterRegistry and those registries did not share state, while the OpenTelemetry instance used in library tests is shared between all test classes.
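
To illustrate the failure mode described above, here is a minimal sketch, not the actual instrumentation code. `ToyMeter` and `SharedState` are hypothetical names, and the sketch assumes a meter that keeps only the first callback registered for a given instrument name. With isolated state the second registration is lost; with per-meter shared state a single callback reports every supplier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

public class SharedCallbackSketch {

  /** Hypothetical meter that keeps only the first callback registered per instrument name. */
  static final class ToyMeter {
    private final Map<String, Supplier<List<Double>>> callbacks = new ConcurrentHashMap<>();

    void registerGauge(String name, Supplier<List<Double>> callback) {
      callbacks.putIfAbsent(name, callback); // later registrations are ignored
    }

    List<Double> collect(String name) {
      Supplier<List<Double>> callback = callbacks.get(name);
      if (callback == null) {
        return new ArrayList<>();
      }
      return callback.get();
    }
  }

  /** Hypothetical per-meter state shared by every registry built on the same meter. */
  static final class SharedState {
    // Strong keys keep this sketch short; a long-lived cache would want weak keys.
    private static final Map<ToyMeter, SharedState> PER_METER = new ConcurrentHashMap<>();

    private final ToyMeter meter;
    private final Map<String, List<Supplier<Double>>> gauges = new ConcurrentHashMap<>();

    private SharedState(ToyMeter meter) {
      this.meter = meter;
    }

    static SharedState forMeter(ToyMeter meter) {
      return PER_METER.computeIfAbsent(meter, SharedState::new);
    }

    void addGauge(String name, Supplier<Double> value) {
      List<Supplier<Double>> suppliers =
          gauges.computeIfAbsent(
              name,
              n -> {
                List<Supplier<Double>> created = new CopyOnWriteArrayList<>();
                // Register the single allowed callback once; it reports every supplier.
                meter.registerGauge(
                    n,
                    () -> {
                      List<Double> out = new ArrayList<>();
                      for (Supplier<Double> supplier : created) {
                        out.add(supplier.get());
                      }
                      return out;
                    });
                return created;
              });
      suppliers.add(value);
    }
  }

  /** Two "registries" register their own callbacks; only the first one is kept. */
  static List<Double> withIsolatedState() {
    ToyMeter meter = new ToyMeter();
    meter.registerGauge("test.gauge", () -> Collections.singletonList(1.0));
    meter.registerGauge("test.gauge", () -> Collections.singletonList(2.0));
    return meter.collect("test.gauge");
  }

  /** Two "registries" go through the shared per-meter state; both values are reported. */
  static List<Double> withSharedState() {
    ToyMeter meter = new ToyMeter();
    SharedState.forMeter(meter).addGauge("test.gauge", () -> 1.0);
    SharedState.forMeter(meter).addGauge("test.gauge", () -> 2.0);
    return meter.collect("test.gauge");
  }

  public static void main(String[] args) {
    System.out.println("isolated: " + withIsolatedState());
    System.out.println("shared:   " + withSharedState());
  }
}
```

In this toy model the isolated variant silently drops the second gauge. That is the kind of order-dependent behavior that can make tests flaky when they share one OpenTelemetry instance.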

@mateuszrzeszutek mateuszrzeszutek requested a review from a team January 13, 2022 11:37
@mateuszrzeszutek
Member Author

Hmm, apparently that's still not it; I'm still getting these:

java.lang.NullPointerException
	at com.gradle.scan.plugin.internal.i.g.b(SourceFile:61)

@mateuszrzeszutek mateuszrzeszutek marked this pull request as draft January 13, 2022 12:12
@@ -21,10 +21,15 @@
import java.util.function.ToLongFunction;
import javax.annotation.Nullable;

// lalala test
Contributor

fafafa

@mateuszrzeszutek mateuszrzeszutek force-pushed the fix-micrometer-flaky-tests branch from 43d11a1 to 60487a9 Compare January 13, 2022 16:41
@trask trask added this to the v1.10.0 milestone Jan 13, 2022
@mateuszrzeszutek mateuszrzeszutek marked this pull request as ready for review January 13, 2022 17:58
@mateuszrzeszutek
Member Author

Okay, I think these cryptic NPEs I observed earlier may be related to #5051, not micrometer tests (probably), so I'm reopening it.

private final Meter meter;
// using a weak ref so that the AsyncInstrumentRegistry (which is stored in a static map) does
// not hold strong references to Meter (and thus make it impossible to garbage-collect the Meter).
// in practice this should never return null
Member

It's not obvious to me why Meter can't be GC'd before AsyncInstrumentRegistry. Can you explain briefly? Or what do you think about using a weak value in the cache instead: Cache<Meter, WeakReference<AsyncInstrumentRegistry>>?

Member Author

It can't be GC'd because the OpenTelemetryMeterRegistry maintains a strong reference to both Meter and AsyncInstrumentRegistry -- so if the meter registry instance is collected, the AsyncInstrumentRegistry cannot possibly be used anymore. I'll add a comment.

or what do you think about using weak value in the cache instead Cache<Meter, WeakReference<AsyncInstrumentRegistry>>?

Hmm, if the meter registry is GC'd, the WeakReference<AsyncInstrumentRegistry> could get nulled out even if the Meter is not GC'd yet (because the OpenTelemetry instance is still used). This hypothetical scenario would break the async instruments, because if you create a new OpenTelemetryMeterRegistry for the same OpenTelemetry/Meter, the async instrument registry will get cleared out.
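
The ownership structure discussed in this thread can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: `ToyMeter`, `ToyAsyncRegistry`, and `ToyMeterRegistry` are hypothetical stand-ins, and a synchronized `WeakHashMap` stands in for the weak-keyed cache. The cached value holds its key only through a `WeakReference`, since a strong reference from value to key would keep a `WeakHashMap` entry alive forever, while the meter registry holds both objects strongly for as long as it is in use.

```java
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakMeterRefSketch {

  /** Hypothetical stand-in for the real Meter type. */
  static final class ToyMeter {}

  /** Hypothetical stand-in for the per-meter async instrument registry. */
  static final class ToyAsyncRegistry {
    // Weak keys: an entry can be dropped once nothing else references the meter.
    private static final Map<ToyMeter, ToyAsyncRegistry> REGISTRIES =
        Collections.synchronizedMap(new WeakHashMap<>());

    // Weak ref so the cached value does not strongly reference its own key.
    private final WeakReference<ToyMeter> meter;

    private ToyAsyncRegistry(ToyMeter meter) {
      this.meter = new WeakReference<>(meter);
    }

    static ToyAsyncRegistry getOrCreate(ToyMeter meter) {
      return REGISTRIES.computeIfAbsent(meter, ToyAsyncRegistry::new);
    }

    /** Non-null for as long as some meter registry still holds the meter strongly. */
    ToyMeter meter() {
      return meter.get();
    }
  }

  /** Hypothetical stand-in for the meter registry: strong refs to both objects. */
  static final class ToyMeterRegistry {
    final ToyMeter meter;
    final ToyAsyncRegistry asyncRegistry;

    ToyMeterRegistry(ToyMeter meter) {
      this.meter = meter;
      this.asyncRegistry = ToyAsyncRegistry.getOrCreate(meter);
    }
  }

  /** Two meter registries built on the same meter end up sharing one async registry. */
  static boolean sharesRegistryForSameMeter() {
    ToyMeter meter = new ToyMeter();
    ToyMeterRegistry first = new ToyMeterRegistry(meter);
    ToyMeterRegistry second = new ToyMeterRegistry(meter);
    return first.asyncRegistry == second.asyncRegistry
        && first.asyncRegistry.meter() == meter;
  }

  public static void main(String[] args) {
    System.out.println("shared: " + sharesRegistryForSameMeter());
  }
}
```

With weak values instead, the shared state could be collected as soon as the last meter registry goes away, even while the meter itself is still alive. That is the scenario the comment above warns about.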

Member

oh, got it 👍

I'll add a comment

❤️

@trask trask merged commit 6eeb6cc into open-telemetry:main Jan 13, 2022
@mateuszrzeszutek mateuszrzeszutek deleted the fix-micrometer-flaky-tests branch January 14, 2022 08:34
RashmiRam pushed a commit to RashmiRam/opentelemetry-auto-instr-java that referenced this pull request May 23, 2022
* Fix flaky micrometer tests

* Add comment
Development

Successfully merging this pull request may close these issues.

Sporadic micrometer library test failure
3 participants