-
Notifications
You must be signed in to change notification settings - Fork 769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create span when SamplingDecision.Drop #3841
Create span when SamplingDecision.Drop #3841
Conversation
Time to really comply with the spec 🤣. I vaguely remember some discussions around this in some SIG calls, and we still decided to use the current approach.... Maybe the heap allocation was a bigger concern? |
Yes! Was just talking to @CodeBlanch about this. Allocations gonna be painful (note the name of my branch 😆) But this may be a hit we'll need to take... it's important for both propagating context properly and also correlation of trace data to logs and metrics. |
If a root span (t1, s1) is created.. then its child gets not created, and then someone makes a log. the log will be parented to the (t1,s1). With this change, the child span gets created (t1, s2, parent=s1), but wont see processor or exporter. The log will now be parented to t1,s2.. which might be an issue as log is now pointing to a parent, who is nowhere to be found.... |
unless we make log point to the span which is sampled or root. (like walk the .Parent chain until we run out of parent)... |
My understanding is that this is by design. That is, incomplete traces are to be expected as well as telemetry correlated to things that happened but were not recorded. |
If walking the chain is required, then I think this needs to be clarified by the spec. |
A couple tags... |
Closing for now... as discussed in this week's SIG, we do want to make this change, but we want to wait until after the 1.4.0 release. I will pursue fixing up the tests and open the PR back up later. |
FYI; tested locally and confirmed this change would fix the issue reported in #4087 |
@TimothyMothra Thanks for investigating it more! This is tagged to resurrected for 1.5. Please chime in with a +1 to the original issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Lets add a good writeup in the PR description about the potential impact this has, and a changelog entry.
Highlighting that for a asp.net core + httpclient scenario, the impact is nil, as these libraries would anyway create activities, so no impact.
Higher allocation, for the case when there are more than span in the trace within proc..
@Oberon00 |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #3841 +/- ##
==========================================
- Coverage 85.71% 85.62% -0.09%
==========================================
Files 289 289
Lines 11374 11341 -33
==========================================
- Hits 9749 9711 -38
- Misses 1625 1630 +5
|
@@ -240,7 +239,7 @@ internal sealed class TracerProviderSdk : TracerProvider | |||
else if (this.sampler is AlwaysOffSampler) | |||
{ | |||
listener.Sample = (ref ActivityCreationOptions<ActivityContext> options) => | |||
!Sdk.SuppressInstrumentation ? PropagateOrIgnoreData(options.Parent) : ActivitySamplingResult.None; | |||
!Sdk.SuppressInstrumentation ? ActivitySamplingResult.PropagationData : ActivitySamplingResult.None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is extremely hard to read. There is some deep nesting of if/else/Listener callback code going on here. I would suggest to extract some (named, top-level) methods to make the code more readable/reviewable.
Maybe only tangentially related but: I wonder how one would implement transparent API-only propagation in .NET, where you just echo through extracted incoming span contexts: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#behavior-of-the-api-in-the-absence-of-an-installed-sdk |
@alanwest an Activity is fairly expensive from the garbage collection point of view. Could we explore if it's possible to defer the creation only when some code is reading it? Alternatively, maybe explore if a slim version of Activity can be used/reused rather than a full version. |
Maybe System.Diagnostics.Activity is a bad match for the OpenTelemetry Span interface, since I guess the OTel spec was written imagining that a non-recording Span can be implemented rather efficiently. |
I think solving this issue within the opentelemetry-dotnet project may prove challenging. Either deferring creation or implementing a slim version may require instrumentation authors to depend on OpenTelemetry.Api rather than just System.Diagnosics to retrieve current span context. I think this is something we're aiming to minimize.
Talking with @CodeBlanch and trying to brainstorm options we might suggest to .NET team to make Activity itself lighter weight. Though, I guess the first question is whether the size of Activity is the primary concern. In other words, is Activity expensive from a GC point of view because of its size? Or is it expensive irrespective of size and we really just want to avoid heap allocations altogether? |
This might be difficult/impossible. It might be possible to do just the baggage part, as once can use Propagators to extract it, store in Baggage.Current, and it can be injected to downstream calls as well. |
https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/test/Benchmarks/Trace/TraceBenchmarks.cs#L32 Its 416 B per activity minimum without any tags. Adding tags will make it ~650B/activity. |
Something like ActivityContext.Current (like Baggage.Current) which can be thought of as just the context part, and can be valid/used even if Activity.Current is null? |
The trouble is, I'm not sure how something like this would work. Consider the following: using (var span1 = source.StartActivity("SampledSpan"))
{
using (var span2 = source.StartActivity("NotSampledSpan"))
{
// span2 is null here, but let's imagine that ActivityContext.Current = span2's context
// logs and exemplars are correlated to span2's context
log.Info("Some log message.");
histogram.Record(100);
// span2's context is propagated
httpClient.SendAsync(...);
}
// How do we that the scope of span2 has ended?
// We'd need something like ActivityContext.Current.Stop(), but how would this get invoked?
} |
My sense is that we're stuck with working within the Activity API as is and optimizing perf here will require collaborating with the .NET team. Could |
This could probably be reduced by doing something like: class Activity
{
// This would be null if this Activity was not sampled
private ActivityData Data;
// This would be a valid context
private ActivityContext context;
}
class ActivityData
{
// Move all the fields on Activity today to this class
// basically the fields here: https://github.com/dotnet/runtime/blob/0a15c3b97dabaf94e5d4399aef2fd48c2b3f446a/src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs#L90-L116
} But I'm not sure this helps address the primary concern. Activities would still be allocated and require garbage collection. We'd need to explore if reducing the size of Activity reduces the overhead of garbage collection. |
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Fixes #3290
Fixes #4087
Create spans even on a sampling decision of drop. This conforms the .NET SDK to the specification regarding span creation.
This may have a performance impact as an
Activity
will always be allocated irrespective of sampling decision. Notable exceptions are the ASP.NET Core and HttpClient instrumentation. The Activity created by these libraries was already always allocated.