A span with a new span ID MUST be created even for a sampling decision of DROP. #3290

Oberon00 · 2022-05-19T15:40:04Z

Bug Report

This bug should be present in master and in every release since at least 0.5.0-beta (probably since the beginning).

Symptom

If the sampler returns a decision of DROP, StartActivity returns null.

What is the expected behavior?

An unsampled activity with a new span ID is created and returned (and set as current, which is not spec-conformant but expected in .NET). This also allows correlating logs with unsampled operations.

Spec references:

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#shouldsample

DROP - IsRecording() == false, span will not be recorded and all events and attributes will be dropped.

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#sdk-span-creation

Generate a new span ID for the Span, independently of the sampling decision. This is done so other components (such as logs or exception handling) can rely on a unique span ID, even if the Span is a non-recording instance.

[...] A non-recording span MAY be implemented using the same mechanism as when a Span is created without an SDK installed or as described in wrapping a SpanContext in a Span.

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#wrapping-a-spancontext-in-a-span

The behavior is defined as follows:

GetContext MUST return the wrapped SpanContext.

What is the actual behavior?

Null is returned for the activity, no span ID is available.

Reproduce

There is a unit test for the wrong behavior here:

opentelemetry-dotnet/test/OpenTelemetry.Tests/Trace/TracerProviderSdkTest.cs

Lines 274 to 277 in 641b2f7

    
    // This is not a root activity.
    // If sampling returns false, no activity is created at all.
    using var innerActivity = activitySource.StartActivity("inner");
    Assert.Null(innerActivity);
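
The difference between "no activity at all" and "an unsampled activity with its own span ID" can be illustrated with the plain System.Diagnostics API, without the OpenTelemetry SDK. The following is only a sketch under assumptions: the class name, the helper StartWith, and the source names are hypothetical, and the listener here stands in for whatever the SDK does internally, which this example does not reproduce.

```csharp
#nullable enable
using System;
using System.Diagnostics;

public static class DropDecisionSketch
{
    // Hypothetical helper: starts one activity while a single listener
    // answers every sampling request with the given result.
    public static Activity? StartWith(ActivitySamplingResult result)
    {
        // Force W3C IDs so TraceId/SpanId are populated on older runtimes too.
        Activity.DefaultIdFormat = ActivityIdFormat.W3C;
        Activity.ForceDefaultIdFormat = true;

        string sourceName = "Sketch." + result;
        using var source = new ActivitySource(sourceName);
        using var listener = new ActivityListener
        {
            ShouldListenTo = s => s.Name == sourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) => result,
        };
        ActivitySource.AddActivityListener(listener);

        var activity = source.StartActivity("inner");
        activity?.Stop(); // restores the previous Activity.Current
        return activity;
    }

    public static void Main()
    {
        // None: no Activity object is created, so there is no span ID to log.
        var none = StartWith(ActivitySamplingResult.None);
        Console.WriteLine($"None: activity is null = {none is null}");

        // PropagationData: an Activity exists and carries a fresh span ID,
        // but it is neither recorded nor asked for full data.
        var propagated = StartWith(ActivitySamplingResult.PropagationData);
        Console.WriteLine(
            $"PropagationData: span ID = {propagated?.SpanId}, " +
            $"IsAllDataRequested = {propagated?.IsAllDataRequested}, " +
            $"Recorded = {propagated?.Recorded}");
    }
}
```

In this sketch, the behavior requested in the issue corresponds to the PropagationData case: callers get a non-null activity whose span ID can be attached to logs, while nothing is recorded.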

Additional Context

This is a pretty big deviation from the spec that will inhibit some use cases. Creating a new span ID even for unsampled spans was a very deliberate decision. There were even two competing PRs, open-telemetry/opentelemetry-specification#1225 and open-telemetry/opentelemetry-specification#998, and the explicit decision was to always create a span ID. See especially the thread ending at open-telemetry/opentelemetry-specification#998 (comment).

Oberon00 · 2022-05-19T15:46:11Z

Note that in the case I observed, I got a null activity even though there was no current span (the parent context was extracted from HTTP headers, with IsRemote=true, TraceFlags=None).

bledogit · 2022-08-15T23:51:07Z

Upvoting this issue. Furthermore, StartActivity should create an activity with or without the sampled flag set, depending on the sampler settings, and the sampler should allow an exception to change the DROP decision of the activity and all its parents, so that failed spans created locally are recorded.

There should at least be an option to record the root activity of a transaction with no parent, such as an API entry point, which should return the trace ID in the response (https://github.com/w3c/trace-context/blob/main/spec/21-http_response_header_format.md). The returned trace ID (possibly with an orphan flag) can then be used to query the DB for spans with exceptions that come from an otherwise unsampled chain of spans.

The main objective is to sample a percentage of the healthy transactions while recording 100% of the local spans that resulted in an error/exception. We must be able to answer the question of why a transaction 00-ttt-xxx-00 failed. Without this, distributed tracing becomes no better than debug logs.

erchirag · 2022-11-01T20:49:49Z

Can someone comment on the performance implications of always creating an activity object, even when the sampling decision was not to collect the current span?

cijothomas · 2022-11-01T20:55:21Z

https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/test/Benchmarks/Trace/TraceBenchmarks.cs#L38
Creating an activity and feeding it to a no-op processor costs approximately what the benchmark linked above shows.

cijothomas · 2022-11-01T20:57:08Z

https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/test/Benchmarks/Trace/TraceBenchmarks.cs#L38
Creating an activity and feeding it to a no-op processor costs approximately what the benchmark linked above shows.

https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/test/Benchmarks/Trace/OpenTelemetrySdkBenchmarksActivity.cs#L38

The cost can go up if tags/attributes are added (most likely avoidable by checking IsAllDataRequested before populating tags).
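
The IsAllDataRequested check mentioned above might look roughly like the following. This is a minimal sketch, not code from the SDK or its instrumentation libraries: the class name, the helper AddRequestTags, and the tag names are assumptions chosen for illustration.

```csharp
#nullable enable
using System.Diagnostics;

public static class TagGuardSketch
{
    // Hypothetical helper: only pays for tag population when some listener
    // has asked for full data. It is also safe when StartActivity returned null.
    public static void AddRequestTags(Activity? activity, string method, string url)
    {
        if (activity is { IsAllDataRequested: true })
        {
            // Illustrative tag names; real instrumentation would follow
            // whatever conventions it targets.
            activity.SetTag("http.method", method);
            activity.SetTag("http.url", url);
        }
    }
}
```

With a guard like this, an unsampled activity would still cost its own allocation, but not the additional allocations for tags.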

erchirag · 2022-11-01T21:29:10Z

What about the GC impact if lots of these objects are created in a tight loop?

cijothomas · 2022-11-01T21:40:40Z

The benchmark shows the heap allocation (416 B), so yes, the GC has to clean it up.

bledogit · 2022-11-02T13:34:23Z

If performance is an issue, can't these objects be pooled?

cijothomas · 2023-02-14T21:50:47Z

Tagging for consideration in 1.5.0

TimothyMothra · 2023-02-16T01:15:36Z

+1.
I think this is the root cause of #4087.

cijothomas · 2023-03-21T18:58:55Z

The PR to address this went stale and was closed. I suggest moving this out of the 1.5 timeline, as it likely requires more discussion.

github-actions · 2024-10-08T03:19:44Z

This issue was marked stale due to lack of activity and will be closed in 7 days. Commenting will instruct the bot to automatically remove the label. This bot runs once per day.

cijothomas · 2024-10-08T04:58:28Z

Not stale, spec violation.

Oberon00 added the bug Something isn't working label May 19, 2022

rypdal mentioned this issue Jun 9, 2022

AWS Lambda wrapper enhancements: Trace methods enhancements and Activity creation behaviour changes. open-telemetry/opentelemetry-dotnet-contrib#408

Merged

alanwest mentioned this issue Oct 27, 2022

[Http] Fix propagation issues #3828

Merged

1 task

CodeBlanch mentioned this issue Oct 28, 2022

Create span when SamplingDecision.Drop #3841

Closed

cijothomas added this to the 1.5.0 milestone Feb 14, 2023

TimothyMothra mentioned this issue Feb 15, 2023

OpenTelemetry.Shims.OpenTracing Invalid 'SpanContext' when not sampled #4087

Closed

alanwest removed this from the 1.5.0 milestone Apr 20, 2023

cijothomas mentioned this issue Oct 25, 2023

Question about samplers, null activity and IsAllDataRequested=false #4987

Closed

This was referenced Dec 23, 2023

GraphQLTelemetryProvider improvements graphql-dotnet/graphql-dotnet#3847

Closed

Update the GraphQLTelemetryOptions.Filter delegate to not create any downstream traces when false graphql-dotnet/graphql-dotnet#3850

Merged

CodeBlanch mentioned this issue Jan 11, 2024

System.Diagnostics.Activity memory footprint concern dotnet/runtime#96822

Open

cijothomas mentioned this issue Jan 21, 2024

Always create Activity irrespective of SamplingDecision. #1709

Closed

CodeBlanch mentioned this issue Feb 20, 2024

[trace-sdk-sampler] Differentiate between root and child span creation in trace sampling spec open-telemetry/opentelemetry-specification#3888

Closed

1 task

github-actions bot added the Stale Issues and pull requests which have been flagged for closing due to inactivity label Oct 8, 2024

cijothomas removed the bug Something isn't working label Oct 8, 2024

Kielek added this to the Future milestone Oct 8, 2024

TimothyMothra removed the Stale Issues and pull requests which have been flagged for closing due to inactivity label Oct 8, 2024
