Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Child Activity outliving its parent corrupts Activity.Current #101775

Open
antonfirsov opened this issue May 1, 2024 · 9 comments
Open

Child Activity outliving its parent corrupts Activity.Current #101775

antonfirsov opened this issue May 1, 2024 · 9 comments

Comments

@antonfirsov
Copy link
Member

According to https://github.com/open-telemetry/opentelemetry-specification/blob/v1.31.0/specification/trace/api.md#end, child spans are allowed to outlive their parents:

End MUST NOT have any effects on child spans. Those may still be running and can be ended later.

However in .NET such activities may lead to weird behavior resulting in Activity.Current becoming an activity that has been already stopped, leaving devs without an option to leave the corrupted state:

Activity a = new Activity("a").Start();
Activity b = new Activity("b").Start();
Activity c = new Activity("c").Start();

b.Stop();
Console.WriteLine(Activity.Current!.OperationName); // Prints "a"
c.Stop();
Console.WriteLine(Activity.Current!.OperationName); // Prints "b" -- already stopped!
b.Stop(); // Does nothing
Console.WriteLine(Activity.Current!.OperationName); // Still prints "b"
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label May 1, 2024
Copy link
Contributor

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti
See info in area-owners.md if you want to be subscribed.

@tarekgh
Copy link
Member

tarekgh commented May 1, 2024

I think you are mixing between Parent and Current.

Doing:

            b.Stop();
            Console.WriteLine($"Current: {Activity.Current!.OperationName} ... b.Parent: {b.Parent!.OperationName}, b.Parent.IsStopped: {b.Parent!.IsStopped} ... c.IsStopped: {c.IsStopped}");

print

Current: a ... b.Parent: a, b.Parent.IsStopped: False ... c.IsStopped: False // clearly stopping the parent didn't stop the child.

I can see we may improve the Current behavior but I am not sure if this will be a breaking change?

CC @noahfalk

@tarekgh tarekgh added this to the Future milestone May 1, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label May 1, 2024
@tarekgh tarekgh removed the untriaged New issue has not been triaged by the area owner label May 1, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-diagnostics-activity
See info in area-owners.md if you want to be subscribed.

@antonfirsov
Copy link
Member Author

antonfirsov commented May 1, 2024

clearly stopping the parent didn't stop the child.

I didn't state that that stopping the parent stops the child. It's all about Activity.Current flipping into a corrupt state, which then messes up parent-child relationships of all future activities created. I quoted the standard only to emphasize that such cases are valid.

Actually, the same problem is present when there is no intention to create a parent-child relationship between the activities, ie. when creating c via an ActivitySource + manually provided ActivityContext:

Activity c = activitySource.CreateActivity("c", ActivityKind.Internal,
    new ActivityContext(ActivityTraceId.CreateRandom(), ActivitySpanId.CreateRandom(), ActivityTraceFlags.None));

This is preventing us from implementing might be problematic if we implement #93832: we want the connection activity to be unrelated to Activity.Current when we create it, and expect it to outlive the previos Activity.Current (the HTTP request activity).

@antonfirsov antonfirsov added blocking Marks issues that we want to fast track in order to unblock other important work and removed blocking Marks issues that we want to fast track in order to unblock other important work labels May 1, 2024
@tarekgh
Copy link
Member

tarekgh commented May 1, 2024

I didn't state that that stopping the parent stops the child.

You were referring to the OTel specs that says: MUST NOT have any effects on child spans. Those may still be running and can be ended later. 😄.

Anyway, I am not objecting we try to fix the Current behavior. I am just checking if we should worry about the breaking change.

@noahfalk
Copy link
Member

noahfalk commented May 3, 2024

I'd consider the current behavior a bug and I think it would be legitimate for us to fix it. I'd guess it has very low (but not zero) odds of breaking someone.

leaving devs without an option to leave the corrupted state:

Activity.Current has a public setter which is one way it could be recovered. In your example calling a.Stop() would also recover it. In async code returning to the caller also restores the caller's async locals so thats another option.

This is preventing us from implementing might be problematic if we implement #93832: we want the connection activity to be unrelated to Activity.Current when we create it,

.NET doesn't have an explicit API to create new root Activities, but you can set Activity.Current=null as a workaround like this:

private async ValueTask<(Stream, TransportContext?, Activity?, IPEndPoint?)> ConnectAsync(HttpRequestMessage request, bool async, CancellationToken cancellationToken)
{
    // assume Activity.Current is set to some arbitrary http request activity that should not
    // be the parent of this activity
    Activity.Current = null;
    Activity connectionActivity = s_connectionActivitySource.StartActivity(
       DiagnosticsHandlerLoggingStrings.ConnectionActivityName);
}

If you weren't in an async function you'd probably want to save and restore the previous value of Activity.Current, but returning from an async function restores the parent's async locals automatically.

@Wraith2
Copy link
Contributor

Wraith2 commented May 5, 2024

private async ValueTask<(Stream, TransportContext?, Activity?, IPEndPoint?)> ConnectAsync(HttpRequestMessage request, bool async, CancellationToken cancellationToken)
{
    // assume Activity.Current is set to some arbitrary http request activity that should not
    // be the parent of this activity
    Activity.Current = null;
    Activity connectionActivity = s_connectionActivitySource.StartActivity(
       DiagnosticsHandlerLoggingStrings.ConnectionActivityName);
}

This feels like an unintuative pattern. I would expect to have a parameter on the StartActivity call to de-parent the child or to explicitly call SetParentId. This looks like you're changing static state in order to disconnect the child and requires complex knowledge to understand that it will be restored at exit.

@KalleOlaviNiemitalo
Copy link

Activity.Current becoming an activity that has been already stopped

That was reported in #91265 as well.

I would expect to have a parameter on the StartActivity call to de-parent the child or to explicitly call SetParentId.

De-parenting in StartActivity is #65528.

@noahfalk
Copy link
Member

noahfalk commented May 6, 2024

This feels like an unintuative pattern

Yep, no claim that it is easy or intuitive. Its just what is currently available to work with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants