
Conversation

@clutchski (Contributor) commented Jun 22, 2016

Adding a standard way of collecting exception information. It adds the following tags:

error.msg      # a human-readable description of the error
error.stack    # a configurable number of lines of the stack trace (right now 20)
error.type     # a string representing the type of the error (will usually be null in Go, for example)

Here's a rough example of a span with errors:

        id 6676202706048782439
  trace_id 4335547038182019601
 parent_id None
   service None
  resource foo
      type None
     start 1466629552.42
       end
  duration None
     error 1
{'error.msg': u'integer division or modulo by zero',
 'error.stack': u'Traceback (most recent call last):\n  File "/home/vagrant/dd-trace-py/ddtrace/test_span.py", line 63, in test_traceback_with_error\n    1/0\nZeroDivisionError: integer division or modulo by zero\n',
 'error.type': u"<type 'exceptions.ZeroDivisionError'>"}

It also adds a few utility functions, which you can see in the PR.
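
For illustration, here is a minimal sketch of how such tags could be filled in from a caught exception. The `set_error_tags` helper and the `FakeSpan` stand-in are hypothetical, not the utility functions added in this PR:

```
import traceback

MAX_STACK_LINES = 20  # matches the "right now 20" limit mentioned above


class FakeSpan:
    """Stand-in for a tracer span, used only for this sketch."""

    def __init__(self):
        self.error = 0
        self.meta = {}

    def set_tag(self, key, value):
        self.meta[key] = value


def set_error_tags(span, exc, limit=MAX_STACK_LINES):
    """Hypothetical helper: record exception info as span tags."""
    stack = "\n".join(traceback.format_exc().splitlines()[:limit])
    span.set_tag("error.msg", str(exc))
    span.set_tag("error.stack", stack)
    span.set_tag("error.type", str(type(exc)))
    span.error = 1


span = FakeSpan()
try:
    1 / 0
except ZeroDivisionError as exc:
    set_error_tags(span, exc)

print(span.meta["error.msg"])   # "division by zero" on Python 3
print(span.meta["error.type"])  # "<class 'ZeroDivisionError'>" on Python 3
```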

@clutchski (Contributor, Author)

from @LotharSee:

a few things:

  • In your example, we could instead have 'error.type': 'ZeroDivisionError' (see the sketch after this list).
  • Should we convert error to a bool, and then make it clear in our docs that this attribute indicates whether a span counts as a “hit” or an “error” in our stats and display?
  • We might want to report a stack even when we don’t have an error, in which case the “stack” field could live at a higher level. It would also be reasonable to prefix it with the language, e.g. python.stack.
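
A quick sketch of the first two suggestions in plain Python (nothing dd-trace-specific is assumed here):

```
try:
    1 / 0
except ZeroDivisionError as exc:
    verbose_type = str(type(exc))    # "<class 'ZeroDivisionError'>" on Python 3
    short_type = type(exc).__name__  # "ZeroDivisionError", as suggested above
    is_error = True                  # error as a boolean flag rather than a count
```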

@talwai (Contributor) commented Jun 23, 2016

Looks good to me. Worth remembering that the agent will truncate huge meta values, so for really deep stack traces there's a chance we swallow the most recent calls (including the one that actually raises the exception). I dug through our Sentry a bit and haven't found a raw stack trace that pushes the limit we have, though, so this seems like a non-issue.
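
If truncation ever did become a problem, one simple mitigation (a sketch only, not what the agent or this PR actually does) would be to keep the tail of the formatted traceback, since the deepest frames include the raising call:

```
import traceback

MAX_STACK_LINES = 20


def tail_of_stack(limit=MAX_STACK_LINES):
    """Keep the last `limit` lines so the raising frame is never dropped."""
    lines = traceback.format_exc().splitlines()
    return "\n".join(lines[-limit:])


try:
    1 / 0
except ZeroDivisionError:
    print(tail_of_stack(limit=5))
```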

@clutchski merged commit 8692a5c into master Jun 23, 2016
@clutchski deleted the matt/errors branch June 23, 2016 18:56
labbati added a commit that referenced this pull request Sep 12, 2018
Yun-Kim referenced this pull request in Yun-Kim/dd-trace-py Mar 16, 2021
mergify bot added a commit that referenced this pull request Mar 17, 2021
…pan, provider, helpers (#2180)

* Enabled type hinting/checking for context, monkey, span, helpers

* Type checked provider, addressed PR comments

* Attempt #2 to remove circular dependency

* Attempt to fix circular import #3

* Attempt to remove circular dependency #4

* Attempt to remove circular dependency #5

* Attempt 6 to remove circular dependency

* Revert type checking check in provider.py

* Reverted type check checking for tracer.py

* Changed mistaken int arg type to float in span.duration setter

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Julien Danjou <[email protected]>
mabdinur added a commit that referenced this pull request Jun 14, 2023
…6119)

## Description

Based on the failing test below, we can see that `test_single_trace_too_large`
generates over 20MB of trace data, but these traces are sent in separate
payloads. This PR increases the time interval in the flaky test to ensure
the buffer fits the 500-span trace chunks into one payload.

## Background

This test consistently passes locally but often fails in CI (mainly on
older versions of Python). My hypothesis is that the time interval for
submitting traces to the agent is too short: the test does not have enough
time to add the trace chunks needed to overflow the HTTP writer's buffer.
To test this hypothesis, this PR increases the trace submission interval
from 10s to 100s; `tracer.shutdown()` is called to submit the payloads to
the agent.

Sample failure:
https://app.circleci.com/pipelines/github/DataDog/dd-trace-py/37860/workflows/4da477db-5289-40b1-8375-5d6152f14b8f/jobs/2546383
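
For reference, the scenario the test exercises looks roughly like this (a sketch against the public ddtrace API; the real test's tag sizes and writer configuration will differ):

```
from ddtrace import tracer

# Build one trace with 500 spans carrying large tags (~20MB total), then
# flush everything with a single shutdown call.
with tracer.trace("root"):
    for _ in range(499):
        with tracer.trace("child") as span:
            span.set_tag("payload", "x" * 40_000)

# With a long enough flush interval, nothing is sent before this point, so
# the whole 500-span trace sits in the writer's buffer as one payload.
tracer.shutdown()
```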

## Testing Strategy 

YOLO. If `test_single_trace_too_large` is still flaky after this PR then
I have failed my reviewers and myself.

## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines)
are followed.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](../docs/contributing.rst#release-branch-maintenance))

## Reviewer Checklist

- [ ] Title is accurate.
- [ ] No unnecessary changes are introduced.
- [ ] Description motivates each change.
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Testing strategy adequately addresses listed risk(s).
- [ ] Change is maintainable (easy to change, telemetry, documentation).
- [ ] Release note makes sense to a user of the library.
- [ ] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](../docs/contributing.rst#release-branch-maintenance)
mabdinur added a commit that referenced this pull request Sep 28, 2023
@wconti27 mentioned this pull request May 31, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
The ThreadSpanLinks singleton holds the active span (if one exists) for
a given thread ID. The `get_active_span_from_thread_id` member function
returns a pointer to the active span for a thread. The `link_span`
member function sets the active span for a thread.
`get_active_span_from_thread_id` accesses the map of spans under a
mutex, but returns the pointer after releasing the mutex, meaning
`link_span` can modify the members of the Span while the caller of
`get_active_span_from_thread_id` is reading them.

Fix this by returning a copy of the `Span`. Use a `std::optional` to wrap
the return value of `get_active_span_from_thread_id`, rather than
returning a pointer. We want to tell whether or not there actually was a
span associated with the thread, but returning a pointer would require
us to heap allocate the copy of the Span.
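
The same copy-under-lock pattern, sketched in Python for readability (the actual fix is in the profiler's C++ code; the class and method names follow the commit, everything else is simplified):

```
import copy
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Span:
    span_id: int
    resource: str


class ThreadSpanLinks:
    def __init__(self):
        self._lock = threading.Lock()
        self._spans: Dict[int, Span] = {}

    def link_span(self, thread_id: int, span: Span) -> None:
        with self._lock:
            self._spans[thread_id] = span

    def get_active_span_from_thread_id(self, thread_id: int) -> Optional[Span]:
        # Copy while still holding the lock so a concurrent link_span()
        # cannot mutate what the caller is reading; returning None plays
        # the role of std::optional's empty state.
        with self._lock:
            span = self._spans.get(thread_id)
            return copy.copy(span) if span is not None else None
```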

I added a simplistic regression test which fails reliably without this
fix when built with the thread sanitizer enabled. Output like:

```
WARNING: ThreadSanitizer: data race (pid=2971510)
  Read of size 8 at 0x7b2000004080 by thread T2:
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__invoke_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__invoke_other, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe46e)
    #4 std::__invoke_result<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>::type std::__invoke<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe2fe)
    #5 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe1cf)
    #6 std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::operator()() <null> (thread_span_links+0xe0f6)
    #7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> > >::_M_run() <null> (thread_span_links+0xdf40)
    #8 <null> <null> (libstdc++.so.6+0xd6df3)

  Previous write of size 8 at 0x7b2000004080 by thread T1 (mutexes: write M47):
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 get() <null> (thread_span_links+0xb570)
    #4 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) <null> (thread_span_links+0xe525)
    #5 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) <null> (thread_span_links+0xe3b5)
    #6 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe242)
    #7 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() <null> (thread_span_links+0xe158)
[ ... etc ... ]
```
github-actions bot pushed a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 29, 2024
nsrip-dd added a commit that referenced this pull request Oct 30, 2024
…t 2.12] (#11215)

Backport 64b3374 from #11167 to 2.12.

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
nsrip-dd added a commit that referenced this pull request Oct 30, 2024
…t 2.13] (#11214)

Backport 64b3374 from #11167 to 2.13.

nsrip-dd added a commit that referenced this pull request Oct 30, 2024
…t 2.14] (#11213)

Backport 64b3374 from #11167 to 2.14.

The ThreadSpanLinks singleton holds the active span (if one exists) for
a given thread ID. The `get_active_span_from_thread_id` member function
returns a pointer to the active span for a thread. The `link_span`
member function sets the active span for a thread.
`get_active_span_from_thread_id` accesses the map of spans under a
mutex, but returns the pointer after releasing the mutex, meaning
`link_span` can modify the members of the Span while the caller of
`get_active_span_from_thread_id` is reading them.

Fix this by returning a copy of the `Span`. Use a `std::optional` to
wrap
the return value of `get_active_span_from_thread_id`, rather than
returning a pointer. We want to tell whether or not there actually was a
span associated with the thread, but returning a pointer would require
us to heap allocate the copy of the Span.

I added a simplistic regression test which fails reliably without this
fix when built with the thread sanitizer enabled. Output like:

```
WARNING: ThreadSanitizer: data race (pid=2971510)
  Read of size 8 at 0x7b2000004080 by thread T2:
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__invoke_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__invoke_other, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe46e)
    #4 std::__invoke_result<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>::type std::__invoke<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe2fe)
    #5 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe1cf)
    #6 std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::operator()() <null> (thread_span_links+0xe0f6)
    #7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> > >::_M_run() <null> (thread_span_links+0xdf40)
    #8 <null> <null> (libstdc++.so.6+0xd6df3)

  Previous write of size 8 at 0x7b2000004080 by thread T1 (mutexes: write M47):
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 get() <null> (thread_span_links+0xb570)
    #4 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) <null> (thread_span_links+0xe525)
    #5 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) <null> (thread_span_links+0xe3b5)
    #6 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe242)
    #7 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() <null> (thread_span_links+0xe158)
[ ... etc ... ]
```

(cherry picked from commit 64b3374)

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
taegyunkim added a commit that referenced this pull request Oct 30, 2024
…t 2.15] (#11211)

Backport 64b3374 from #11167 to 2.15.

The ThreadSpanLinks singleton holds the active span (if one exists) for
a given thread ID. The `get_active_span_from_thread_id` member function
returns a pointer to the active span for a thread. The `link_span`
member function sets the active span for a thread.
`get_active_span_from_thread_id` accesses the map of spans under a
mutex, but returns the pointer after releasing the mutex, meaning
`link_span` can modify the members of the Span while the caller of
`get_active_span_from_thread_id` is reading them.

Fix this by returning a copy of the `Span`. Use a `std::optional` to
wrap
the return value of `get_active_span_from_thread_id`, rather than
returning a pointer. We want to tell whether or not there actually was a
span associated with the thread, but returning a pointer would require
us to heap allocate the copy of the Span.

I added a simplistic regression test which fails reliably without this
fix when built with the thread sanitizer enabled. Output like:

```
WARNING: ThreadSanitizer: data race (pid=2971510)
  Read of size 8 at 0x7b2000004080 by thread T2:
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__invoke_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__invoke_other, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe46e)
    #4 std::__invoke_result<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>::type std::__invoke<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe2fe)
    #5 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe1cf)
    #6 std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::operator()() <null> (thread_span_links+0xe0f6)
    #7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> > >::_M_run() <null> (thread_span_links+0xdf40)
    #8 <null> <null> (libstdc++.so.6+0xd6df3)

  Previous write of size 8 at 0x7b2000004080 by thread T1 (mutexes: write M47):
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 get() <null> (thread_span_links+0xb570)
    #4 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) <null> (thread_span_links+0xe525)
    #5 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) <null> (thread_span_links+0xe3b5)
    #6 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe242)
    #7 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() <null> (thread_span_links+0xe158)
[ ... etc ... ]
```

(cherry picked from commit 64b3374)

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

Co-authored-by: Taegyun Kim <[email protected]>
taegyunkim added a commit that referenced this pull request Oct 30, 2024
…t 2.16] (#11210)

Backport 64b3374 from #11167 to 2.16.

The ThreadSpanLinks singleton holds the active span (if one exists) for
a given thread ID. The `get_active_span_from_thread_id` member function
returns a pointer to the active span for a thread. The `link_span`
member
function sets the active span for a thread.
`get_active_span_from_thread_id`
accesses the map of spans under a mutex, but returns the pointer after
releasing the mutex, meaning `link_span` can modify the members of the
Span while the caller of `get_active_span_from_thread_id` is reading
them.

Fix this by returning a copy of the `Span`. Use a `std::optional` to
wrap
the return value of `get_active_span_from_thread_id`, rather than
returning a pointer. We want to tell whether or not there actually was a
span associated with the thread, but returning a pointer would require
us to heap allocate the copy of the Span.

I added a simplistic regression test which fails reliably without this
fix
when built with the thread sanitizer enabled. Output like:

```
WARNING: ThreadSanitizer: data race (pid=2971510)
  Read of size 8 at 0x7b2000004080 by thread T2:
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__invoke_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__invoke_other, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe46e)
    #4 std::__invoke_result<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>::type std::__invoke<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe2fe)
    #5 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe1cf)
    #6 std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::operator()() <null> (thread_span_links+0xe0f6)
    #7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> > >::_M_run() <null> (thread_span_links+0xdf40)
    #8 <null> <null> (libstdc++.so.6+0xd6df3)

  Previous write of size 8 at 0x7b2000004080 by thread T1 (mutexes: write M47):
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
    #2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
    #3 get() <null> (thread_span_links+0xb570)
    #4 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) <null> (thread_span_links+0xe525)
    #5 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) <null> (thread_span_links+0xe3b5)
    #6 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe242)
    #7 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() <null> (thread_span_links+0xe158)
[ ... etc ... ]
```

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

Co-authored-by: Nick Ripley <[email protected]>
Co-authored-by: Taegyun Kim <[email protected]>
KowalskiThomas added a commit that referenced this pull request Nov 28, 2025
## Description

https://datadoghq.atlassian.net/browse/PROF-13114

This PR makes sure the Python Profiler's signal handler (for `SIGSEGV`
and `SIGBUS`) is properly installed when the Sampler thread starts.
Note that this (reinstalling our signal handler) does NOT break any
other signal handler (Python's or another extension's) as our signal
handler only swallows faults / jumps to the recovery path if it's been
"armed" (otherwise it re-raises). What matters is that we should be the
"first responder" when a fault happens.

This is an attempt to fix a crash we saw in the testing environment
where some workloads receive segmentation faults clearly coming from
`safe_memcpy` in FastAPI / Django apps.
The "real" root cause isn't yet known to me – Django and FastAPI don't
seem to use `PYTHONFAULTHANDLER` or `faulthandler`, judging by their code
on GitHub – but after deploying these changes, we stopped seeing those
crashes (0 in the past 4 days).
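
As a rough illustration of the "first responder" chaining described above
(the real handler is installed in native code for `SIGSEGV`/`SIGBUS`, which
pure Python cannot recover from, so this sketch uses a catchable signal as a
stand-in and all names are made up):

```python
# Sketch only: shows the chaining idea, not the profiler's actual handler.
import os
import signal
from contextlib import contextmanager

_armed = False
_previous_handler = None


def _first_responder(signum, frame):
    if _armed:
        # In the real handler this jumps to a recovery path so the sampler
        # can skip the offending read; here we simply return.
        return
    # Not armed: behave as if our handler were never installed.
    if callable(_previous_handler):
        _previous_handler(signum, frame)
    else:
        # Restore the old disposition and re-deliver the signal.
        old = _previous_handler if _previous_handler is not None else signal.SIG_DFL
        signal.signal(signum, old)
        os.kill(os.getpid(), signum)


def install_first_responder(signum=signal.SIGUSR1):
    """(Re)install our handler, remembering whoever was there before us."""
    global _previous_handler
    current = signal.getsignal(signum)
    if current is not _first_responder:
        _previous_handler = current
    signal.signal(signum, _first_responder)


@contextmanager
def armed():
    """The sampler would arm the handler only around risky memory reads."""
    global _armed
    _armed = True
    try:
        yield
    finally:
        _armed = False
```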

<img width="2089" height="480" alt="image"
src="https://github.com/user-attachments/assets/c554ec0c-cfae-4311-bded-8082d8f79ed9"
/>


```
Error UnixSignal: Process terminated with SEGV_MAPERR (SIGSEGV)
#0   0x0000755d4295c18f safe_memcpy 
#1   0x0000755d42954879 copy_memory 
#2   0x0000755d42956826 GenInfo::create 
#3   0x0000755d42958c25 TaskInfo::create 
#4   0x0000755d42958cc7 TaskInfo::create 
#5   0x0000755d4295a618 ThreadInfo::unwind_tasks 
#6   0x0000755d4295b98b std::_Function_handler<void (_ts*, ThreadInfo&), Datadog::Sampler::sampling_thread(unsigned long)::{lambda(InterpreterInfo&)#1}::operator()(InterpreterInfo&) const::{lambda(_ts*, ThreadInfo&)#1}>::_M_invoke 
#7   0x0000755d42959906 for_each_thread 
#8   0x0000755d429599e5 std::_Function_handler<void (InterpreterInfo&), Datadog::Sampler::sampling_thread(unsigned long)::{lambda(InterpreterInfo&)#1}>::_M_invoke 
#9   0x0000755d4295c5b2 Datadog::Sampler::sampling_thread 
#10  0x0000755d4295c685 call_sampling_thread 
#11  0x0000755d45aeeea7 start_thread 
#12  0x0000755d45c05def clone 
```
avara1986 added a commit that referenced this pull request Dec 19, 2025
## Problem
FLAKY TEST IDS: DD_0CHY3W DD_NMRUZV DD_UPYTSO DD_9ICU9D DD_084DFM
The `test_subprocess.py` tests were flaky, failing intermittently with
specific random seeds (e.g., `--randomly-seed=1219310116`). The failures
manifested in two ways:
1. **`Pin.get_from(os)` returning `None`**: Tests failed with
`AttributeError: 'NoneType' object has no attribute '_clone'`
2. **Incorrect span counts**: Tests expecting 1 span received 2 spans,
or vice versa
The root cause was **inadequate cleanup between tests**, leading to
state pollution that caused subsequent tests to fail depending on
execution order.
## Root Cause Analysis
Through systematic investigation with the problematic random seed, we
identified **five distinct issues** (see the cleanup sketch after the list):
### 1. Missing `os.fork` in `unpatch()` ❌
- `os.fork` was patched but never unpatched
- Left wrapped functions persisting between tests
- **Impact**: Stale wrappers with outdated closure state
### 2. Missing Pin Removal in `unpatch()` ❌
- Pin objects were never removed from `os` and `subprocess` modules
- **Impact**: Tests expecting no Pin would find leftover Pins from
previous tests
### 3. Test Couldn't Handle None Pin ❌
- When `asm_config._load_modules=False`, `patch()` returns early without
setting a Pin
- Tests blindly called `Pin.get_from(os)._clone()` causing
AttributeError
- **Impact**: Immediate test failure
### 4. `test_unpatch` Tried to Use Removed Pin ❌
- After proper unpatch cleanup, Pin is removed
- Test tried to get and clone a non-existent Pin
- **Impact**: Test failure after fixing issue #2
### 5. Incomplete Fixture Cleanup ❌
- `auto_unpatch` fixture only cleaned up **after** tests
- State from previous tests polluted subsequent tests
- **Impact**: Test failures dependent on execution order
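
To make the fix concrete, here is a minimal sketch of the cleanup pattern for
issues 1, 2 and 5 (the `patch`/`unpatch` bodies and the `_datadog_pin`
attribute are simplified stand-ins, not the library's actual internals):

```python
# Sketch only: simplified stand-ins for the real patch/unpatch machinery.
import os

import pytest

_originals = {}


def patch():
    if "os.fork" in _originals:
        return
    _originals["os.fork"] = os.fork

    def traced_fork():
        # tracing logic would run here before delegating to the original
        return _originals["os.fork"]()

    os.fork = traced_fork
    os._datadog_pin = object()  # stand-in for the real Pin


def unpatch():
    # Issue 1: restore os.fork (and every other function patch() wrapped).
    if "os.fork" in _originals:
        os.fork = _originals.pop("os.fork")
    # Issue 2: drop the Pin so later tests see a clean module.
    if hasattr(os, "_datadog_pin"):
        del os._datadog_pin


@pytest.fixture(autouse=True)
def auto_unpatch():
    unpatch()  # Issue 5: clear state leaked by earlier tests...
    yield
    unpatch()  # ...and leave a clean slate for the next one.
```
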
## Testing
Before Fix
 ```
$ scripts/run-tests --venv 15fbf61 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
# Multiple failures:
# - test_ossystem_disabled[config1][py3.10]: AttributeError: 'NoneType'
object has no attribute '_clone'
# - test_ossystem_disabled[config1][py3.10]: AssertionError: assert 2 ==
1 (unexpected spans)
# - test_unpatch[py3.11]: AttributeError: 'NoneType' object has no
attribute '_clone'
 ```
          
After Fix
 ```
$ scripts/run-tests --venv 15fbf61 --venv 7ed64b0 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
 ```

✅ Python 3.10: 488/488 tests passed
✅ Python 3.11: 485/485 tests passed
Verification
- ✅ Tests pass with the problematic random seed on both Python 3.10 and
3.11
- ✅ Tests pass in isolation
- ✅ Tests pass without random seed
- ✅ No regression in other test suites
---
Related Issues
Fixes flaky tests reported with random seed --randomly-seed=1219310116:
- test_ossystem_disabled[config1][py3.10] - Pin.get_from returns None
- test_ossystem_disabled[config1][py3.10] - Getting 2 spans instead of 1
- test_unpatch[py3.11] - 'NoneType' object has no attribute '_clone'
dd-octo-sts bot pushed a commit that referenced this pull request Dec 19, 2025
## Problem
FLAKY TEST IDS: DD_0CHY3W DD_NMRUZV DD_UPYTSO DD_9ICU9D DD_084DFM
The `test_subprocess.py` tests were flaky, failing intermittently with
specific random seeds (e.g., `--randomly-seed=1219310116`). The failures
manifested in two ways:
1. **`Pin.get_from(os)` returning `None`**: Tests failed with
`AttributeError: 'NoneType' object has no attribute '_clone'`
2. **Incorrect span counts**: Tests expecting 1 span received 2 spans,
or vice versa
The root cause was **inadequate cleanup between tests**, leading to
state pollution that caused subsequent tests to fail depending on
execution order.
## Root Cause Analysis
Through systematic investigation with the problematic random seed, we
identified **five distinct issues**:
### 1. Missing `os.fork` in `unpatch()` ❌
- `os.fork` was patched but never unpatched
- Left wrapped functions persisting between tests
- **Impact**: Stale wrappers with outdated closure state
### 2. Missing Pin Removal in `unpatch()` ❌
- Pin objects were never removed from `os` and `subprocess` modules
- **Impact**: Tests expecting no Pin would find leftover Pins from
previous tests
### 3. Test Couldn't Handle None Pin ❌
- When `asm_config._load_modules=False`, `patch()` returns early without
setting a Pin
- Tests blindly called `Pin.get_from(os)._clone()` causing
AttributeError
- **Impact**: Immediate test failure
### 4. `test_unpatch` Tried to Use Removed Pin ❌
- After proper unpatch cleanup, Pin is removed
- Test tried to get and clone a non-existent Pin
- **Impact**: Test failure after fixing issue #2
### 5. Incomplete Fixture Cleanup ❌
- `auto_unpatch` fixture only cleaned up **after** tests
- State from previous tests polluted subsequent tests
- **Impact**: Test failures dependent on execution order
## Testing
Before Fix
 ```
$ scripts/run-tests --venv 15fbf61 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
# Multiple failures:
# - test_ossystem_disabled[config1][py3.10]: AttributeError: 'NoneType'
object has no attribute '_clone'
# - test_ossystem_disabled[config1][py3.10]: AssertionError: assert 2 ==
1 (unexpected spans)
# - test_unpatch[py3.11]: AttributeError: 'NoneType' object has no
attribute '_clone'
 ```

After Fix
 ```
$ scripts/run-tests --venv 15fbf61 --venv 7ed64b0 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
 ```

✅ Python 3.10: 488/488 tests passed
✅ Python 3.11: 485/485 tests passed
Verification
- ✅ Tests pass with the problematic random seed on both Python 3.10 and
3.11
- ✅ Tests pass in isolation
- ✅ Tests pass without random seed
- ✅ No regression in other test suites
---
Related Issues
Fixes flaky tests reported with random seed --randomly-seed=1219310116:
- test_ossystem_disabled[config1][py3.10] - Pin.get_from returns None
- test_ossystem_disabled[config1][py3.10] - Getting 2 spans instead of 1
- test_unpatch[py3.11] - 'NoneType' object has no attribute '_clone'

(cherry picked from commit fa5ce13)
avara1986 added a commit that referenced this pull request Dec 19, 2025
## Problem
FLAKY TEST IDS: DD_0CHY3W DD_NMRUZV DD_UPYTSO DD_9ICU9D DD_084DFM
The `test_subprocess.py` tests were flaky, failing intermittently with
specific random seeds (e.g., `--randomly-seed=1219310116`). The failures
manifested in two ways:
1. **`Pin.get_from(os)` returning `None`**: Tests failed with
`AttributeError: 'NoneType' object has no attribute '_clone'`
2. **Incorrect span counts**: Tests expecting 1 span received 2 spans,
or vice versa
The root cause was **inadequate cleanup between tests**, leading to
state pollution that caused subsequent tests to fail depending on
execution order.
## Root Cause Analysis
Through systematic investigation with the problematic random seed, we
identified **five distinct issues**:
### 1. Missing `os.fork` in `unpatch()` ❌
- `os.fork` was patched but never unpatched
- Left wrapped functions persisting between tests
- **Impact**: Stale wrappers with outdated closure state
### 2. Missing Pin Removal in `unpatch()` ❌
- Pin objects were never removed from `os` and `subprocess` modules
- **Impact**: Tests expecting no Pin would find leftover Pins from
previous tests
### 3. Test Couldn't Handle None Pin ❌
- When `asm_config._load_modules=False`, `patch()` returns early without
setting a Pin
- Tests blindly called `Pin.get_from(os)._clone()` causing
AttributeError
- **Impact**: Immediate test failure
### 4. `test_unpatch` Tried to Use Removed Pin ❌
- After proper unpatch cleanup, Pin is removed
- Test tried to get and clone a non-existent Pin
- **Impact**: Test failure after fixing issue #2
### 5. Incomplete Fixture Cleanup ❌
- `auto_unpatch` fixture only cleaned up **after** tests
- State from previous tests polluted subsequent tests
- **Impact**: Test failures dependent on execution order
## Testing
Before Fix
 ```
$ scripts/run-tests --venv 15fbf61 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
# Multiple failures:
# - test_ossystem_disabled[config1][py3.10]: AttributeError: 'NoneType'
object has no attribute '_clone'
# - test_ossystem_disabled[config1][py3.10]: AssertionError: assert 2 ==
1 (unexpected spans)
# - test_unpatch[py3.11]: AttributeError: 'NoneType' object has no
attribute '_clone'
 ```

After Fix
 ```
$ scripts/run-tests --venv 15fbf61 --venv 7ed64b0 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
 ```

✅ Python 3.10: 488/488 tests passed
✅ Python 3.11: 485/485 tests passed
Verification
- ✅ Tests pass with the problematic random seed on both Python 3.10 and
3.11
- ✅ Tests pass in isolation
- ✅ Tests pass without random seed
- ✅ No regression in other test suites
---
Related Issues
Fixes flaky tests reported with random seed --randomly-seed=1219310116:
- test_ossystem_disabled[config1][py3.10] - Pin.get_from returns None
- test_ossystem_disabled[config1][py3.10] - Getting 2 spans instead of 1
- test_unpatch[py3.11] - 'NoneType' object has no attribute '_clone'

(cherry picked from commit fa5ce13)
Signed-off-by: Alberto Vara <[email protected]>
avara1986 added a commit that referenced this pull request Dec 19, 2025
Backport fa5ce13 from #15724 to 4.1.

## Problem
FLAKY TEST IDS: DD_0CHY3W DD_NMRUZV DD_UPYTSO DD_9ICU9D DD_084DFM
The `test_subprocess.py` tests were flaky, failing intermittently with
specific random seeds (e.g., `--randomly-seed=1219310116`). The failures
manifested in two ways:
1. **`Pin.get_from(os)` returning `None`**: Tests failed with
`AttributeError: 'NoneType' object has no attribute '_clone'`
2. **Incorrect span counts**: Tests expecting 1 span received 2 spans,
or vice versa
The root cause was **inadequate cleanup between tests**, leading to
state pollution that caused subsequent tests to fail depending on
execution order.
## Root Cause Analysis
Through systematic investigation with the problematic random seed, we
identified **five distinct issues**:
### 1. Missing `os.fork` in `unpatch()` ❌
- `os.fork` was patched but never unpatched
- Left wrapped functions persisting between tests
- **Impact**: Stale wrappers with outdated closure state
### 2. Missing Pin Removal in `unpatch()` ❌
- Pin objects were never removed from `os` and `subprocess` modules
- **Impact**: Tests expecting no Pin would find leftover Pins from
previous tests
### 3. Test Couldn't Handle None Pin ❌
- When `asm_config._load_modules=False`, `patch()` returns early without
setting a Pin
- Tests blindly called `Pin.get_from(os)._clone()` causing
AttributeError
- **Impact**: Immediate test failure
### 4. `test_unpatch` Tried to Use Removed Pin ❌
- After proper unpatch cleanup, Pin is removed
- Test tried to get and clone a non-existent Pin
- **Impact**: Test failure after fixing issue #2
### 5. Incomplete Fixture Cleanup ❌
- `auto_unpatch` fixture only cleaned up **after** tests
- State from previous tests polluted subsequent tests
- **Impact**: Test failures dependent on execution order
## Testing
Before Fix
 ```
$ scripts/run-tests --venv 15fbf61 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
# Multiple failures:
# - test_ossystem_disabled[config1][py3.10]: AttributeError: 'NoneType'
object has no attribute '_clone'
# - test_ossystem_disabled[config1][py3.10]: AssertionError: assert 2 ==
1 (unexpected spans)
# - test_unpatch[py3.11]: AttributeError: 'NoneType' object has no
attribute '_clone'
 ```
          
After Fix
 ```
$ scripts/run-tests --venv 15fbf61 --venv 7ed64b0 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
 ```

✅ Python 3.10: 488/488 tests passed
✅ Python 3.11: 485/485 tests passed
Verification
- ✅ Tests pass with the problematic random seed on both Python 3.10 and
3.11
- ✅ Tests pass in isolation
- ✅ Tests pass without random seed
- ✅ No regression in other test suites
---
Related Issues
Fixes flaky tests reported with random seed --randomly-seed=1219310116:
- test_ossystem_disabled[config1][py3.10] - Pin.get_from returns None
- test_ossystem_disabled[config1][py3.10] - Getting 2 spans instead of 1
- test_unpatch[py3.11] - 'NoneType' object has no attribute '_clone'

Signed-off-by: Alberto Vara <[email protected]>
Co-authored-by: Alberto Vara <[email protected]>
brettlangdon pushed a commit that referenced this pull request Jan 6, 2026
## Problem
FLAKY TEST IDS: DD_0CHY3W DD_NMRUZV DD_UPYTSO DD_9ICU9D DD_084DFM
The `test_subprocess.py` tests were flaky, failing intermittently with
specific random seeds (e.g., `--randomly-seed=1219310116`). The failures
manifested in two ways:
1. **`Pin.get_from(os)` returning `None`**: Tests failed with
`AttributeError: 'NoneType' object has no attribute '_clone'`
2. **Incorrect span counts**: Tests expecting 1 span received 2 spans,
or vice versa
The root cause was **inadequate cleanup between tests**, leading to
state pollution that caused subsequent tests to fail depending on
execution order.
## Root Cause Analysis
Through systematic investigation with the problematic random seed, we
identified **five distinct issues**:
### 1. Missing `os.fork` in `unpatch()` ❌
- `os.fork` was patched but never unpatched
- Left wrapped functions persisting between tests
- **Impact**: Stale wrappers with outdated closure state
### 2. Missing Pin Removal in `unpatch()` ❌
- Pin objects were never removed from `os` and `subprocess` modules
- **Impact**: Tests expecting no Pin would find leftover Pins from
previous tests
### 3. Test Couldn't Handle None Pin ❌
- When `asm_config._load_modules=False`, `patch()` returns early without
setting a Pin
- Tests blindly called `Pin.get_from(os)._clone()` causing
AttributeError
- **Impact**: Immediate test failure
### 4. `test_unpatch` Tried to Use Removed Pin ❌
- After proper unpatch cleanup, Pin is removed
- Test tried to get and clone a non-existent Pin
- **Impact**: Test failure after fixing issue #2
### 5. Incomplete Fixture Cleanup ❌
- `auto_unpatch` fixture only cleaned up **after** tests
- State from previous tests polluted subsequent tests
- **Impact**: Test failures dependent on execution order
## Testing
Before Fix
 ```
$ scripts/run-tests --venv 15fbf61 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
# Multiple failures:
# - test_ossystem_disabled[config1][py3.10]: AttributeError: 'NoneType'
object has no attribute '_clone'
# - test_ossystem_disabled[config1][py3.10]: AssertionError: assert 2 ==
1 (unexpected spans)
# - test_unpatch[py3.11]: AttributeError: 'NoneType' object has no
attribute '_clone'
 ```
          
After Fix
 ```
$ scripts/run-tests --venv 15fbf61 --venv 7ed64b0 -- --
tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
 ```

✅ Python 3.10: 488/488 tests passed
✅ Python 3.11: 485/485 tests passed
Verification
- ✅ Tests pass with the problematic random seed on both Python 3.10 and
3.11
- ✅ Tests pass in isolation
- ✅ Tests pass without random seed
- ✅ No regression in other test suites
---
Related Issues
Fixes flaky tests reported with random seed --randomly-seed=1219310116:
- test_ossystem_disabled[config1][py3.10] - Pin.get_from returns None
- test_ossystem_disabled[config1][py3.10] - Getting 2 spans instead of 1
- test_unpatch[py3.11] - 'NoneType' object has no attribute '_clone'
KowalskiThomas added a commit that referenced this pull request Jan 6, 2026
## Description

https://datadoghq.atlassian.net/browse/PROF-13112

This is an attempt to address the following crash. There seems to be a
case (that I wasn't able to reproduce in a Docker image, but maybe my
"code environment" didn't match the customer's exactly) where using
`uvloop` results in a crash caused by `PeriodicThread_start` after
`uvloop` tries to restart Threads after a fork.

```
#0   0x00007f9a7acdbefa cfree 
#1   0x00007f9a7accc6b5 pthread_create 
#2   0x00007f9a7a63aaa5 std::thread::_M_start_thread 
#3   0x00007f9a7a639d18 PeriodicThread_start 
#4   0x00007f9a2e71d565 __pyx_f_6uvloop_4loop_9UVProcess__after_fork (uvloop/loop.c:120214:3)
#5   0x00007f9a2e6369a8 __pyx_f_6uvloop_4loop___get_fork_handler (uvloop/loop.c:163075:24)
#6   0x00007f9a7ad17073 __fork 
#7   0x00007f9a2e732d62 uv__spawn_and_init_child_fork (src/unix/process.c:831:10)
#8   0x00007f9a2e732d62 uv__spawn_and_init_child (src/unix/process.c:919:9)
#9   0x00007f9a2e732d62 uv_spawn (src/unix/process.c:1013:18)
#10  0x00007f9a2e71fb87 __pyx_f_6uvloop_4loop_9UVProcess__init (uvloop/loop.c:119056:19)
#11  0x00007f9a2e711bf7 __pyx_f_6uvloop_4loop_18UVProcessTransport_new (uvloop/loop.c:126866:16)
#12  0x00007f9a2e712aa7 __pyx_gb_6uvloop_4loop_4Loop_116generator16 (uvloop/loop.c:54030:28)
#13  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#14  0x00007f9a2e699f8a __Pyx_Coroutine_AmSend (uvloop/loop.c:196492:18)
#15  0x00007f9a2e69a052 __Pyx_Coroutine_Yield_From_Coroutine (uvloop/loop.c:197380:14)
#16  0x00007f9a2e69b0e5 __Pyx_Coroutine_Yield_From (uvloop/loop.c:197408:16)
#17  0x00007f9a2e69b0e5 __pyx_gb_6uvloop_4loop_4Loop_122generator18 (uvloop/loop.c:55002:15)
#18  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#19  0x00007f9a2e69bb86 __Pyx_Generator_Next (uvloop/loop.c:196581:18)
#20  0x00007f9a2e6398eb __Pyx_PyObject_Call (uvloop/loop.c:191431:15)
#21  0x00007f9a2e6398eb __Pyx_PyObject_FastCallDict (uvloop/loop.c:191552:16)
#22  0x00007f9a2e715a69 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66873:27)
#23  0x00007f9a2e71996b __pyx_f_6uvloop_4loop_4Loop__on_idle (uvloop/loop.c:17975:25)
#24  0x00007f9a2e713e52 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66927:24)
#25  0x00007f9a2e715c88 __pyx_f_6uvloop_4loop_cb_idle_callback (uvloop/loop.c:87335:19)
#26  0x00007f9a2e731311 uv__run_idle (unix/loop-watcher.c:68:1)
#27  0x00007f9a2e72e647 uv_run (src/unix/core.c:439:5)
#28  0x00007f9a2e64fdb5 __pyx_f_6uvloop_4loop_4Loop__Loop__run (uvloop/loop.c:18458:23)
#29  0x00007f9a2e6b7e50 __pyx_f_6uvloop_4loop_4Loop__run (uvloop/loop.c:18876:18)
#30  0x00007f9a2e6c8cf0 __pyx_pf_6uvloop_4loop_4Loop_24run_forever (uvloop/loop.c:31528:18)
#31  0x00007f9a2e6c8cf0 __pyx_pw_6uvloop_4loop_4Loop_25run_forever (uvloop/loop.c:31331:13)
#32  0x00007f9a7b065c25 PyObject_VectorcallMethod 
#33  0x00007f9a2e6ccd60 __pyx_pf_6uvloop_4loop_4Loop_44run_until_complete (uvloop/loop.c:33768:23)
#34  0x00007f9a2e6ce591 __pyx_pw_6uvloop_4loop_4Loop_45run_until_complete (uvloop/loop.c:33318:13)
#35  0x00007f9a7b039358 PyObject_Vectorcall 
```

## Fix

The `_after_fork` boolean field marks that this thread object is in a
"post-fork zombie" state. When the flag is set to true, Thread methods
(e.g. `join`) become no-ops, because the underlying thread no longer
exists after the fork and we should not touch it. By checking that same
flag in `start`, we can tell that we are being asked to start a Thread
that doesn't really exist, and skip the call.
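
A plain-Python sketch of that guard (the real `PeriodicThread` lives in the
native extension, so the class below only illustrates the idea and its
constructor arguments are made up):

```python
# Sketch only: mark the object as a post-fork zombie in the child and make
# start()/join() no-ops once the flag is set. POSIX-only (register_at_fork).
import os
import threading
import time


class PeriodicThread:
    def __init__(self, target, interval=1.0):
        self._target = target
        self._interval = interval
        self._thread = None
        self._after_fork = False
        # The OS thread does not survive a fork, so in the child we flag
        # this object as a zombie instead of ever touching it again.
        os.register_at_fork(after_in_child=self._mark_after_fork)

    def _mark_after_fork(self):
        self._after_fork = True

    def start(self):
        if self._after_fork:
            return  # an after-fork hook must not revive a zombie thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def join(self, timeout=None):
        if self._after_fork or self._thread is None:
            return
        self._thread.join(timeout)

    def _run(self):
        while True:  # simplified periodic loop
            self._target()
            time.sleep(self._interval)
```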
dd-octo-sts bot pushed a commit that referenced this pull request Jan 6, 2026
## Description

https://datadoghq.atlassian.net/browse/PROF-13112

This is an attempt to address the following crash. There seems to be a
case (that I wasn't able to reproduce in a Docker image, but maybe my
"code environment" didn't match the customer's exactly) where using
`uvloop` results in a crash caused by `PeriodicThread_start` after
`uvloop` tries to restart Threads after a fork.

```
#0   0x00007f9a7acdbefa cfree
#1   0x00007f9a7accc6b5 pthread_create
#2   0x00007f9a7a63aaa5 std::thread::_M_start_thread
#3   0x00007f9a7a639d18 PeriodicThread_start
#4   0x00007f9a2e71d565 __pyx_f_6uvloop_4loop_9UVProcess__after_fork (uvloop/loop.c:120214:3)
#5   0x00007f9a2e6369a8 __pyx_f_6uvloop_4loop___get_fork_handler (uvloop/loop.c:163075:24)
#6   0x00007f9a7ad17073 __fork
#7   0x00007f9a2e732d62 uv__spawn_and_init_child_fork (src/unix/process.c:831:10)
#8   0x00007f9a2e732d62 uv__spawn_and_init_child (src/unix/process.c:919:9)
#9   0x00007f9a2e732d62 uv_spawn (src/unix/process.c:1013:18)
#10  0x00007f9a2e71fb87 __pyx_f_6uvloop_4loop_9UVProcess__init (uvloop/loop.c:119056:19)
#11  0x00007f9a2e711bf7 __pyx_f_6uvloop_4loop_18UVProcessTransport_new (uvloop/loop.c:126866:16)
#12  0x00007f9a2e712aa7 __pyx_gb_6uvloop_4loop_4Loop_116generator16 (uvloop/loop.c:54030:28)
#13  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#14  0x00007f9a2e699f8a __Pyx_Coroutine_AmSend (uvloop/loop.c:196492:18)
#15  0x00007f9a2e69a052 __Pyx_Coroutine_Yield_From_Coroutine (uvloop/loop.c:197380:14)
#16  0x00007f9a2e69b0e5 __Pyx_Coroutine_Yield_From (uvloop/loop.c:197408:16)
#17  0x00007f9a2e69b0e5 __pyx_gb_6uvloop_4loop_4Loop_122generator18 (uvloop/loop.c:55002:15)
#18  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#19  0x00007f9a2e69bb86 __Pyx_Generator_Next (uvloop/loop.c:196581:18)
#20  0x00007f9a2e6398eb __Pyx_PyObject_Call (uvloop/loop.c:191431:15)
#21  0x00007f9a2e6398eb __Pyx_PyObject_FastCallDict (uvloop/loop.c:191552:16)
#22  0x00007f9a2e715a69 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66873:27)
#23  0x00007f9a2e71996b __pyx_f_6uvloop_4loop_4Loop__on_idle (uvloop/loop.c:17975:25)
#24  0x00007f9a2e713e52 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66927:24)
#25  0x00007f9a2e715c88 __pyx_f_6uvloop_4loop_cb_idle_callback (uvloop/loop.c:87335:19)
#26  0x00007f9a2e731311 uv__run_idle (unix/loop-watcher.c:68:1)
#27  0x00007f9a2e72e647 uv_run (src/unix/core.c:439:5)
#28  0x00007f9a2e64fdb5 __pyx_f_6uvloop_4loop_4Loop__Loop__run (uvloop/loop.c:18458:23)
#29  0x00007f9a2e6b7e50 __pyx_f_6uvloop_4loop_4Loop__run (uvloop/loop.c:18876:18)
#30  0x00007f9a2e6c8cf0 __pyx_pf_6uvloop_4loop_4Loop_24run_forever (uvloop/loop.c:31528:18)
#31  0x00007f9a2e6c8cf0 __pyx_pw_6uvloop_4loop_4Loop_25run_forever (uvloop/loop.c:31331:13)
#32  0x00007f9a7b065c25 PyObject_VectorcallMethod
#33  0x00007f9a2e6ccd60 __pyx_pf_6uvloop_4loop_4Loop_44run_until_complete (uvloop/loop.c:33768:23)
#34  0x00007f9a2e6ce591 __pyx_pw_6uvloop_4loop_4Loop_45run_until_complete (uvloop/loop.c:33318:13)
#35  0x00007f9a7b039358 PyObject_Vectorcall
```

## Fix

The `_after_fork` boolean field marks that this thread object is in a
"post-fork zombie" state. When the flag is set to true, Thread methods
(e.g. `join`) become no-ops, because the underlying thread no longer
exists after the fork and we should not touch it. By checking that same
flag in `start`, we can tell that we are being asked to start a Thread
that doesn't really exist, and skip the call.

(cherry picked from commit 4c69fdd)
KowalskiThomas added a commit that referenced this pull request Jan 6, 2026
## Description

https://datadoghq.atlassian.net/browse/PROF-13112

This is an attempt to address the following crash. There seems to be a
case (that I wasn't able to reproduce in a Docker image, but maybe my
"code environment" didn't match the customer's exactly) where using
`uvloop` results in a crash caused by `PeriodicThread_start` after
`uvloop` tries to restart Threads after a fork.

```
#0   0x00007f9a7acdbefa cfree
#1   0x00007f9a7accc6b5 pthread_create
#2   0x00007f9a7a63aaa5 std::thread::_M_start_thread
#3   0x00007f9a7a639d18 PeriodicThread_start
#4   0x00007f9a2e71d565 __pyx_f_6uvloop_4loop_9UVProcess__after_fork (uvloop/loop.c:120214:3)
#5   0x00007f9a2e6369a8 __pyx_f_6uvloop_4loop___get_fork_handler (uvloop/loop.c:163075:24)
#6   0x00007f9a7ad17073 __fork
#7   0x00007f9a2e732d62 uv__spawn_and_init_child_fork (src/unix/process.c:831:10)
#8   0x00007f9a2e732d62 uv__spawn_and_init_child (src/unix/process.c:919:9)
#9   0x00007f9a2e732d62 uv_spawn (src/unix/process.c:1013:18)
#10  0x00007f9a2e71fb87 __pyx_f_6uvloop_4loop_9UVProcess__init (uvloop/loop.c:119056:19)
#11  0x00007f9a2e711bf7 __pyx_f_6uvloop_4loop_18UVProcessTransport_new (uvloop/loop.c:126866:16)
#12  0x00007f9a2e712aa7 __pyx_gb_6uvloop_4loop_4Loop_116generator16 (uvloop/loop.c:54030:28)
#13  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#14  0x00007f9a2e699f8a __Pyx_Coroutine_AmSend (uvloop/loop.c:196492:18)
#15  0x00007f9a2e69a052 __Pyx_Coroutine_Yield_From_Coroutine (uvloop/loop.c:197380:14)
#16  0x00007f9a2e69b0e5 __Pyx_Coroutine_Yield_From (uvloop/loop.c:197408:16)
#17  0x00007f9a2e69b0e5 __pyx_gb_6uvloop_4loop_4Loop_122generator18 (uvloop/loop.c:55002:15)
#18  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#19  0x00007f9a2e69bb86 __Pyx_Generator_Next (uvloop/loop.c:196581:18)
#20  0x00007f9a2e6398eb __Pyx_PyObject_Call (uvloop/loop.c:191431:15)
#21  0x00007f9a2e6398eb __Pyx_PyObject_FastCallDict (uvloop/loop.c:191552:16)
#22  0x00007f9a2e715a69 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66873:27)
#23  0x00007f9a2e71996b __pyx_f_6uvloop_4loop_4Loop__on_idle (uvloop/loop.c:17975:25)
#24  0x00007f9a2e713e52 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66927:24)
#25  0x00007f9a2e715c88 __pyx_f_6uvloop_4loop_cb_idle_callback (uvloop/loop.c:87335:19)
#26  0x00007f9a2e731311 uv__run_idle (unix/loop-watcher.c:68:1)
#27  0x00007f9a2e72e647 uv_run (src/unix/core.c:439:5)
#28  0x00007f9a2e64fdb5 __pyx_f_6uvloop_4loop_4Loop__Loop__run (uvloop/loop.c:18458:23)
#29  0x00007f9a2e6b7e50 __pyx_f_6uvloop_4loop_4Loop__run (uvloop/loop.c:18876:18)
#30  0x00007f9a2e6c8cf0 __pyx_pf_6uvloop_4loop_4Loop_24run_forever (uvloop/loop.c:31528:18)
#31  0x00007f9a2e6c8cf0 __pyx_pw_6uvloop_4loop_4Loop_25run_forever (uvloop/loop.c:31331:13)
#32  0x00007f9a7b065c25 PyObject_VectorcallMethod
#33  0x00007f9a2e6ccd60 __pyx_pf_6uvloop_4loop_4Loop_44run_until_complete (uvloop/loop.c:33768:23)
#34  0x00007f9a2e6ce591 __pyx_pw_6uvloop_4loop_4Loop_45run_until_complete (uvloop/loop.c:33318:13)
#35  0x00007f9a7b039358 PyObject_Vectorcall
```

## Fix

The `_after_fork` boolean field marks that this thread object is in a
"post-fork zombie" state. When the flag is set to true, Thread methods
(e.g. `join`) become no-ops, because the underlying thread no longer
exists after the fork and we should not touch it. By checking that same
flag in `start`, we can tell that we are being asked to start a Thread
that doesn't really exist, and skip the call.

(cherry picked from commit 4c69fdd)
dd-octo-sts bot pushed a commit that referenced this pull request Jan 6, 2026
## Description

https://datadoghq.atlassian.net/browse/PROF-13112

This is an attempt to address the following crash. There seems to be a
case (that I wasn't able to reproduce in a Docker image, but maybe my
"code environment" didn't match the customer's exactly) where using
`uvloop` results in a crash caused by `PeriodicThread_start` after
`uvloop` tries to restart Threads after a fork.

```
#0   0x00007f9a7acdbefa cfree
#1   0x00007f9a7accc6b5 pthread_create
#2   0x00007f9a7a63aaa5 std::thread::_M_start_thread
#3   0x00007f9a7a639d18 PeriodicThread_start
#4   0x00007f9a2e71d565 __pyx_f_6uvloop_4loop_9UVProcess__after_fork (uvloop/loop.c:120214:3)
#5   0x00007f9a2e6369a8 __pyx_f_6uvloop_4loop___get_fork_handler (uvloop/loop.c:163075:24)
#6   0x00007f9a7ad17073 __fork
#7   0x00007f9a2e732d62 uv__spawn_and_init_child_fork (src/unix/process.c:831:10)
#8   0x00007f9a2e732d62 uv__spawn_and_init_child (src/unix/process.c:919:9)
#9   0x00007f9a2e732d62 uv_spawn (src/unix/process.c:1013:18)
#10  0x00007f9a2e71fb87 __pyx_f_6uvloop_4loop_9UVProcess__init (uvloop/loop.c:119056:19)
#11  0x00007f9a2e711bf7 __pyx_f_6uvloop_4loop_18UVProcessTransport_new (uvloop/loop.c:126866:16)
#12  0x00007f9a2e712aa7 __pyx_gb_6uvloop_4loop_4Loop_116generator16 (uvloop/loop.c:54030:28)
#13  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#14  0x00007f9a2e699f8a __Pyx_Coroutine_AmSend (uvloop/loop.c:196492:18)
#15  0x00007f9a2e69a052 __Pyx_Coroutine_Yield_From_Coroutine (uvloop/loop.c:197380:14)
#16  0x00007f9a2e69b0e5 __Pyx_Coroutine_Yield_From (uvloop/loop.c:197408:16)
#17  0x00007f9a2e69b0e5 __pyx_gb_6uvloop_4loop_4Loop_122generator18 (uvloop/loop.c:55002:15)
#18  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#19  0x00007f9a2e69bb86 __Pyx_Generator_Next (uvloop/loop.c:196581:18)
#20  0x00007f9a2e6398eb __Pyx_PyObject_Call (uvloop/loop.c:191431:15)
#21  0x00007f9a2e6398eb __Pyx_PyObject_FastCallDict (uvloop/loop.c:191552:16)
#22  0x00007f9a2e715a69 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66873:27)
#23  0x00007f9a2e71996b __pyx_f_6uvloop_4loop_4Loop__on_idle (uvloop/loop.c:17975:25)
#24  0x00007f9a2e713e52 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66927:24)
#25  0x00007f9a2e715c88 __pyx_f_6uvloop_4loop_cb_idle_callback (uvloop/loop.c:87335:19)
#26  0x00007f9a2e731311 uv__run_idle (unix/loop-watcher.c:68:1)
#27  0x00007f9a2e72e647 uv_run (src/unix/core.c:439:5)
#28  0x00007f9a2e64fdb5 __pyx_f_6uvloop_4loop_4Loop__Loop__run (uvloop/loop.c:18458:23)
#29  0x00007f9a2e6b7e50 __pyx_f_6uvloop_4loop_4Loop__run (uvloop/loop.c:18876:18)
#30  0x00007f9a2e6c8cf0 __pyx_pf_6uvloop_4loop_4Loop_24run_forever (uvloop/loop.c:31528:18)
#31  0x00007f9a2e6c8cf0 __pyx_pw_6uvloop_4loop_4Loop_25run_forever (uvloop/loop.c:31331:13)
#32  0x00007f9a7b065c25 PyObject_VectorcallMethod
#33  0x00007f9a2e6ccd60 __pyx_pf_6uvloop_4loop_4Loop_44run_until_complete (uvloop/loop.c:33768:23)
#34  0x00007f9a2e6ce591 __pyx_pw_6uvloop_4loop_4Loop_45run_until_complete (uvloop/loop.c:33318:13)
#35  0x00007f9a7b039358 PyObject_Vectorcall
```

## Fix

The `_after_fork` boolean field marks that this thread object is in a
"post-fork zombie" state. When the flag is set to true, Thread methods
(e.g. `join`) become no-ops, because the underlying thread no longer
exists after the fork and we should not touch it. By checking that same
flag in `start`, we can tell that we are being asked to start a Thread
that doesn't really exist, and skip the call.

(cherry picked from commit 4c69fdd)
KowalskiThomas added a commit that referenced this pull request Jan 6, 2026
## Description

https://datadoghq.atlassian.net/browse/PROF-13112

This is an attempt to address the following crash. There seems to be a
case (that I wasn't able to reproduce in a Docker image, but maybe my
"code environment" didn't match the customer's exactly) where using
`uvloop` results in a crash caused by `PeriodicThread_start` after
`uvloop` tries to restart Threads after a fork.

```
#0   0x00007f9a7acdbefa cfree
#1   0x00007f9a7accc6b5 pthread_create
#2   0x00007f9a7a63aaa5 std::thread::_M_start_thread
#3   0x00007f9a7a639d18 PeriodicThread_start
#4   0x00007f9a2e71d565 __pyx_f_6uvloop_4loop_9UVProcess__after_fork (uvloop/loop.c:120214:3)
#5   0x00007f9a2e6369a8 __pyx_f_6uvloop_4loop___get_fork_handler (uvloop/loop.c:163075:24)
#6   0x00007f9a7ad17073 __fork
#7   0x00007f9a2e732d62 uv__spawn_and_init_child_fork (src/unix/process.c:831:10)
#8   0x00007f9a2e732d62 uv__spawn_and_init_child (src/unix/process.c:919:9)
#9   0x00007f9a2e732d62 uv_spawn (src/unix/process.c:1013:18)
#10  0x00007f9a2e71fb87 __pyx_f_6uvloop_4loop_9UVProcess__init (uvloop/loop.c:119056:19)
#11  0x00007f9a2e711bf7 __pyx_f_6uvloop_4loop_18UVProcessTransport_new (uvloop/loop.c:126866:16)
#12  0x00007f9a2e712aa7 __pyx_gb_6uvloop_4loop_4Loop_116generator16 (uvloop/loop.c:54030:28)
#13  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#14  0x00007f9a2e699f8a __Pyx_Coroutine_AmSend (uvloop/loop.c:196492:18)
#15  0x00007f9a2e69a052 __Pyx_Coroutine_Yield_From_Coroutine (uvloop/loop.c:197380:14)
#16  0x00007f9a2e69b0e5 __Pyx_Coroutine_Yield_From (uvloop/loop.c:197408:16)
#17  0x00007f9a2e69b0e5 __pyx_gb_6uvloop_4loop_4Loop_122generator18 (uvloop/loop.c:55002:15)
#18  0x00007f9a2e631419 __Pyx_Coroutine_SendEx (uvloop/loop.c:196315:14)
#19  0x00007f9a2e69bb86 __Pyx_Generator_Next (uvloop/loop.c:196581:18)
#20  0x00007f9a2e6398eb __Pyx_PyObject_Call (uvloop/loop.c:191431:15)
#21  0x00007f9a2e6398eb __Pyx_PyObject_FastCallDict (uvloop/loop.c:191552:16)
#22  0x00007f9a2e715a69 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66873:27)
#23  0x00007f9a2e71996b __pyx_f_6uvloop_4loop_4Loop__on_idle (uvloop/loop.c:17975:25)
#24  0x00007f9a2e713e52 __pyx_f_6uvloop_4loop_6Handle__run (uvloop/loop.c:66927:24)
#25  0x00007f9a2e715c88 __pyx_f_6uvloop_4loop_cb_idle_callback (uvloop/loop.c:87335:19)
#26  0x00007f9a2e731311 uv__run_idle (unix/loop-watcher.c:68:1)
#27  0x00007f9a2e72e647 uv_run (src/unix/core.c:439:5)
#28  0x00007f9a2e64fdb5 __pyx_f_6uvloop_4loop_4Loop__Loop__run (uvloop/loop.c:18458:23)
#29  0x00007f9a2e6b7e50 __pyx_f_6uvloop_4loop_4Loop__run (uvloop/loop.c:18876:18)
#30  0x00007f9a2e6c8cf0 __pyx_pf_6uvloop_4loop_4Loop_24run_forever (uvloop/loop.c:31528:18)
#31  0x00007f9a2e6c8cf0 __pyx_pw_6uvloop_4loop_4Loop_25run_forever (uvloop/loop.c:31331:13)
#32  0x00007f9a7b065c25 PyObject_VectorcallMethod
#33  0x00007f9a2e6ccd60 __pyx_pf_6uvloop_4loop_4Loop_44run_until_complete (uvloop/loop.c:33768:23)
#34  0x00007f9a2e6ce591 __pyx_pw_6uvloop_4loop_4Loop_45run_until_complete (uvloop/loop.c:33318:13)
#35  0x00007f9a7b039358 PyObject_Vectorcall
```

## Fix

The `_after_fork` boolean field marks that this thread object is in a
"post-fork zombie" state. When the flag is set to true, Thread methods
(e.g. `join`) become no-ops, because the underlying thread no longer
exists after the fork and we should not touch it. By checking that same
flag in `start`, we can tell that we are being asked to start a Thread
that doesn't really exist, and skip the call.

(cherry picked from commit 4c69fdd)
KowalskiThomas added a commit that referenced this pull request Jan 7, 2026
KowalskiThomas added a commit that referenced this pull request Jan 7, 2026
…15858)

Backport 4c69fdd from #15798 to 4.1.

KowalskiThomas added a commit that referenced this pull request Jan 7, 2026
KowalskiThomas added a commit that referenced this pull request Jan 7, 2026
…15861)

Backport 4c69fdd from #15798 to 4.0.

kianjones9 pushed a commit to kianjones9/dd-trace-py that referenced this pull request Jan 9, 2026
## Problem
FLAKY TESTS IDS: DD_0CHY3W DD_NMRUZV DD_UPYTSO DD_9ICU9D DD_084DFM 
The `test_subprocess.py` tests were flaky, failing intermittently with
specific random seeds (e.g., `--randomly-seed=1219310116`). The failures
manifested in two ways:
1. **`Pin.get_from(os)` returning `None`**: Tests failed with
`AttributeError: 'NoneType' object has no attribute '_clone'`
2. **Incorrect span counts**: Tests expecting 1 span received 2 spans,
or vice versa
The root cause was **inadequate cleanup between tests**, leading to
state pollution that caused subsequent tests to fail depending on
execution order.
## Root Cause Analysis
Through systematic investigation with the problematic random seed, we
identified **six distinct issues** (a cleanup sketch follows the list):
### 1. Missing `os.fork` in `unpatch()` ❌
- `os.fork` was patched but never unpatched
- Left wrapped functions persisting between tests
- **Impact**: Stale wrappers with outdated closure state
### 2. Missing Pin Removal in `unpatch()` ❌
- Pin objects were never removed from `os` and `subprocess` modules
- **Impact**: Tests expecting no Pin would find leftover Pins from
previous tests
### 3. Test Couldn't Handle None Pin ❌
- When `asm_config._load_modules=False`, `patch()` returns early without
setting a Pin
- Tests blindly called `Pin.get_from(os)._clone()` causing
AttributeError
- **Impact**: Immediate test failure
### 4. `test_unpatch` Tried to Use Removed Pin ❌
- After proper unpatch cleanup, Pin is removed
- Test tried to get and clone a non-existent Pin
- **Impact**: Test failure after fixing issue DataDog#2
### 5. Incomplete Fixture Cleanup ❌
- `auto_unpatch` fixture only cleaned up **after** tests
- State from previous tests polluted subsequent tests
- **Impact**: Test failures dependent on execution order
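
As a rough sketch of the cleanup these items call for, assuming pins are stored on the patched modules under an attribute such as `_datadog_pin`, that an `originals` mapping holds the unwrapped callables, and that `Pin` is importable from `ddtrace.trace` (all of these are assumptions for illustration, not the actual patch):

```python
import os
import subprocess

import pytest

from ddtrace.trace import Pin  # import path is an assumption; older releases expose ddtrace.Pin


def unpatch(originals):
    # Restore every wrapped callable, including os.fork (previously missed).
    for module, name in ((os, "system"), (os, "fork"), (subprocess, "Popen")):
        original = originals.get(f"{module.__name__}.{name}")
        if original is not None:
            setattr(module, name, original)
    # Drop any leftover Pin so later tests start from a clean slate
    # ("_datadog_pin" is an assumed storage attribute).
    for module in (os, subprocess):
        if hasattr(module, "_datadog_pin"):
            delattr(module, "_datadog_pin")


def get_pin_or_skip(module):
    # Tests should tolerate a missing Pin, e.g. when patch() returned early
    # because module loading was disabled.
    pin = Pin.get_from(module)
    if pin is None:
        pytest.skip("module is not patched; no Pin available")
    return pin
```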
## Testing
Before Fix
```
$ scripts/run-tests --venv 15fbf61 -- -- tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
# Multiple failures:
# - test_ossystem_disabled[config1][py3.10]: AttributeError: 'NoneType' object has no attribute '_clone'
# - test_ossystem_disabled[config1][py3.10]: AssertionError: assert 2 == 1 (unexpected spans)
# - test_unpatch[py3.11]: AttributeError: 'NoneType' object has no attribute '_clone'
```
After Fix
```
$ scripts/run-tests --venv 15fbf61 --venv 7ed64b0 -- -- tests/contrib/subprocess/test_subprocess.py --randomly-seed=1219310116
```

✅ Python 3.10: 488/488 tests passed
✅ Python 3.11: 485/485 tests passed
Verification
- ✅ Tests pass with the problematic random seed on both Python 3.10 and
3.11
- ✅ Tests pass in isolation
- ✅ Tests pass without random seed
- ✅ No regression in other test suites
---
Related Issues
Fixes flaky tests reported with random seed --randomly-seed=1219310116:
- test_ossystem_disabled[config1][py3.10] - Pin.get_from returns None
- test_ossystem_disabled[config1][py3.10] - Getting 2 spans instead of 1
- test_unpatch[py3.11] - 'NoneType' object has no attribute '_clone'
kianjones9 pushed a commit to kianjones9/dd-trace-py that referenced this pull request Jan 9, 2026