-
Notifications
You must be signed in to change notification settings - Fork 476
fix(profiling): fix data race when accessing span for thread [backport 2.12] #11215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sanchda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.12 is a bit old, but this is a potential segfault and 2.12 is one of those releases where Profiling has specifically directed customers toward.
The overall code looks fine, but I don't know whether CI will pass. IMHO, if this doesn't pass and it isn't a priority to fix 2.12 CI for the tracer team, then maybe it isn't critical to push on this. 🤷
sanchda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.12 is a bit old, but this is a potential segfault and 2.12 is one of those releases where Profiling has specifically directed customers toward.
The overall code looks fine, but I don't know whether CI will pass. IMHO, if this doesn't pass and it isn't a priority to fix 2.12 CI for the tracer team, then maybe it isn't critical to push on this. 🤷
The ThreadSpanLinks singleton holds the active span (if one exists) for
a given thread ID. The `get_active_span_from_thread_id` member function
returns a pointer to the active span for a thread. The `link_span`
member function sets the active span for a thread.
`get_active_span_from_thread_id` accesses the map of spans under a
mutex, but returns the pointer after releasing the mutex, meaning
`link_span` can modify the members of the Span while the caller of
`get_active_span_from_thread_id` is reading them.
Fix this by returning a copy of the `Span`. Use a `std::optional` to wrap
the return value of `get_active_span_from_thread_id`, rather than
returning a pointer. We want to tell whether or not there actually was a
span associated with the thread, but returning a pointer would require
us to heap allocate the copy of the Span.
I added a simplistic regression test which fails reliably without this
fix when built with the thread sanitizer enabled. Output like:
```
WARNING: ThreadSanitizer: data race (pid=2971510)
Read of size 8 at 0x7b2000004080 by thread T2:
#0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
#1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
#2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (libstdc++.so.6+0x1432b4)
#3 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::__invoke_impl<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__invoke_other, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe46e)
#4 std::__invoke_result<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>::type std::__invoke<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*&&)()) <null> (thread_span_links+0xe2fe)
#5 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe1cf)
#6 std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> >::operator()() <null> (thread_span_links+0xe0f6)
#7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (*)()> > >::_M_run() <null> (thread_span_links+0xdf40)
#8 <null> <null> (libstdc++.so.6+0xd6df3)
Previous write of size 8 at 0x7b2000004080 by thread T1 (mutexes: write M47):
#0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:823 (libtsan.so.0+0x42313)
#1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:815 (libtsan.so.0+0x42313)
#2 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocato
r<char> > const&) <null> (libstdc++.so.6+0x1432b4)
#3 get() <null> (thread_span_links+0xb570)
#4 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void (*&&)()) <null> (thread_span_links+0xe525)
#5 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void (*&&)()) <null> (thread_span_links+0xe3b5)
#6 void std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (thread_span_links+0xe242)
#7 std::thread::_Invoker<std::tuple<void (*)()> >::operator()() <null> (thread_span_links+0xe158)
[ ... etc ... ]
```
(cherry picked from commit 64b3374)
83c5570 to
000d9f1
Compare
Datadog ReportBranch report: ✅ 0 Failed, 3714 Passed, 46 Skipped, 1h 34m 32.79s Total duration (8m 3.9s time saved) |
Backport 64b3374 from #11167 to 2.12.
The ThreadSpanLinks singleton holds the active span (if one exists) for
a given thread ID. The
get_active_span_from_thread_idmember functionreturns a pointer to the active span for a thread. The
link_spanmember function sets the active span for a thread.
get_active_span_from_thread_idaccesses the map of spans under amutex, but returns the pointer after releasing the mutex, meaning
link_spancan modify the members of the Span while the caller ofget_active_span_from_thread_idis reading them.Fix this by returning a copy of the
Span. Use astd::optionalto wrapthe return value of
get_active_span_from_thread_id, rather thanreturning a pointer. We want to tell whether or not there actually was a
span associated with the thread, but returning a pointer would require
us to heap allocate the copy of the Span.
I added a simplistic regression test which fails reliably without this
fix when built with the thread sanitizer enabled. Output like:
(cherry picked from commit 64b3374)
Checklist
Reviewer Checklist