Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BugFix] fix be exit probelm #44162

Merged
merged 1 commit into from
Apr 17, 2024
Merged

Conversation

before-Sunrise
Copy link
Contributor

@before-Sunrise before-Sunrise commented Apr 16, 2024

Why I'm doing:

introduced by #42351

https://github.com/StarRocks/StarRocksTest/issues/7060 one of its' problem is be can not exit gracefully:

==9909==ERROR: AddressSanitizer: heap-use-after-free on address 0x615000e7fdec at pc 0x0000118ba70c bp 0x7fff44cd15b0 sp 0x7fff44cd15a8
WRITE of size 4 at 0x615000e7fdec thread T0
    #0 0x118ba70b in decltype (::new ((void*)(0)) int((declval<int const&>)())) std::construct_at<int, int const&>(int*, int const&) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_construct.h:97
    #1 0x118ba745 in void std::allocator_traits<std::allocator<int> >::construct<int, int const&>(std::allocator<int>&, int*, int const&) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/alloc_traits.h:514
    #2 0x118b492f in std::deque<int, std::allocator<int> >::push_back(int const&) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:1498
    #3 0x118b6000 in starrocks::CoreLocalValueController<long>::reclaim_id(int) /root/starrocks/be/src/util/core_local.h:85
    #4 0x118add97 in starrocks::CoreLocalValue<long>::~CoreLocalValue() /root/starrocks/be/src/util/core_local.h:125
    #5 0x1191faf6 in starrocks::CoreLocalCounter<long>::~CoreLocalCounter() /root/starrocks/be/src/util/metrics.h:169
    #6 0x1c72240f in starrocks::ThreadPool::~ThreadPool() /root/starrocks/be/src/util/threadpool.cpp:255
    #7 0x11de1535 in std::default_delete<starrocks::ThreadPool>::operator()(starrocks::ThreadPool*) const /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/unique_ptr.h:85
    #8 0x11dd7e5a in std::unique_ptr<starrocks::ThreadPool, std::default_delete<starrocks::ThreadPool> >::~unique_ptr() /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/unique_ptr.h:361
    #9 0x11dca181 in starrocks::ExecEnv::~ExecEnv() /root/starrocks/be/src/runtime/exec_env.h:253
    #10 0x7fef34759ce8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)
    #11 0x7fef34759d36 in exit (/lib64/libc.so.6+0x39d36)
    #12 0x7fef3474255b in __libc_start_main (/lib64/libc.so.6+0x2255b)
    #13 0x11796028  (/home/disk1/sr/be/lib/starrocks_be+0x11796028)

0x615000e7fdec is located 492 bytes inside of 512-byte region [0x615000e7fc00,0x615000e7fe00)
freed by thread T0 here:
    #0 0x1181e1e7 in operator delete(void*) ../../.././libsanitizer/asan/asan_new_delete.cpp:160
    #1 0x118be39f in __gnu_cxx::new_allocator<int>::deallocate(int*, unsigned long) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/ext/new_allocator.h:133
    #2 0x118b0ed8 in std::allocator<int>::deallocate(int*, unsigned long) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/allocator.h:187
    #3 0x118b0ed8 in std::allocator_traits<std::allocator<int> >::deallocate(std::allocator<int>&, int*, unsigned long) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/alloc_traits.h:492
    #4 0x118c04db in std::_Deque_base<int, std::allocator<int> >::_M_deallocate_node(int*) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:566
    #5 0x118c012e in std::_Deque_base<int, std::allocator<int> >::_M_destroy_nodes(int**, int**) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:676
    #6 0x118ba59b in std::_Deque_base<int, std::allocator<int> >::~_Deque_base() /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:598
    #7 0x118bbe28 in std::deque<int, std::allocator<int> >::~deque() /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:1004
    #8 0x118b5e73 in starrocks::CoreLocalValueController<long>::~CoreLocalValueController() /root/starrocks/be/src/util/core_local.h:70
    #9 0x7fef34759ce8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)

previously allocated by thread T0 here:
    #0 0x1181d817 in operator new(unsigned long) ../../.././libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x118c6b1a in __gnu_cxx::new_allocator<int>::allocate(unsigned long, void const*) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/ext/new_allocator.h:115
    #2 0x118bb5d6 in std::allocator<int>::allocate(unsigned long) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/allocator.h:173
    #3 0x118bb5d6 in std::allocator_traits<std::allocator<int> >::allocate(std::allocator<int>&, unsigned long) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/alloc_traits.h:460
    #4 0x118c0621 in std::_Deque_base<int, std::allocator<int> >::_M_allocate_node() /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:559
    #5 0x118ba7dc in void std::deque<int, std::allocator<int> >::_M_push_back_aux<int const&>(int const&) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/deque.tcc:494
    #6 0x118b497d in std::deque<int, std::allocator<int> >::push_back(int const&) /opt/rh/gcc-toolset-10/root/usr/include/c++/10.3.0/bits/stl_deque.h:1503
    #7 0x118b6000 in starrocks::CoreLocalValueController<long>::reclaim_id(int) /root/starrocks/be/src/util/core_local.h:85
    #8 0x118add97 in starrocks::CoreLocalValue<long>::~CoreLocalValue() /root/starrocks/be/src/util/core_local.h:125
    #9 0x1191faf6 in starrocks::CoreLocalCounter<long>::~CoreLocalCounter() /root/starrocks/be/src/util/metrics.h:169
    #10 0x118a0301 in starrocks::StarRocksMetrics::~StarRocksMetrics() /root/starrocks/be/src/util/starrocks_metrics.h:89
    #11 0x7fef34759ce8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)

p (ThreadPool)(0x61600010d780)

$3 = {_name = {static npos = 18446744073709551615,
    _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
      _M_p = 0x60300037d0b0 "automatic_partition"}, _M_string_length = 19, {
      _M_local_buf = "\023\000\000\000\000\000\000\000\276\276\276\276\276\276\276\276", _M_allocated_capacity = 19}}, _min_threads = 0,
  _max_threads = {<std::__atomic_base<int>> = {static _S_alignment = 4, _M_i = 8}, static is_always_lock_free = true}, _max_queue_size = 1000,
  _idle_timeout = {static kUninitialized = -9223372036854775808, nano_delta_ = 2000000000}, _pool_status = {_state = 0x604004235f50 "\034"},

What I'm doing:

becase ThreadPool has CoreLocalCounter as member variable, so every ThreadPool have to be released before CoreLocalValueController which is a static single instance. I just reset two ThreadPool which use smart pointer in ExecEnv::destory, so these two thread pools will be released before CoreLocalValueController

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

Signed-off-by: before-Sunrise <[email protected]>
@before-Sunrise before-Sunrise requested a review from a team as a code owner April 16, 2024 08:50
@kevincai
Copy link
Contributor

what PR introduced this exit core?

@before-Sunrise
Copy link
Contributor Author

what PR introduced this exit core?

added in description

Copy link
Contributor

@kevincai kevincai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trueeyu how can we add some force checks on this thread pool cleanup so it can be found at the earliest.

Copy link

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

Copy link

[BE Incremental Coverage Report]

pass : 2 / 2 (100.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/runtime/exec_env.cpp 2 2 100.00% []

@trueeyu trueeyu merged commit 84a6e96 into StarRocks:main Apr 17, 2024
80 checks passed
@before-Sunrise
Copy link
Contributor Author

@mergify backport branch-3.3

Copy link
Contributor

mergify bot commented Apr 17, 2024

backport branch-3.3

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Apr 17, 2024
Signed-off-by: before-Sunrise <[email protected]>
(cherry picked from commit 84a6e96)
trueeyu pushed a commit that referenced this pull request Apr 17, 2024
before-Sunrise added a commit to before-Sunrise/starrocks that referenced this pull request Apr 17, 2024
trueeyu pushed a commit that referenced this pull request Apr 17, 2024
leiyang0324 pushed a commit to leiyang0324/starrocks that referenced this pull request Apr 19, 2024
wanpengfei-git pushed a commit that referenced this pull request May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants