Segmentation fault on inserts #156

Closed
dave11ar opened this issue May 16, 2023 · 8 comments

@dave11ar

dave11ar commented May 16, 2023

When I run my benchmarks on processors with a large number of cores:

  • Kunpeng-920 with 2 NUMA nodes of 24 cores each
  • Intel Xeon Gold 6338 with 2 NUMA nodes of 24 cores each

I get a segmentation fault at line 196 of bucket_container.hh (bucket_container operator[]).
I was able to reproduce this with a slightly modified version of the stress tests from your repo (initial capacity of the tables changed from g_numkeys to 0) by running stress_checked.cc with the following flags:
--power 16 --thread-num 48 --time 15 --disable-deletes --disable-updates --disable-finds

This issue occurs on the v0.3.1 tag.
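
For readers who want to picture the workload described above, here is a minimal sketch of an insert-only run in which many threads insert disjoint keys into a table that starts small. It is not the code from stress_checked.cc. `LockedMap` and `concurrent_insert` are hypothetical names introduced only for illustration, and `LockedMap` is a mutex-guarded stand-in so the sketch builds with the standard library alone.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in so the sketch builds without libcuckoo.
// It only mimics the one call used below: insert(key, value) -> bool.
class LockedMap {
 public:
  bool insert(std::uint32_t key, std::uint32_t value) {
    std::lock_guard<std::mutex> guard(mutex_);
    return map_.emplace(key, value).second;
  }
  std::size_t size() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return map_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::unordered_map<std::uint32_t, std::uint32_t> map_;
};

// Insert-only workload: each thread inserts its own disjoint key range,
// so every insert is expected to succeed. Returns the number of
// successful inserts across all threads.
template <typename Map>
std::size_t concurrent_insert(Map& map, unsigned num_threads,
                              std::uint32_t keys_per_thread) {
  assert(num_threads > 0);
  std::atomic<std::size_t> inserted{0};
  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (unsigned t = 0; t < num_threads; ++t) {
    workers.emplace_back([&map, &inserted, t, keys_per_thread] {
      const std::uint32_t first = t * keys_per_thread;
      for (std::uint32_t i = 0; i < keys_per_thread; ++i) {
        if (map.insert(first + i, i)) {
          inserted.fetch_add(1, std::memory_order_relaxed);
        }
      }
    });
  }
  for (auto& worker : workers) worker.join();
  return inserted.load();
}
```

Assuming libcuckoo's `cuckoohash_map` accepts an initial capacity as its first constructor argument and exposes `insert(key, value)` returning `bool`, passing a `libcuckoo::cuckoohash_map<std::uint32_t, std::uint32_t>` constructed with capacity 0 and calling `concurrent_insert` with 48 threads would approximate the reported configuration. The stand-in map itself will not crash.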

@manugoyal
Contributor

Hi @dave11ar, thanks for reporting!

I tried running your repro on my own machine (details below), but failed to trigger any crashes. I tried 100 runs with your exact parameterization, and 100 runs with 20 threads, which is the number of CPUs on my machine. If you're able to provide any more reproduction or debugging tips, I'd be happy to try digging further.

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          20
On-line CPU(s) list:             0-19
Thread(s) per core:              2
Core(s) per socket:              10
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           79
Model name:                      Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Stepping:                        1
CPU MHz:                         1198.020
CPU max MHz:                     3100.0000
CPU min MHz:                     1200.0000
BogoMIPS:                        4389.76
Virtualization:                  VT-x
L1d cache:                       320 KiB
L1i cache:                       320 KiB
L2 cache:                        2.5 MiB
L3 cache:                        25 MiB
NUMA node0 CPU(s):               0-19

@Explosiontime202

Explosiontime202 commented Jun 26, 2023

I have reproduced the described bug.

Here's the backtrace:
0x000000000049dc02 in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::slot_search<std::integral_constant<bool, false> > (this=0x4e22b0, hp=10, i1=102, i2=233) at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:1648
1648	        if (!b.occupied(slot)) {
(gdb) backtrace
#0  0x000000000049dc02 in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::slot_search<std::integral_constant<bool, false> > (this=0x4e22b0, hp=10, i1=102, i2=233)
    at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:1648
#1  0x00000000004981af in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::cuckoopath_search<std::integral_constant<bool, false> > (this=0x4e22b0, hp=10, cuckoo_path=..., i1=102, i2=233)
    at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:1419
#2  0x0000000000491e0b in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::run_cuckoo<std::integral_constant<bool, false> > (this=0x4e22b0, b=..., insert_bucket=@0x7ffff5199b18: 0, insert_slot=@0x7ffff5199b10: 0)
    at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:1381
#3  0x000000000048c84f in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::cuckoo_insert<std::integral_constant<bool, false>, unsigned int> (this=0x4e22b0, hv=..., b=..., key=@0x7ffff5199d6c: 62566)
    at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:1259
#4  0x000000000048875a in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::cuckoo_insert_loop<std::integral_constant<bool, false>, unsigned int> (this=0x4e22b0, hv=..., b=..., key=@0x7ffff5199d6c: 62566)
    at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:1197
#5  0x0000000000487b3d in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::uprase_fn<unsigned int&, libcuckoo::internal::UpsertToUpraseFn<libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::insert<unsigned int&, unsigned int&>(unsigned int&, unsigned int&)::{lambda(unsigned int&)#1}, unsigned int, false>, unsigned int&>(unsigned int&, libcuckoo::internal::UpsertToUpraseFn<libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::insert<unsigned int&, unsigned int&>(unsigned int&, unsigned int&)::{lambda(unsigned int&)#1}, unsigned int, false>, unsigned int&) (this=0x4e22b0, 
    key=@0x7ffff5199d6c: 62566, fn=...) at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:558
#6  0x0000000000483f66 in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::upsert<unsigned int&, libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::insert<unsigned int&, unsigned int&>(unsigned int&, unsigned int&)::{lambda(unsigned int&)#1}, unsigned int&>(unsigned int&, libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::insert<unsigned int&, unsigned int&>(unsigned int&, unsigned int&)::{lambda(unsigned int&)#1}, unsigned int&) (this=0x4e22b0, key=@0x7ffff5199d6c: 62566, fn=...) at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:588
#7  0x00000000004801d3 in libcuckoo::cuckoohash_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, 4ul>::insert<unsigned int&, unsigned int&> (this=0x4e22b0, key=@0x7ffff5199d6c: 62566) at /home/johannes/Documents/module/imlab/project/libcuckoo/libcuckoo/cuckoohash_map.hh:646
#8  0x000000000047bff8 in stress_insert_thread<unsigned int> (env=0x4e22b0) at /home/johannes/Documents/module/imlab/project/libcuckoo/tests/stress-tests/stress_checked.cc:118
#9  0x00000000004a9fe0 in std::__invoke_impl<void, void (*)(AllEnvironment<unsigned int>*), AllEnvironment<unsigned int>*> (
    __f=@0x503310: 0x47bef2 <stress_insert_thread<unsigned int>(AllEnvironment<unsigned int>*)>) at /usr/include/c++/13/bits/invoke.h:61
#10 0x00000000004a999b in std::__invoke<void (*)(AllEnvironment<unsigned int>*), AllEnvironment<unsigned int>*> (
    __fn=@0x503310: 0x47bef2 <stress_insert_thread<unsigned int>(AllEnvironment<unsigned int>*)>) at /usr/include/c++/13/bits/invoke.h:96
#11 0x00000000004a9169 in std::thread::_Invoker<std::tuple<void (*)(AllEnvironment<unsigned int>*), AllEnvironment<unsigned int>*> >::_M_invoke<0ul, 1ul> (this=0x503308)
    at /usr/include/c++/13/bits/std_thread.h:292
#12 0x00000000004a8d3e in std::thread::_Invoker<std::tuple<void (*)(AllEnvironment<unsigned int>*), AllEnvironment<unsigned int>*> >::operator() (this=0x503308)
    at /usr/include/c++/13/bits/std_thread.h:299
#13 0x00000000004a8c22 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(AllEnvironment<unsigned int>*), AllEnvironment<unsigned int>*> > >::_M_run (this=0x503300)
    at /usr/include/c++/13/bits/std_thread.h:244
#14 0x00007ffff7ce31f3 in std::execute_native_thread_routine (__p=0x503300) at ../../../../../libstdc++-v3/src/c++11/thread.cc:104
#15 0x00007ffff7aae907 in start_thread (arg=<optimized out>) at pthread_create.c:444
#16 0x00007ffff7b34870 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
And my CPU specs:
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  20
  On-line CPU(s) list:   0-19
Vendor ID:               GenuineIntel
  Model name:            12th Gen Intel(R) Core(TM) i7-12700H
    CPU family:          6
    Model:               154
    Thread(s) per core:  2
    Core(s) per socket:  14
    Socket(s):           1
    Stepping:            3
    CPU(s) scaling MHz:  35%
    CPU max MHz:         4700.0000
    CPU min MHz:         400.0000
    BogoMIPS:            5376.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
                         a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss 
                         ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
                          arch_perfmon pebs bts rep_good nopl xtopology nonstop_
                         tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes6
                         4 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xt
                         pr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_
                         timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefet
                         ch cpuid_fault epb cat_l2 cdp_l2 ssbd ibrs ibpb stibp i
                         brs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_
                         ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
                          rdt_a rdseed adx smap clflushopt clwb intel_pt sha_ni 
                         xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vn
                         ni dtherm ida arat pln pts hwp hwp_notify hwp_act_windo
                         w hwp_epp hwp_pkg_req hfi umip pku ospke waitpkg gfni v
                         aes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear se
                         rialize arch_lbr ibt flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   544 KiB (14 instances)
  L1i:                   704 KiB (14 instances)
  L2:                    11.5 MiB (8 instances)
  L3:                    24 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-19
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer
                          sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional
                         , RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

If you need more information from me, please reach out.

@manugoyal
Contributor

Thanks @Explosiontime202. It seems we have very similar machines, but I'm still having trouble reproducing the crash. A few questions.

  • Can you share the command you ran to reproduce? Was it ./tests/stress-tests/stress_checked --power 16 --thread-num 48 --time 15 --disable-deletes --disable-updates --disable-finds?
  • Are you running in debug mode (your backtrace seems quite detailed)? Would you mind just sharing your whole compilation command? I was building using CMake so it was
cd /home/manu/third_party/libcuckoo/build/tests/stress-tests && /usr/bin/c++   -I/home/manu/third_party/libcuckoo/tests -I/home/manu/third_party/libcuckoo/build/tests -I/home/manu/third_party/libcuckoo  -g   -pthread -o CMakeFiles/stress_checked.dir/stress_checked.cc.o -c /home/manu/third_party/libcuckoo/tests/stress-tests/stress_checked.cc
cd /home/manu/third_party/libcuckoo/build/tests/stress-tests && /usr/bin/cmake -E cmake_link_script CMakeFiles/stress_checked.dir/link.txt --verbose=1
/usr/bin/c++  -g  -rdynamic CMakeFiles/stress_checked.dir/stress_checked.cc.o  -o stress_checked  -pthread

Where my compiler version is

manu@manu-desktop:~/third_party/libcuckoo/build$ c++ --version
c++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@ivafanas

Hi,

We are facing the same issue on the x86_64 and e2k architectures. This command line is usually enough to reproduce it:

for i in 1 2 3 4 5 6 7 8 9 10; do "universal_benchmark" "--inserts" "100" "--initial-capacity" "4" "--total-ops" "13107200" || break; done

The test finishes with a "Segmentation fault" message.
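
When a crash only shows up on some runs, it can help to tell a SIGSEGV apart from an ordinary non-zero exit. The following is a minimal POSIX sketch of that idea, not part of libcuckoo or its test suite. `terminating_signal` and `first_segfault_run` are hypothetical helper names. Unlike the shell loop above, which stops on any failure, this sketch stops only when the child process is killed by SIGSEGV.

```cpp
#include <cassert>
#include <csignal>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs args[0] with the given arguments as a child process.
// Returns the signal number that terminated the child, 0 if it exited
// on its own (with any exit code), or -1 if it could not be started
// or waited for.
int terminating_signal(const std::vector<std::string>& args) {
  if (args.empty()) return -1;
  std::vector<char*> argv;
  for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
  argv.push_back(nullptr);

  const pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    execvp(argv[0], argv.data());
    _exit(127);  // only reached if exec failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

// Repeats the command up to max_runs times. Returns the 1-based index
// of the first run killed by SIGSEGV, or 0 if no run segfaulted.
int first_segfault_run(const std::vector<std::string>& args, int max_runs) {
  assert(max_runs > 0);
  for (int run = 1; run <= max_runs; ++run) {
    if (terminating_signal(args) == SIGSEGV) return run;
  }
  return 0;
}
```

Calling `first_segfault_run` with the benchmark's path and the flags quoted above, and a run count of 10, would mirror the loop in this comment.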

@manugoyal
Contributor

manugoyal commented Nov 25, 2024

Thanks @ivafanas! I was able to get your repro to segfault on an x86_64 ec2 machine, with a stack trace that resembles reports from above! Will dig into it.

Update: bisected to 95e558d

@manugoyal
Contributor

Hi @ivafanas, @Explosiontime202, @dave11ar. I put up a fix for the segfault in 91b9c2d. I was able to reproduce the issue on an ec2 machine (using @ivafanas's repro) and verified that it no longer shows up over hundreds of runs.

If you have a chance to run it on your own machines and verify, that would be great. Thanks for reporting!

@ivafanas

Hi,

We have been testing libcuckoo regularly on our hardware for a month since the fix, and there have been no more crashes.
It seems to be fixed.

Thank you!

@manugoyal
Contributor

That's so great to hear @ivafanas, thank you! Will mark as closed.

4 participants