libomp tests on s390x sometimes extremely slow #116215

Open
nikic opened this issue Nov 14, 2024 · 2 comments
Labels
openmp:libomp OpenMP host runtime

Comments

nikic commented Nov 14, 2024

We've observed that running the openmp tests on s390x is sometimes extremely slow; for example, they ran for more than 8 hours here:

Slowest Tests:
--------------------------------------------------------------------------
30238.12s: libomp :: worksharing/for/omp_for_collapse.c
25670.15s: libomp :: env/kmp_set_dispatch_buf.c
14940.87s: libomp :: worksharing/single/omp_single.c
14356.12s: libomp :: tasking/omp_taskloop_num_tasks.c
13150.09s: libomp :: worksharing/for/omp_for_schedule_runtime.c
8455.10s: libomp :: worksharing/for/kmp_set_dispatch_buf.c
8144.37s: libomp :: tasking/omp_taskloop_grainsize.c
6818.04s: libomp :: threadprivate/omp_threadprivate.c
5520.38s: libomp :: worksharing/for/omp_doacross.c
3407.81s: libomp :: tasking/omp_task_priority3.c
2711.39s: libomp :: worksharing/for/omp_collapse_many_int.c
2667.53s: libomp :: tasking/task_teams_stress_test.cpp
2137.94s: libomp :: parallel/omp_parallel_num_threads.c
2049.00s: libomp :: worksharing/for/omp_for_reduction.c
1983.83s: libomp :: worksharing/for/omp_for_ordered.c
1845.35s: libomp :: tasking/issue-87307.c
1842.58s: libomp :: atomic/omp_atomic.c
1697.35s: libomp :: worksharing/for/omp_parallel_for_ordered.c
1672.86s: libomp :: worksharing/sections/omp_sections_reduction.c
1604.87s: libomp :: parallel/omp_parallel_reduction.c
Tests Times:
--------------------------------------------------------------------------
[     Range     ] :: [               Percentage               ] :: [ Count ]
--------------------------------------------------------------------------
[30000s,32000s) :: [                                        ] :: [  1/389]
[28000s,30000s) :: [                                        ] :: [  0/389]
[26000s,28000s) :: [                                        ] :: [  1/389]
[24000s,26000s) :: [                                        ] :: [  0/389]
[22000s,24000s) :: [                                        ] :: [  0/389]
[20000s,22000s) :: [                                        ] :: [  0/389]
[18000s,20000s) :: [                                        ] :: [  0/389]
[16000s,18000s) :: [                                        ] :: [  0/389]
[14000s,16000s) :: [                                        ] :: [  2/389]
[12000s,14000s) :: [                                        ] :: [  1/389]
[10000s,12000s) :: [                                        ] :: [  0/389]
[ 8000s,10000s) :: [                                        ] :: [  2/389]
[ 6000s, 8000s) :: [                                        ] :: [  1/389]
[ 4000s, 6000s) :: [                                        ] :: [  1/389]
[ 2000s, 4000s) :: [                                        ] :: [  6/389]
[    0s, 2000s) :: [**************************************  ] :: [374/389]
--------------------------------------------------------------------------
Testing Time: 30668.21s
Total Discovered Tests: 399
  Excluded         :  10 (2.51%)
  Unsupported      :  11 (2.76%)
  Passed           : 377 (94.49%)
  Expectedly Failed:   1 (0.25%)

So far we've only observed this issue on 32-core configurations. Here is the hardware information for one of them:

CPU info:
Architecture:        s390x
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Big Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s) per book:  1
Book(s) per drawer:  1
Drawer(s):           32
NUMA node(s):        1
Vendor ID:           IBM/S390
Machine type:        8561
CPU dynamic MHz:     5200
CPU static MHz:      5200
BogoMIPS:            3241.00
Hypervisor:          z/VM 7.2.0
Hypervisor vendor:   IBM
Virtualization type: full
Dispatching mode:    horizontal
L1d cache:           128K
L1i cache:           128K
L2d cache:           4096K
L2i cache:           4096K
L3 cache:            262144K
L4 cache:            983040K
NUMA node0 CPU(s):   0-31
Flags:               esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx vxd vxe gs vxe2 vxp sort dflt sie


Memory:
              total        used        free      shared  buff/cache   available
Mem:      104721188     1315028    90630200     4161828    12775960    98299752
Swap:       4194300       83168     4111132
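
To rule out lit or machine overhead, one option is to time the worksharing pattern directly on an affected host. The following is a minimal sketch loosely modelled on the slowest test above (worksharing/for/omp_for_collapse.c); it is not the actual test source, and the file name, loop bounds, and schedule are illustrative assumptions.

/*
 * Hypothetical reproducer sketch (repro_collapse.c), not part of the libomp
 * test suite: times a collapsed parallel loop with a dynamic schedule and
 * checks that every iteration ran exactly once.
 * Build: clang -fopenmp repro_collapse.c -o repro_collapse
 */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define OUTER 100
#define INNER 100

int main(void) {
  int *hit = calloc(OUTER * INNER, sizeof(int));
  double start = omp_get_wtime();

  /* Collapsed worksharing loop; each (i, j) pair is executed by exactly one
   * thread, so the increments below do not race. */
  #pragma omp parallel for collapse(2) schedule(dynamic, 1)
  for (int i = 0; i < OUTER; i++)
    for (int j = 0; j < INNER; j++)
      hit[i * INNER + j]++;

  double elapsed = omp_get_wtime() - start;

  int errors = 0;
  for (int k = 0; k < OUTER * INNER; k++)
    if (hit[k] != 1)
      errors++;

  printf("elapsed: %.3fs, errors: %d\n", elapsed, errors);
  free(hit);
  return errors != 0;
}
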
nikic added the openmp:libomp (OpenMP host runtime) label Nov 14, 2024
nikic commented Nov 14, 2024

@uweigand @JonPsson1 Does this sound familiar to you at all?

Generally, I've also observed that s390x has a lot more spurious libomp failures than other architectures (both in our own builds and on the public buildbots). This makes me worry that there is some significant synchronization issue in the implementation that manifests as regular spurious test failures.
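
One way to make such intermittent issues more visible is to repeat a small parallel construct many times within a single process, so that a rare synchronization bug shows up as a wrong result rather than an occasional test failure. A minimal sketch along those lines (not part of the libomp test suite; the repetition count and problem size are arbitrary):

/*
 * Hypothetical stress sketch (stress_reduction.c): repeats a parallel
 * reduction many times and counts how often the result is wrong.
 * Build: clang -fopenmp stress_reduction.c -o stress_reduction
 */
#include <stdio.h>
#include <omp.h>

#define REPS 10000
#define N 1000

int main(void) {
  const long expected = (long)N * (N - 1) / 2;  /* sum of 0..N-1 */
  int failures = 0;

  for (int rep = 0; rep < REPS; rep++) {
    long sum = 0;
    /* Fork a team, do a reduction, join; any lost update or barrier
     * problem shows up as a wrong sum. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
      sum += i;
    if (sum != expected)
      failures++;
  }

  printf("wrong results: %d / %d repetitions\n", failures, REPS);
  return failures != 0;
}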

uweigand (Member) commented

Thanks for pointing this out. I just realized that I forgot to set up a mail notifier for the openmp-s390x-linux build bot, so I had actually not seen those failures. I've corrected this oversight now and will look into analysing those failures soon.
