OpenMPI version 3.0 and node oversubscription #182

Closed
jrper opened this issue Mar 14, 2018 · 3 comments

Comments

@jrper
Contributor

jrper commented Mar 14, 2018

I see that OpenMPI version 3.0 has changed the default behaviour of mpirun regarding oversubscription. With this version, a plain call like mpirun -np 8 python -c "print('Hello')" will fail with an MPI error on systems with fewer than 8 slots available. This affects several of the Fluidity tests, which can request anything up to 16 processes.

The "right way" to do things is now to specify mpirun -np 8 -oversubscribe python -c "print('Hello')", on specific calls, or to define an environment variable, OMPI_MCA_rmaps_base_oversubscribe=1 mpirun -np 8 python -c "print('Hello')" to get back to the openmpi 2.0 behaviour. I'm not sure what the right thing to do for us is, possibly to use the last option inside of test harness itself?

@Patol75
Contributor

Patol75 commented Mar 15, 2018

Another GitHub project ran into the same kind of error (dealii/dealii#5123) and fixed it using the environment-variable method (dealii/dealii#5142).
Another idea could be to make the test cases adaptive: they would ask the system for the number of available processors, with nproc --all for example, and adjust the number of requested MPI processes accordingly. But this would mean modifying quite a few test-case XML files, with nprocs probably no longer treated as an attribute (whenever nprocs > 1) but instead folded into the command-line element, such as:

#!/bin/bash
# "$@" stands for the test command and its arguments.
nprocs=16
nprocsLocal=$(nproc --all)
if (( nprocsLocal >= nprocs )); then
  # All fine: enough slots available
  mpirun -np "$nprocs" "$@"
else
  # Either oversubscribe or use the maximum number of procs available:
  mpirun -np "$nprocs" -oversubscribe "$@"
  # alternatively: mpirun -np "$nprocsLocal" "$@"
fi

Or, even better, it could stay an attribute if its value is known at runtime (I have no idea whether that is true). The XML modification could be achieved with a simple enough lxml Python script.
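A minimal sketch of such a script, assuming the process count lives in an nprocs attribute (the file name and element layout here are guesses, not the actual Fluidity test schema):

import os
from lxml import etree

tree = etree.parse("test.xml")  # hypothetical test-case file

available = os.cpu_count()  # plays the role of `nproc --all`
for elem in tree.getroot().iter():
    if "nprocs" in elem.attrib:
        requested = int(elem.attrib["nprocs"])
        # Cap the request at the number of locally available processors.
        elem.attrib["nprocs"] = str(min(requested, available))

tree.write("test.xml")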

@stephankramer
Contributor

It seems to me that using the environment variable, setting it in testharness, would indeed be the simplest solution. As @Patol75's link suggests, we can just always set it, and it shouldn't hurt in cases where we don't use OpenMPI 3.0.

As for the other suggestions: I'm afraid it's a little more complicated:

  • the tests don't run sequentially. Testharness runs multiple tests in parallel, with the maximum number of concurrent tests set by the -threads option (or use make THREADS=8 test). This is essential: many tests are serial, so we need it to get a decent turnaround time on multi-core systems.
  • if the tests themselves are already parallel, testharness does not take that into account - i.e. it does not add up the specified nprocs of the tests so that the total number of cores requested by all concurrently running tests stays within the number of threads given on the testharness command line. Instead it simply oversubscribes. So if you run with -threads=4 and it happens to pick 4 tests that each have nprocs=8, it will be using 32 cores. This could be made cleverer, but would require implementing some kind of scheduling mechanism (there are various ways to go about that; see the sketch after this list). Since the simple strategy of just oversubscribing has worked reasonably well, we haven't bothered doing anything more sophisticated.
  • making the tests such that they can run on an arbitrary number of processors is another can of worms. In principle it can be done, but it would indeed be rather invasive, and you will find a lot of corner cases: very small problems (many tests have trivially small meshes) run on too many cores, tests that specifically exercise flredecomping between different numbers of cores, etc.
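A rough sketch of what such a scheduling mechanism could look like, using a weighted budget so the summed nprocs of concurrently running tests never exceeds the -threads budget (purely illustrative; testharness does not work this way today):

import threading

class CoreBudget:
    """Blocks test launches until enough of the core budget is free."""

    def __init__(self, total):
        self.total = total
        self.free = total
        self.cond = threading.Condition()

    def acquire(self, n):
        # Cap tests wider than the whole budget so they can still run.
        n = min(n, self.total)
        with self.cond:
            while self.free < n:
                self.cond.wait()
            self.free -= n
        return n  # caller passes this back to release()

    def release(self, n):
        with self.cond:
            self.free += n
            self.cond.notify_all()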

jrper added a commit that referenced this issue May 14, 2018
Apply the change proposed in #182 to allow oversubscription of MPI slots in openmpi 3.0 and above.
jrper added a commit that referenced this issue Jun 6, 2018
Apply the change proposed in #182 to allow oversubscription of MPI slots in openmpi 3.0 and above.
tmbgreaves pushed a commit that referenced this issue Jun 8, 2018
Apply the change proposed in #182 to allow oversubscription of MPI slots in openmpi 3.0 and above.
jrper added a commit that referenced this issue Jun 13, 2018
Apply the change proposed in #182 to allow oversubscription of MPI slots in openmpi 3.0 and above.
@jrper jrper closed this as completed Jun 13, 2018
@jrper
Contributor Author

jrper commented Jun 13, 2018

Closed by #199, which adds the environment variable to the testharness.

tmbgreaves added a commit that referenced this issue Apr 7, 2021
Core oversubscription should be switched on for tests through
testharness (see issue #182) but doesn't appear to be working,
so explicitly turning it on in the base images.