Upgrade mpi4py to >4 #184
Sounds reasonable. NDSL now logs a warning:
<frozen importlib._bootstrap>:241: RuntimeWarning: suspicious MPI execution environment
Your environment has OMPI_COMM_WORLD_SIZE=6 set, but mpi4py was built with MPICH.
You may be using `mpiexec` or `mpirun` from a different MPI implementation.
NDSL is not running the MPI tests anymore (all of them are skipped). Might be related to issue #178.
pace tests complain about the wrong number of ranks. Something seems off ... It might be in the CI configuration, but it is suspicious nevertheless.
Also, pyFV3 is not running "Orchestrated dace-cpu Accoustics" anymore (and is thus really, really fast).
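For anyone debugging the warning above, here is a minimal sketch of the kind of check that would flag the mismatch. It assumes that Open MPI launchers export `OMPI_COMM_WORLD_*` variables and that MPICH's Hydra launcher exports `PMI_*` variables; other launchers (and some schedulers) also set `PMI_*`, so treat the result as a hint only. The function names are hypothetical and are not part of mpi4py or NDSL.

```python
# Environment variables commonly exported by each launcher.
# Assumption: this mapping is illustrative and not exhaustive.
LAUNCHER_MARKERS = {
    "Open MPI": ("OMPI_COMM_WORLD_SIZE", "OMPI_COMM_WORLD_RANK"),
    "MPICH": ("PMI_SIZE", "PMI_RANK"),
}


def detect_launcher(env):
    """Guess which MPI launcher started this process from its environment."""
    for name, markers in LAUNCHER_MARKERS.items():
        if any(marker in env for marker in markers):
            return name
    return None


def library_matches_launcher(library_version, env):
    """Return False only when a detected launcher disagrees with the MPI library."""
    launcher = detect_launcher(env)
    if launcher is None:
        # Not started through a recognized launcher: nothing to compare.
        return True
    # Library version strings usually start with the implementation name.
    return launcher.lower() in library_version.lower()
```

With mpi4py installed, the library string can be obtained from `MPI.Get_library_version()` and passed in together with `os.environ`. In the situation from the log (an MPICH build with `OMPI_COMM_WORLD_SIZE=6` in the environment), the check returns `False`.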
It sounds like we could `pip install mpich` somewhere in the pipeline and start from a default Ubuntu runner. As mentioned on PyPI, these wheels are designed for ease of use rather than maximal speed, but we are talking about GitHub Actions here, not performance runs on the cluster. IMO it is worth a shot as a potential simplification step.
@FlorianDeconinck and @romanc, I too think we should try to move away from the container for now and just use an Ubuntu runner. If we agree this is a solution, I will make a quick PR in pace to change the workflow environment.
Go for it
A quick test with the MPI tests in NDSL (see here) shows that this seems to be a valuable approach. I chose to install the MPICH flavor (as was in the image before). Interestingly, there's apparently no
This is a follow-up from NOAA-GFDL/NDSL#184 similar to NOAA-GFDL/pace#131.
* ci: remove container, install mpi from python wheel
* Delete weird/unrecognized mpiexec argument
The current work to run at scale on Grace Hopper unified silicon ARM64 hardware has showcased the need to move `mpi4py` above 4.x.
The original restriction to `3.1.5` was made out of an abundance of caution at a moment when halo exchanges were being questioned.
Alas, the new allocator that ships with CUDA 12.5 for GH boxes especially trips the old `mpi4py`. We need to move up.
The risk is limited, as `4.1` showcases a very large number of fixes to major version 4.
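As an illustration of the new floor, a minimal sketch of a runtime guard follows. The `(4, 0)` minimum is an assumption drawn from the PR title; the actual pin used in the packaging metadata is not shown here. The helper names are hypothetical.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '4.1.0' or '4.0.0rc1'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def meets_minimum(version, minimum=(4, 0)):
    """Return True when the installed version is at or above the assumed floor."""
    return major_minor(version) >= minimum
```

In an environment where mpi4py is installed, the version string can be read with `importlib.metadata.version("mpi4py")` and passed to `meets_minimum`.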