Conversation

@JKLiang9714
No description provided.

@it-is-a-robot
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

The following commits have not yet signed CLA.

6d735ba | A complete overhaul of the HAN code.

Among many other things:

  • Fix an imbalance bug in MPI_allgather
  • Accept more human readable configuration files. We can now specify
    the collective by name instead of a magic number, and the component
    we want to use also by name.
  • Add the capability to have optional arguments in the collective
    communication configuration file. Right now the capability exists
    for segment lengths, but is yet to be connected with the algorithms.
  • Redo the initialization of all HAN collectives.

Cleanup the fallback collective support.

  • In case the module is unable to deliver the expected result, it will fall back
    to executing the collective operation on another collective component. This
    change makes the support for this fallback simpler to use.
  • Implement a fallback allowing a HAN module to remove itself as
    potential active collective module, and instead fallback to the
    next module in line.
  • Completely disable the HAN modules on error. From the moment an error is
    encountered they remove themselves from the communicator, and if some
    other module calls them, they simply behave as a pass-through.

Communicator: provide ompi_comm_split_with_info to split and provide info at the same time
Add ompi_comm_coll_preference info key to control collective component selection

COLL HAN: use info keys instead of component-level variable to communicate topology level between abstraction layers

  • The info value is a comma-separated list of entries, which are chosen with
    decreasing priorities. This overrides the priority of the component,
    unless the component has disqualified itself.
    An entry prefixed with ^ starts the ignore-list. Any entry following this
    character will be ignored during the collective component selection for the
    communicator.
    Example: "sm,libnbc,^han,adapt" gives sm the highest preference, followed
    by libnbc. The components han and adapt are ignored in the selection process.
  • Allocate a temporary buffer for all lower-level leaders (length 2 segments)
  • Fix the handling of MPI_IN_PLACE for gather and scatter.
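
The preference string format described above can be illustrated with a small parser. This is only a sketch under stated assumptions: `coll_pref_t`, `parse_coll_preference`, and the fixed-size limits are hypothetical names chosen for the example and are not taken from the Open MPI sources.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_ENTRIES 16
#define MAX_NAME    32

/* Hypothetical result type: preferred[] is ordered from highest to lowest
 * preference; ignored[] holds every entry at or after the first '^'. */
typedef struct {
    char preferred[MAX_ENTRIES][MAX_NAME];
    int  num_preferred;
    char ignored[MAX_ENTRIES][MAX_NAME];
    int  num_ignored;
} coll_pref_t;

/* Split a comma-separated preference string such as "sm,libnbc,^han,adapt". */
static void parse_coll_preference(const char *value, coll_pref_t *out)
{
    bool ignoring = false;
    const char *p = value;

    out->num_preferred = 0;
    out->num_ignored = 0;

    while (*p != '\0') {
        size_t entry_len = strcspn(p, ",");   /* length of the current entry */
        const char *name = p;
        size_t len = entry_len;

        if (len > 0 && name[0] == '^') {      /* '^' starts the ignore-list */
            ignoring = true;
            name++;
            len--;
        }
        if (len > 0 && len < MAX_NAME) {
            if (ignoring && out->num_ignored < MAX_ENTRIES) {
                memcpy(out->ignored[out->num_ignored], name, len);
                out->ignored[out->num_ignored][len] = '\0';
                out->num_ignored++;
            } else if (!ignoring && out->num_preferred < MAX_ENTRIES) {
                memcpy(out->preferred[out->num_preferred], name, len);
                out->preferred[out->num_preferred][len] = '\0';
                out->num_preferred++;
            }
        }
        p += entry_len;                       /* advance past the entry ... */
        if (*p == ',') {
            p++;                              /* ... and its separator */
        }
    }
}
```

For the example value from the commit message, the sketch places sm and libnbc in the preferred list (in that order) and han and adapt in the ignored list.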

COLL HAN: Fix topology handling

  • HAN should not rely on node names to determine the ordering of ranks.
    Instead, use the node leaders as identifiers and short-cut if the
    node-leaders agree that ranks are consecutive. Also, error out if
    the rank distribution is imbalanced for now.

Signed-off-by: Xi Luo [email protected]
Signed-off-by: Joseph Schuchart [email protected]
Signed-off-by: George Bosilca [email protected]

Conflicts:
ompi/mca/coll/adapt/coll_adapt_ibcast.c
1264b67 | Fix partial packing of non data elements.

There was a bug allowing for partial packing of non-data elements (such as loop
and end_loop markers) during the exit condition of a pack/unpack call. This has
basically no meaning. Prevent this bug from happening by making sure the element
points to data before trying to partially pack it.

Signed-off-by: George Bosilca [email protected]
3518ebf | pml/ucx: fix zero sized datatype transfers

Signed-off-by: Aboorva Devarajan [email protected]
(cherry picked from commit 202b81d)
9d4e3b1 | Fix HAN issues reported by Coverity.

Signed-off-by: George Bosilca [email protected]
087a672 | Merge pull request open-mpi#7945 from bosilca/4.1/han

Import the HAN collective into 4.1
96ba8b7 | Merge pull request open-mpi#8142 from AboorvaDevarajan/fix_zero_byte_v4.1.x

[v4.1.x] pml/ucx: fix zero sized datatype transfers
fa3211a | opal_functions.m4: remove redundant code

This code was invoked twice. Leave it solely in OPAL_CONFIGURE_SETUP,
which is invoked before OPAL_BASIC_SETUP.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 7c36b45)
35e7d86 | configury: make build Reproducible

If defined, use SOURCE_DATE_EPOCH environment variable; make the build
Reproducible by forcing timestamps. See
https://reproducible-builds.org/docs/source-date-epoch/ for more
information.
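
The SOURCE_DATE_EPOCH convention can be sketched in C as follows. The actual change lives in the configure scripts, so this is only an illustration of the idea; `parse_source_date_epoch` and `build_timestamp` are hypothetical helper names.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Parse a SOURCE_DATE_EPOCH value (seconds since the Unix epoch).
 * Returns 'fallback' if the value is missing or not a valid number. */
static time_t parse_source_date_epoch(const char *value, time_t fallback)
{
    if (value == NULL || *value == '\0') {
        return fallback;
    }
    char *end = NULL;
    errno = 0;
    long long secs = strtoll(value, &end, 10);
    if (errno != 0 || *end != '\0' || secs < 0) {
        return fallback;
    }
    return (time_t) secs;   /* reproducible: fixed by the environment */
}

/* Timestamp to embed in build artifacts: the forced value if the variable
 * is defined, otherwise the current time (which is not reproducible). */
static time_t build_timestamp(void)
{
    return parse_source_date_epoch(getenv("SOURCE_DATE_EPOCH"), time(NULL));
}
```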

Thanks Bernhard M. Wiedemann for bringing this to our attention.

Fixes open-mpi#3759

NOTE: This was cherry-picked from master, and slightly modified /
amended for the v4.1.x branch.

Signed-off-by: Gilles Gouaillardet [email protected]
Signed-off-by: Bernhard M. Wiedemann [email protected]
Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 7b4e8ba)
234356a | configure.ac: Add workaround on MacOS for "readlink -f"

MacOS does not have "readlink -f" or "realpath", so use the
MacOS-provided Python, which we know has os.path.realpath().

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit ddf216b)
e7f829b | getdate.sh: make the date(1) usage more portable

There are several different flavors of date(1) out there. Try a few
different CLI options for date(1) to see which one works.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 89920ba)
09b8736 | OPAL: fix string buffer allocation for large env variables

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 320a9a1)
be86f87 | Update Internal PMIx to OpenPMIx v3.2.1rc1

Signed-off-by: Joshua Hursey [email protected]
60ee133 | Disable man pages for internal OpenPMIx

Signed-off-by: Joshua Hursey [email protected]
55af867 | coll/adapt and coll/han: fix trivial compiler warnings

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit ee405cc)
bed022d | Merge pull request open-mpi#8155 from devreal/fix_opal_add_to_env_str_alloc_v4.1.x

OPAL: fix string buffer allocation for large env variables [v4.1.x]
391ae58 | Merge pull request open-mpi#8160 from jsquyres/pr/v4.1.x/trivial-coll-and-han-warning-fixes

v4.1.x: coll/adapt and coll/han: fix trivial compiler warnings
19c863b | keyval_parse.c: ensure to init values

Coverity complained about uninitialized variables; ensure that they
are initialized to 0 in all cases.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit eac0ab5)
c10a85e | keyval_parse.c: update whitespace/comments

Slightly improve comments and update some whitespace.

No code or logic changes.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 8ed1d28)
71ee907 | Merge pull request open-mpi#8164 from jsquyres/pr/v4.1.x/keyval-parse-tweaks

v4.1.x: Keyval parse tweaks
8c1d830 | Merge pull request open-mpi#8148 from jsquyres/pr/v4.1.x/reproducible-build

v4.1.x: reproducible builds + portability fix
40e104d | Merge pull request open-mpi#8123 from jjhursey/v4.1-pmix-v3.2

v4.1.x: Update Internal PMIx to OpenPMIx v3.2.1rc1
8133adf | NEWS: More updates for v4.1.0

Signed-off-by: Jeff Squyres [email protected]
6bb3ef4 | Merge pull request open-mpi#8172 from jsquyres/pr/v4.1.0/NEWS-updates

NEWS: More updates for v4.1.0
c12540c | op/avx: check for _mm512_mullo_epi64() AVX512 intrinsic

The PGI (20.4) compiler does not define this intrinsic, so only build
AVX512 support if the _mm512_mullo_epi64() intrinsic is defined.

Signed-off-by: Gilles Gouaillardet [email protected]
(cherry picked from commit 26e42f9)
c85d591 | config/Makefile.am: ensure getdate.sh is in dist tarball

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 91a5af8)
ab86c27 | opal_functions.m4: add comment

No code or logic changes.

Add comment about why it's ok to use $srcdir here
(vs. $OMPI_TOP_SRCDIR).

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit a6a0d51)
fd39daa | Merge pull request open-mpi#8188 from jsquyres/pr/v4.1.x/getdate-fixes

v4.1.x: Some getdate.sh fixes
59a47c2 | PML/UCX: improved error processing in MPI_Recv

  • improved error processing in MPI_Recv implementation
    of pml UCX
  • added error handling for pml_ucx_mrecv call

Signed-off-by: Sergey Oblomov [email protected]
(cherry picked from commit eb9405d)

Conflicts:
ompi/mca/pml/ucx/pml_ucx.c
bed064f | Merge pull request open-mpi#8181 from ggouaillardet/topic/v4.1.x/avx512_pgi

op/avx: check for _mm512_mullo_epi64() AVX512 intrinsic
16d8894 | orterun.1in: fix minor mistake in :PE=2 example

Fix mistake in orterun(1) (i.e., mpirun(1)) with an example using the
:PE=x modifier. Additionally, add some extra text with some further
explanation.

This is not a cherry-pick from master because PRRTE has replaced ORTE
on master, and orterun.1in no longer exists in master.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 7384972)
405dc6e | orterun.1in: define "slot" and "processor element"

Add descriptive definitions of "slot" and "processor element" at the
top of the man page (and effectively delete / move some text from
lower in the man page up into those definitions).

Also add a little blurb in the --use-hwthread-cpus description about how
it changes the definition of "processor element".

This is not a cherry-pick from master because PRRTE has replaced ORTE
on master, and orterun.1in no longer exists in master.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 07b8937)
df73e4a | orterun.1in: add some markup

Add some nroff markup into the paragraph, just to clearly delineate
the option names from the paragraph text. No other content changes.

This is not a cherry-pick from master because PRRTE has replaced ORTE
on master, and orterun.1in no longer exists in master.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 25f84be)
74a743f | Merge pull request open-mpi#8192 from jsquyres/pr/v4.1.x/fix-minor-mistake-in-mpirun.1in

v4.1.x: orterun.1in: fix minor mistake in :PE=2 example and add more descriptions/explanations
3f863aa | v4.1.x: Using package_rank to select between NIC of equal distance from the process.

If PMIX_PACKAGE_RANK is available, use this value to select between multiple
NICs of equal distance from the current process. If this value is not
available, try to calculate it by getting the locality string from each local
process and assigning a package_rank. If everything fails, fall back to using
process_id.rank to select the NIC. This last case is not ideal, but has a small
chance of occurring, and causes a message to be displayed to notify that it is
occurring.

Some of the information in the master branch is not available for the multi-NIC
patch, such as myprocinfo.rank. This info is used to select between multiple
NICs of equal distance to the process. This adapts the previous commit to work
with the v4.1.x branch.

Signed-off-by: Nikola Dancejic [email protected]
(cherry picked from commit 8017f12)
ec35893 | Correct computation of relative locality

Ensure we always pass the cpuset as well as the locality string for each
proc. Correct the mtl/ofi component's computation of relative locality
as the function being called expects to be given the locality string of
each proc, not the cpuset. If the locality string of the current proc
isn't available, then use the cpuset if available and compute the
locality before trying to compute relative localities of our peers.

Signed-off-by: Ralph Castain [email protected]
a9ede52 | coll/tuned: add hint about dynamic rules to mca parameters

The mca parameters coll_tuned_*_algorithm are ignored unless coll_tuned_use_dynamic_rules is true so mention that in the description.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 06f605c)
4a3f2af | coll/tuned: Mark global static algorithm as const

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 7261255)
5e19de8 | coll/tuned: don't select algorithms when it's clear they would fall back to linear

Bcast: scatter_allgather and scatter_allgather_ring expect N_elem >= N_procs
Allreduce: rabenseifner expects N_elem >= pow2 nearest to N_procs

In all cases, the implementations will fall back to a linear implementation,
which will most likely yield the worst performance (noted for 4B bcast on 128 ranks)
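
The two conditions above can be written as simple predicates. This is a sketch, not the coll/tuned decision code: the function names are hypothetical, and "pow2 nearest to N_procs" is assumed here to mean the largest power of two that does not exceed N_procs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest power of two that does not exceed n (assumes n >= 1). */
static size_t pow2_floor(size_t n)
{
    size_t p = 1;
    while (p <= n / 2) {
        p *= 2;
    }
    return p;
}

/* Bcast scatter_allgather (and its ring variant) needs at least one
 * element per process; otherwise it would fall back to linear. */
static bool bcast_scatter_allgather_usable(size_t n_elem, size_t n_procs)
{
    return n_elem >= n_procs;
}

/* Allreduce rabenseifner needs at least as many elements as the
 * (assumed) power of two derived from the process count. */
static bool allreduce_rabenseifner_usable(size_t n_elem, size_t n_procs)
{
    return n_elem >= pow2_floor(n_procs);
}
```

With these predicates, a single-element broadcast on 128 ranks is rejected for scatter_allgather, so a decision function could pick a different algorithm instead of silently ending up in the linear fallback.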

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 04d198f)
ff89195 | Merge pull request open-mpi#8176 from dancejic/multi-v4.1.x

v4.1.x: Using package_rank to select between NIC of equal distance from the process
aec55f1 | coll/tuned: fix minor errors in comments

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 22e289b)
3cae9f7 | COLL TUNED: remove stray selection of linear algs for allreduce and allgather

These selections seem harmful in my measurements and don't seem to be
motivated by previous measurement data.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit a15e5dc)
59a6a4d | v4.1.x: Update Internal PMIx to OpenPMIx v3.2.1

Signed-off-by: Joshua Hursey [email protected]
6f21a39 | Merge pull request open-mpi#8198 from devreal/fix-tuned-dynamic-v4.1.x

Fix some issues with dynamic algorithm selection in coll/tuned
203a930 | mtl/ofi: Check cq_data_size without querying providers again

This commit removes the unnecessary call to fi_getinfo() when
initializing the MTL. cq_data_size is a domain attribute that will be
available to the MTL from the initial query itself. FI_DIRECTED_RECV is
a primary capability that has to be requested for a provider to enable
it, so adding that to the initial requirement. The redundant query was
also overwriting the contents of the prov object, which already had the
include/exclude filtering and multi-NIC logic applied to it.

Signed-off-by: Raghu Raja [email protected]
(cherry picked from commit 6233dea)
fa83779 | Merge pull request open-mpi#8202 from jjhursey/v4.1-pmix-3.2.1

v4.1.x: Update Internal PMIx to OpenPMIx v3.2.1
093570a | Merge pull request open-mpi#8205 from rajachan/whack-remote-cq-data-query-41x

[v4.1.x] mtl/ofi: Check cq_data_size without querying providers again
9c36c28 | coll/hcoll: scatterv inplace fix

Signed-off-by: Valentin Petrov [email protected]
(cherry picked from commit 9fa0015)
4993a09 | Correctly skip the "mpirun" node when launching orted on it

Mark the node as "unusable" so it does not get included when computing
number of procs for the case where the user does not specify -np.

Signed-off-by: Ralph Castain [email protected]
dd3a00a | Merge pull request open-mpi#8211 from vspetrov/v4.1.x

V4.1.x coll/hcoll: scatterv inplace fix
129b5ee | coll/base: do not drop const qualifier

MPI_Ialltoallw() and friends take a const MPI_Datatype types[] argument.
In order to be able to call OBJ_RELEASE(types[0]), we used to simply
drop the const modifier. This change makes it right by introducing the
OBJ_RELEASE_NO_NULLIFY(object) macro, which no longer sets object = NULL
when the object is freed.
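
The difference between the two release flavors can be sketched with a toy reference-counted object. The macros below are simplified stand-ins with hypothetical names (`RELEASE`, `RELEASE_NO_NULLIFY`, `object_t`); they are not the OPAL definitions.

```c
#include <stdlib.h>

typedef struct {
    int refcount;
} object_t;

/* Drop one reference, free on zero, and reset the caller's variable.
 * The argument must therefore be a modifiable lvalue. */
#define RELEASE(obj)                      \
    do {                                  \
        if (--(obj)->refcount == 0) {     \
            free(obj);                    \
            (obj) = NULL;                 \
        }                                 \
    } while (0)

/* Same, but never assigns to the argument, so it also works on
 * const-qualified pointers such as elements of a const array. */
#define RELEASE_NO_NULLIFY(obj)           \
    do {                                  \
        if (--(obj)->refcount == 0) {     \
            free(obj);                    \
        }                                 \
    } while (0)

/* types[] mirrors the shape of a 'const MPI_Datatype types[]' argument:
 * the pointers themselves are const, the objects they point to are not. */
static void release_first(object_t *const types[])
{
    /* RELEASE(types[0]) would not compile: types[0] cannot be assigned. */
    RELEASE_NO_NULLIFY(types[0]);
}
```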

Signed-off-by: Gilles Gouaillardet [email protected]
(cherry picked from commit c49e5e5)
ac2f54f | Merge pull request open-mpi#8190 from hoopoepg/topic/pml-ucx-recv-improved-errhandling-v4.1

PML/UCX: improved error processing in MPI_Recv - v4.1
c614c54 | Merge pull request open-mpi#8216 from rhc54/cmr41/rmps

v4.1.x: Correctly skip the "mpirun" node when launching orted on it
b299b49 | COLL TUNED: Use per-rank data size instead of total size for decision

The total size depends on number of ranks so the usual ranges don't work.
Thus, use the average across all ranks to make a decision.
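
A minimal sketch of the per-rank average, assuming an allgatherv-style array of receive counts. The function name and parameters are hypothetical and only illustrate the arithmetic.

```c
#include <stddef.h>

/* Average number of bytes contributed per rank. Decision thresholds tuned
 * for per-rank sizes can then be applied regardless of communicator size. */
static size_t per_rank_size(const int recvcounts[], int comm_size,
                            size_t type_size)
{
    size_t total = 0;

    for (int i = 0; i < comm_size; i++) {
        total += (size_t) recvcounts[i] * type_size;
    }
    return comm_size > 0 ? total / (size_t) comm_size : 0;
}
```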

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit f670364)
3d422d1 | Merge pull request open-mpi#8224 from devreal/fix-tuned-allgatherv-v4.1.x

COLL TUNED: Use per-rank data size instead of total size for decision [4.1.x]
870c2d7 | oshmem/tools/oshmem_info: fix an issue with fortran keyword when compiling param.c

Signed-off-by: Pak Lui [email protected]
(cherry picked from commit 3cdead0)
ab530bf | Merge pull request open-mpi#8228 from jsquyres/pr/pak-lui-v4.1.x-fixup

v4.1.x: oshmem/tools/oshmem_info: fix an issue with fortran keyword when comp…
0a0a15a | Remove PMIx man page setup

There are no manpages in v3.2.
Port of openpmix/openpmix#1930

Signed-off-by: Ralph Castain [email protected]
(cherry picked from commit 7b11693)
35f8fbc | Merge pull request open-mpi#8235 from rhc54/cmr41/px

Remove PMIx man page setup
9f228c9 | coll/base: Fix collective module selection preference treatment

The selectable list is sorted with lowest to highest priority so the
user-defined preferences should be appended to the list.
The preference treatment should also maintain the order provided by the user
(first item has highest priority) so switch the loop order.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit dd54af9)
bcf70a2 | coll/[sm|han|adapt]: don't disqualify on priority 0

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 09c2f4a)
9a202ea | coll/han: remove references to experimental solo and shared collective components

Also make coll/tuned the default for shared memory communication
as coll/sm has shown performance issues that need investigation.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 971d58c)
2acf40c | coll/han: reduce default segment size for reduce/allreduce to 64k

This has been shown to be more effective in achieving overlap
of inter- and intra-node communication and reduces the initial
delay before hitting the network.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 1cdc855)
4c0c0e9 | Merge pull request open-mpi#8237 from devreal/fix-coll-base-preference-v4.1.x

Fix preference treatment in coll/base [v4.1.x]
de354ea | OSC RDMA: put memory of each process into separate pages

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit d11ccba)
2e1e9dc | OSC RDMA: only touch pages before memory registration, don't fill them

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 52b52b8)
33aa639 | configury: fix OPAL_GET_VERSION

  • fix path to getdate.sh
  • do not prepend "date" to the revision
  • support git worktree

Signed-off-by: Gilles Gouaillardet [email protected]
(cherry picked from commit 930d3c4)
c28e166 | configury: fix typos

This is a one-off commit for the release branches that fixes
some typos introduced when backporting
open-mpi/ompi@35e7d86

Signed-off-by: Gilles Gouaillardet [email protected]
534aeac | autogen.pl: patch libtool.m4 for OSX Big Sur

Thanks FX Coudert for reporting this issue and pointing
to a solution.

Refs. open-mpi#8218

Signed-off-by: Gilles Gouaillardet [email protected]
Signed-off-by: Jeff Squyres [email protected]

(back-ported from commit open-mpi/ompi@3f45ced)
2c91509 | Merge pull request open-mpi#8238 from devreal/osc-page-align-v4.1.x

OSC RDMA: put memory for each process into separate pages [4.1.x]
390045e | Merge pull request open-mpi#8240 from ggouaillardet/topic/v4.1.x/reproducibility_fixes

v4.1.x: configury reproducibility fixes
d09771c | Merge pull request open-mpi#8241 from ggouaillardet/topic/v4.1.x/libtool_bigsur

v4.1.x: autogen.pl: patch libtool.m4 for OSX Big Sur
f9e2bf7 | Fix many compiler warnings

Fixes open-mpi#8195. This PR doesn't fix all the warnings from open-mpi#8195, but
fixes many of them (e.g., I didn't get the "string might be truncated"
warnings on my Mac).

This is an adaptation of 14aa5fa from
master; it drops some things that aren't relevant here on the v4.1.x
branch and adds a few more warnings fixes that are relevant here on
v4.1.x that aren't relevant on master.

Signed-off-by: Jeff Squyres [email protected]
(cherry-picked from 14aa5fa)
772df60 | VERSION: 4.1.0rc4

Release the hounds!

Signed-off-by: Jeff Squyres [email protected]
38011d3 | Merge pull request open-mpi#8204 from jsquyres/pr/v4.1.x/fix-warnings

v4.1.x: fix many warnings
25161a0 | Merge pull request open-mpi#8247 from jsquyres/pr/v4.1.x/4.1.0rc4-ftw

VERSION: 4.1.0rc4
576db78 | coll/han: fix coll preference selection in mca_coll_han_comm_create_new

Exclude HAN, don't include it.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 33105b0)
0ae14c0 | Merge pull request open-mpi#8251 from devreal/fix-han-commselect-new-v4.1.x

v4.1.x: coll/han: fix coll preference selection in mca_coll_han_comm_create_new
0a819bf | Replace usage of the deprecated NB API of UCX with NBX

Signed-off-by: Leonid Genkin [email protected]
(cherry picked from commit 7f9a305)
529accb | coll/base: fix compiler warnings

Add some "const"s that needed to be applied here on the v4.1.x branch,
effectively by cherry-picking part of b65ec27 from master.

Signed-off-by: Jeff Squyres [email protected]
f566613 | opal: Remove outdated MacOS workaround

Remove the pack/unpack pragma around net/if.h on MacOS, which
was added to fix a bug in MacOS X 10.4.x on 64-bit platforms.
The bug was fixed in Mac OS X 10.5.0 and, sometime in the last
11 years, compilers started emitting warnings about the fact
that the Apple header stomped over the pragma pack settings
from the workaround. We already don't support versions of MacOS
earlier than 10.5, so there's no point in keeping the workaround.

Signed-off-by: Brian Barrett [email protected]
(cherry picked from commit a25df3f)
8324b4e | opal: Disable memory patcher component on MacOS

Open MPI doesn't support any transports on MacOS which require
memory manager hooks. The memory patcher component uses the
syscall interface, which has been deprecated in recent versions
of MacOS. Since we don't need it and it emits warnings about
deprecation, disable the memory patcher component on MacOS.

Fixes open-mpi#5671

Signed-off-by: Brian Barrett [email protected]
(cherry picked from commit 19e16d5)
6e2c8cf | Merge pull request open-mpi#8255 from jsquyres/pr/v4.1.x/fix-missed-warnings

v4.1.x: Fix missed compiler warnings
09e9fe0 | Fix the verbose output in ess base

Only get the locality string and output binding message when requested

Signed-off-by: Ralph Castain [email protected]
ab3fc05 | Merge pull request open-mpi#8256 from rhc54/cmr41/fix

v4.1.x: Fix the verbose output in ess base
97b2873 | Fixed uninitialized memory access bug in base64 encoding.

Signed-off-by: Charles Shereda [email protected]
9970e00 | Merge pull request open-mpi#8254 from gleon99/v4.1.x

Replace usage of the deprecated NB API of UCX with NBX
d472f5a | Merge pull request open-mpi#8265 from cpshereda/v4.1.x

v4.1.x: Fixed uninitialized memory access bug in base64 encoding
da9ebda | Update PMIx to v3.2.2

Signed-off-by: Ralph Castain [email protected]
6760d53 | [v4.1.x] ompi : add memory barrier in PMIx registration callback

PMIx registration callback functions are used when registering a PMIx
event handler.

This patch adjusts two such callback functions:

model_registration_callback()
     in ompi/interlib/interlib.c and

ompi_errhandler_registration_callback()
     in ompi/errhandler/errhandler.c

Both of them employ the following code structure:

static void xxx_callback(int status,
                         size_t errhandler_ref,
                         void *cbdata)
{
    myreg_t *trk = (myreg_t *)cbdata;

    trk->status = status;
    interlibhandler_id = errhandler_ref;
    trk->active = false;
}

The workflow is:

  1. The caller calls opal_pmix.register_evhandler() with the
    callback function as an argument.
  2. The caller calls OMPI_LAZY_WAIT_FOR_COMPLETION(trk.active)
    to wait for trk->active to become false.
  3. PMIx does the registration on another thread, then calls the
    registration callback function, which sets trk->active
    to false.
  4. The caller checks trk->status to determine whether registration
    succeeded.

The expected behavior of the registration callback functions therefore
is that trk->status be updated first, then trk->active be set to false.

However, on ARM based systems, the expected behavior is not guaranteed
because ARM uses a relaxed memory model.

To address this issue, this patch adds a call to opal_atomic_wmb()
(write memory barrier) after trk->status is set, to achieve the
expected behavior.
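
The ordering requirement can be sketched with C11 atomics. This is an illustration under assumptions, not the patch itself: `atomic_thread_fence(memory_order_release)` plays the role of the write barrier, the acquire fence on the waiting side is an addition made for the sake of a complete C11 example, and `reg_tracker_t` and both function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    int         status;   /* result of the registration */
    atomic_bool active;   /* true while the registration is in flight */
} reg_tracker_t;

/* Runs on the thread that performs the registration. */
static void registration_callback(int status, void *cbdata)
{
    reg_tracker_t *trk = (reg_tracker_t *) cbdata;

    trk->status = status;
    /* Write barrier: status must become visible before active flips. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&trk->active, false, memory_order_relaxed);
}

/* Runs on the caller's thread: wait, then read the status. */
static int wait_for_registration(reg_tracker_t *trk)
{
    while (atomic_load_explicit(&trk->active, memory_order_relaxed)) {
        /* spin; a real implementation would make progress or yield here */
    }
    /* Pairs with the release fence above. */
    atomic_thread_fence(memory_order_acquire);
    return trk->status;
}
```

Without the release fence, a weakly ordered CPU may make the store to active visible before the store to status, so the waiter could read a stale status.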

Signed-off-by: Wei Zhang [email protected]
91b81d9 | Merge pull request open-mpi#8275 from wzamazon/v4.1.x_pmix_callback_wmb

[v4.1.x] ompi : add memory barrier in PMIx registration callback
cf17052 | Merge pull request open-mpi#8273 from rhc54/cmr41/pmix322

v4.1.x: Update PMIx to v3.2.2
fbc711e | VERSION: 4.1.0rc5

Updating VERSION and NEWS for the 4.1.0rc5 release.

Signed-off-by: Raghu Raja [email protected]
c65e9cb | Merge pull request open-mpi#8277 from rajachan/4.1.0rc5-version

VERSION: 4.1.0rc5
7b138ec | Update Slurm launch support

Assign all cpu's on node to the daemon

Signed-off-by: Ralph Castain [email protected]
(cherry picked from commit 7bac7ee)
7fd4f32 | Merge pull request open-mpi#8289 from rhc54/cmr41/slurm

v4.1.x: Update Slurm launch support
ff130e7 | ompio: resync v4.1 branch to master

This commit syncs ompio related directories in v4.1.x to master. The effort to bring the Lustre performance fixes and support for the external32 data representation over was too overwhelming when dealing with every single PR individually.

A few minor modifications had to be made for syncing:

  • v4.1.x does not have opal/mutex.h
  • v4.1.x does not have opal_atomic_int32_t datatype
  • the io module structure has two fewer function pointers (related to info_set/get) compared to the version on master.

Tested so far with the ompio testsuite as well as hdf5-1.10.5 testsuite (testphdf5, t_shapesame, t_bigio) on an XFS file system.
More tests on Lustre and BeeGFS to follow.

Signed-off-by: Edgar Gabriel [email protected]
b6c0bac | NEWS: OMPIO is now the default everywhere

Huzzah!

Signed-off-by: Jeff Squyres [email protected]
7de3993 | Merge pull request open-mpi#8297 from edgargabriel/pr/v4.1.x-ompio-sync

ompio: resync v4.1 branch to master
adb29bb | v4.1.0: README, VERSION, and LICENSE final updates

Signed-off-by: Jeff Squyres [email protected]
9ac5471 | Merge pull request open-mpi#8300 from jsquyres/pr/v4.1.0-final-final-final

v4.1.0: README and VERSION final updates
c854453 | VERSION: Onward to v4.1.1

Signed-off-by: Jeff Squyres [email protected]
5955da9 | Merge pull request open-mpi#8302 from jsquyres/pr/v4.1.x/onward-and-update-to-4.1.1

VERSION: Onward to v4.1.1
bd378db | oshmem/mca/sshmem: Fix build with --enable-mem-debug

--enable-mem-debug #defines realloc/free as macros, though macros
are also matched if they appear in references to members. Rename the
members to avoid this matching.
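
The macro clash can be reproduced in a few lines. This sketch uses hypothetical names (`debug_free`, `segment_ops_t`, `seg_free`) and a much simpler wrapper than a real memory-debugging build would use.

```c
#include <stdlib.h>

static int debug_free_calls = 0;

/* Debug wrapper; defined before the macro so it still calls the real free(). */
static void debug_free(void *ptr)
{
    debug_free_calls++;
    free(ptr);
}

/* Function-like macro in the style of a memory-debugging build. */
#define free(ptr) debug_free(ptr)

typedef struct {
    /* If this member were named 'free', a call such as ops->free(p) would be
     * rewritten by the preprocessor to ops->debug_free(p) and fail to
     * compile. Renaming the member avoids the match. */
    void (*seg_free)(void *ptr);
} segment_ops_t;

static void release_segment(const segment_ops_t *ops, void *ptr)
{
    ops->seg_free(ptr);
}
```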

See open-mpi#6995

Signed-off-by: Bert Wesarg [email protected]
(cherry picked from commit 3111877)
0f20d67 | [v4.1.x] btl/ofi: fix memory leaks in error handling path

Currently, mca_btl_ofi_put (get, aop, afop, acswp) will allocate
a mca_btl_ofi_rdma_completion_t object and use it as the context
for fi_write/fi_read/fi_atomic/fi_fetch_atomic/fi_compare_atomic.

In the normal code path, this completion object is released when its completion
entry is processed. However, when an error occurs while calling

fi_write/fi_read/fi_atomic/fi_fetch_atomic/fi_compare_atomic,

there will be no completion entry from libfabric, in this case the
completion object's memory is leaked.

This patch addresses the issue by calling opal_free_list_return() in
the error handling code path.
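
The leak and its fix follow a common pattern, sketched here with a toy free list. Everything in this block is hypothetical (`completion_pool_t`, `pool_get`, `pool_return`, `rdma_put`, and the `post_fn_t` callback standing in for a libfabric call); it is not the btl/ofi code.

```c
#include <stddef.h>

#define POOL_SIZE 4

typedef struct completion {
    struct completion *next;
} completion_t;

typedef struct {
    completion_t  items[POOL_SIZE];
    completion_t *head;
    int           available;
} completion_pool_t;

static void pool_return(completion_pool_t *pool, completion_t *comp)
{
    comp->next = pool->head;
    pool->head = comp;
    pool->available++;
}

static void pool_init(completion_pool_t *pool)
{
    pool->head = NULL;
    pool->available = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        pool_return(pool, &pool->items[i]);
    }
}

static completion_t *pool_get(completion_pool_t *pool)
{
    completion_t *comp = pool->head;
    if (comp != NULL) {
        pool->head = comp->next;
        pool->available--;
    }
    return comp;
}

/* Stand-in for posting an operation; returns 0 on success. */
typedef int (*post_fn_t)(void *context);

static int post_ok(void *context)   { (void) context; return 0; }
static int post_fail(void *context) { (void) context; return -1; }

static int rdma_put(completion_pool_t *pool, post_fn_t post)
{
    completion_t *comp = pool_get(pool);
    if (comp == NULL) {
        return -1;
    }
    int rc = post(comp);
    if (rc != 0) {
        /* A failed post never produces a completion entry, so nobody else
         * will return the object: give it back here to avoid the leak. */
        pool_return(pool, comp);
        return rc;
    }
    return 0;   /* on success, the completion handler returns it later */
}
```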

cherry picked from: 01f5d68

Signed-off-by: Wei Zhang [email protected]
99f6e39 | Merge pull request open-mpi#8318 from wzamazon/v4.1.x_fix_btl_ofi_leak

[v4.1.x] btl/ofi: fix memory leaks in error handling path
9249bb5 | Merge pull request open-mpi#8308 from devreal/fix-mem-debug-build-v4.1.x

oshmem/mca/sshmem: Fix build with --enable-mem-debug [4.1.x]
984d209 | Fix external PMIx v4.x check

  • PMIx v4.x is compatible with the external v3 component.

Signed-off-by: Joshua Hursey [email protected]
(cherry picked from commit 9d72db9)
6599551 | mtl/ofi: Add missing cq_data_size in hints for ofi mtl

Fixes open-mpi#8305
Signed-off-by: Goldman, Adam [email protected]
(cherry picked from commit 1e64da9)
838568d | A started generalized request should be marked as pending.

Fixes open-mpi#8340

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit 434a251)
286f5b6 | Merge pull request open-mpi#8348 from jsquyres/pr/v4.1.x/gen-request-fix

v4.1.x: Generalized request fix
08020aa | Revert "v4.1.x: Update Slurm launch support"

Signed-off-by: Tim Wickberg [email protected]
0b350e8 | Make a managed allocation filter a hostfile/hostlist.

If the user asks for a hostfile/hostlist inside of a managed allocation,
make sure that rmaps filters these and maps processes based on them. Otherwise,
it can result in inconsistent mappings across root and compute nodes if the
user orders their hostfile differently than the resource manager.

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit e14f80d)
1678b69 | Fix bug where orte under a managed allocation does not honor -host.

For example:

$ bsub -n 40 -m "node1 node2" mpirun -np 6 -host node1:2,node2:4 hostname

would not map two hostname processes to node1 and four to node2.
Instead, it would still think that node1 and node2 each had
(for example) 20 cpu resources, and map accordingly.
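
To make the expected mapping concrete, the following sketch parses a -host value of the form used in the example. The parser is hypothetical (`host_entry_t`, `parse_host_list`), and treating a missing ":N" suffix as one slot is an assumption made for the example only.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char name[64];
    int  slots;
} host_entry_t;

/* Parse "node1:2,node2:4" into entries; returns the number of entries. */
static int parse_host_list(const char *spec, host_entry_t *out, int max)
{
    int n = 0;
    const char *p = spec;

    while (*p != '\0' && n < max) {
        size_t len = strcspn(p, ",");               /* one "name[:slots]" item */
        const char *colon = memchr(p, ':', len);
        size_t name_len = colon ? (size_t) (colon - p) : len;

        if (name_len > 0 && name_len < sizeof(out[n].name)) {
            memcpy(out[n].name, p, name_len);
            out[n].name[name_len] = '\0';
            out[n].slots = colon ? atoi(colon + 1) : 1;   /* assumed default */
            n++;
        }
        p += len;
        if (*p == ',') {
            p++;
        }
    }
    return n;
}
```

For "node1:2,node2:4" this yields two entries with 2 and 4 slots, which is the distribution the six hostname processes should follow.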

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit 35cf87a)
b868673 | Fix --debug-daemons CLI option

Signed-off-by: Joshua Hursey [email protected]
92bfc09 | Remove the orte_static_ports rollup path

  • After discussing this with Ralph we concluded that the
    original code has some deficiencies that are not worth
    preserving.
    • The optimization here was that if we have a single
      static port then we can calculate the URI of all
      of the daemons (including the HNP). Thus we do not
      have to have the daemons phone home to the HNP for
      the contact information. Instead the first message
      they receive would be the launch message.
    • This optimization path really only worked for a
      single static port, not a set of them.
    • This optimization wasn't used, as evidenced by how
      long this bug has been present.
    • Finally, in practice, it didn't really save much time
      during launch.
  • Remove the build_daemon_nidmap from the regx framework structure

Signed-off-by: Joshua Hursey [email protected]
6de8dfd | Major update to the AVX* detection and support

  1. Consistent march flag order between configure and make.

  2. op/avx: give the option to skip some tests

It is possible to skip some intrinsic tests by setting some environment variables to "no" before invoking configure:

  • ompi_cv_op_avx_check_avx512
  • ompi_cv_op_avx_check_avx2
  • ompi_cv_op_avx_check_avx
  • ompi_cv_op_avx_check_sse41
  • ompi_cv_op_avx_check_sse3

  3. op/avx: update AVX512 flags

try
-mavx512f -mavx512bw -mavx512vl -mavx512dq
instead of
-march=skylake-avx512

since the former is less likely to conflict with user provided CFLAGS
(e.g. -march=...)

Thanks Bart Oldeman for pointing this out.

  4. op/avx: have the op/avx library depend on libmpi.so

Refs. open-mpi#8323

Signed-off-by: Gilles Gouaillardet [email protected]
Signed-off-by: George Bosilca [email protected]
1c38632 | AVX code generation improvements

  1. Allow fallback to a lesser AVX support during make

Because some distros restrict the compile architecture
during make (while not setting any restrictions during configure), we
need to detect the target architecture also during make in order to
restrict the code we generate.

  2. Add comments and better protect the arch specific code.

Identify all the vectorial functions used and classify them according to
the necessary hardware capabilities.
Use these requirements to protect the code for loads and stores (the rest
of the code being automatically generated, it is more difficult to
protect).

  3. Correctly check for AVX* support.

Signed-off-by: George Bosilca [email protected]
ac6f658 | A better test for MPI_OP performance.

The test now has the ability to add a shift to all or to any of the
input and output buffers to assess the impact of unaligned operations.

Signed-off-by: George Bosilca [email protected]
b97838e | [4.1.x] orte/orted: enable OPAL's multi-thread support

This patch adds a call to opal_set_using_threads() in orted/main.c
to enable OPAL's multi-thread support.

This is needed because orted uses multiple threads.

Without OPAL's multi-thread support, OPAL_RELEASE will not use
atomic operations to update an object's reference count, which
will lead to double free.

This patch is applied to 4.1.x directly because orte has been
removed from the master branch.
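
The double free described above can be illustrated with a minimal sketch, assuming a simplified object with a single reference count. The names `object_t`, `object_retain`, and `object_release` are hypothetical and are not the OPAL API. With a plain (non-atomic) decrement, two threads can both observe the count reaching zero and both run the destructor; an atomic fetch-and-subtract lets exactly one caller see the final transition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object (not the OPAL object system). */
typedef struct {
    atomic_int refcount; /* number of live references */
    int destroyed;       /* how many times the destructor ran */
} object_t;

void object_init(object_t *o)
{
    atomic_init(&o->refcount, 1);
    o->destroyed = 0;
}

void object_retain(object_t *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

/* Returns true if this call dropped the last reference.
 * atomic_fetch_sub returns the previous value, so only one caller
 * can ever observe the 1 -> 0 transition. */
bool object_release(object_t *o)
{
    if (atomic_fetch_sub(&o->refcount, 1) == 1) {
        o->destroyed++; /* stand-in for running the destructor and freeing */
        return true;
    }
    return false;
}
```

In this sketch a retain followed by two releases runs the destructor exactly once, which is the property the atomic update is meant to preserve under concurrency.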

Signed-off-by: Wei Zhang [email protected]
f6c0a37 | Merge pull request open-mpi#8338 from jjhursey/v4.1-fix-pmix-4-ext3

Fix external PMIx v4.x check
a9c7412 | Merge pull request open-mpi#8339 from jjhursey/fix-static-port

v4.1: Fix segv when launching with static ports
90b70d5 | Merge pull request open-mpi#8355 from awlauria/managed_allocation_v4.1.x

v4.1.x: Fix a couple managed allocation issues.
5e26699 | Merge pull request open-mpi#8361 from bosilca/4.1/portable_avx

Bring the more flexible AVX* support in 4.1
2c583de | Merge pull request open-mpi#8362 from wzamazon/v4.1.x_orted_opal_use_threads

[4.1.x] orte/orted: enable OPAL's multi-thread support
71c5896 | Merge pull request open-mpi#8346 from acgoldma/ofi-hint-cq_size-v4.1.x

v4.1.x: mtl/ofi: Add missing cq_data_size in hints for ofi mtl
5caffb7 | Always specify the target architecture for AVX

icc does not define the AVX* macros if the corresponding -m architecture
flag was not provided. Thus, make sure we always provide it for icc (but not
necessarily for gcc).

Signed-off-by: George Bosilca [email protected]
6ccca86 | fbtl/posix: ensure progressing aio requests

This commit fixes a bug discovered while debugging issue open-mpi#8350. Running our testsuite on macOS revealed that posting a large number of non-blocking read/write operations leads to an error message on this platform. A fix is already available and will be committed shortly.

The issue stems from macOS limits on the number of aio_read/aio_write operations that can be pending concurrently. While the code already handled that correctly for a single request, this bug exposed that the overall limit has to be respected across all pending requests.

The solution is to invoke mca_common_ompio_progress if we cannot post new aio operations.
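
A minimal sketch of the idea, assuming a simulated queue rather than real aio calls: `MAX_PENDING`, `aio_queue_t`, `queue_progress`, and `queue_post` are hypothetical names, and `queue_progress` merely stands in for a progress function such as mca_common_ompio_progress. When the number of in-flight operations reaches the limit, posting drives progress until a slot is free.

```c
#include <assert.h>

#define MAX_PENDING 4 /* hypothetical platform-wide limit on pending ops */

typedef struct {
    int pending;   /* operations currently in flight */
    int completed; /* operations retired so far */
} aio_queue_t;

/* Stand-in for a progress function: retire one pending operation. */
void queue_progress(aio_queue_t *q)
{
    if (q->pending > 0) {
        q->pending--;
        q->completed++;
    }
}

/* Post one operation; if the limit is reached, progress until a slot frees. */
void queue_post(aio_queue_t *q)
{
    while (q->pending >= MAX_PENDING) {
        queue_progress(q);
    }
    q->pending++;
}

/* Post n operations and return the largest number ever in flight. */
int post_many(aio_queue_t *q, int n)
{
    int peak = 0;
    for (int i = 0; i < n; i++) {
        queue_post(q);
        if (q->pending > peak) {
            peak = q->pending;
        }
    }
    return peak;
}
```

The point of the sketch is that the limit is enforced across all posted operations, not per request.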

Fixes issue open-mpi#8368

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit 0c4f4e2)
a13edf7 | Merge pull request open-mpi#8373 from bosilca/4.1/icc_avx

v4.1.x: Enable AVX support with Intel compilers
53bb30f | Merge pull request open-mpi#8375 from edgargabriel/pr/aio_progress-v4.1.x

v4.1.x: fbtl/posix: ensure progressing aio requests
95bb54c | Merge pull request open-mpi#8354 from wickberg/revert-8289-cmr41/slurm

Revert "v4.1.x: Update Slurm launch support"
1a9cc28 | Add fence_nb to flux pmix

opal_common_ucx_del_proc call fails if pmix doesn't implement fence_nb

Signed-off-by: Sami Ilvonen [email protected]
(cherry picked from commit b322df2)
Signed-off-by: Geoffrey Paulsen [email protected]
65fbffa | Early selection of the best PML.

With this patch the best PML is selected earlier, before finalizing
the other PMLs. This provides a simpler mechanism to intercept and
hijack the PML (as done in the monitoring PML).

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit 668aa15)
5de4423 | mca/pml: PML check for direct modex

For direct modex, all procs publish the selected pml module
and then at add_procs pml module for each proc is checked
against every other proc in the add_proc call.
For full modex, there is no change in functionality. Only Rank0
publishes its selected pml, all other procs in the add_proc call
check their selected pml against Rank0.
If the PMLs do not match, throw an error and exit.

Signed-off-by: Dipti Kothari [email protected]
(cherry picked from commit 5418cc5)
eed4157 | MPI_Init_thread(3): update refs about MPI_THREAD_MULTIPLE

Thanks to Andreas Lösel for bringing the outdated docs to our
attention.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 0a52936)
1abbedd | MPI_Init_thread(3): fix statement about C++ binding

Thanks to Andreas Lösel for raising the inaccurate statement to our
attention.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 28508eb)
56eb572 | Adjust copyrights

Signed-off-by: Ralph Castain [email protected]
e455f7a | Merge pull request open-mpi#8401 from jsquyres/pr/v4.1.x/mpi-thread-multiple-man-page-updates

v4.1.x: MPI_Init_thread(3) man page updates
50974d5 | Merge pull request open-mpi#8397 from rhc54/cmr41/pml

v4.1.x: Update the PML selection/check logic to avoid direct modex "storms"
a15f2b9 | PML/UCX: don't do pml_check_selected call

Current implementation of pml check protocol causes extra
dmodex exchanges that may result in a significant performance
degradation for some workloads

(corresponds to master 36b64cb)

Signed-off-by: Valentin Petrov [email protected]
4421fce | Merge pull request open-mpi#8405 from vspetrov/v4.1.x_pml_ucx_check_selected_fix

V4.1.x PML/UCX: don't do pml_check_selected call
3248a77 | opal: disable the __atomic built-in atomics by default on AArch64

Benchmarks are showing better performance when not using the __atomic
built-ins on this arch. This PR disables them by default for this
architecture only.

Signed-off-by: Nathan Hjelm [email protected]
16f4797 | Merge pull request open-mpi#8413 from hjelmn/v4.1.x_use_hand_written_atomics_for_aarch64_by_default_because_the_builtins_are_inferior_for_our_usage

opal: disable the __atomic built-in atomics by default on AArch64
e51abf5 | config: Stash known-good copies of config.guess|sub

Download config.guess|sub from
https://git.savannah.gnu.org/gitweb/?p=config.git (at hash
6faca61810d335c7837f320733fe8e15a1431fc2) in order to fix
open-mpi#8410.

A future commit will install these files if they are newer than what
Autoconf installs.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 0ad2e81)
d6a5ac4 | autogen: use newer config.sub|guess if available

Per open-mpi#8410, have autogen.pl
check each config.sub and config.guess that it finds with a known-good
version of that file if the known-good version has a timestamp version
that is newer than what Autoconf installed.

We also skip updating anything in the 3rd-party tree; we don't really
want to mess with those packages.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 4a002ce)
4779d8e | Merge pull request open-mpi#8421 from jsquyres/pr/v4.1.x/autogen-config-dot-wot-updates-for-apple-m1

v4.1.x: Use newer config.guess / config.sub files when relevant
c049164 | Let Slurm know that our daemons are not MPI tasks

Signed-off-by: Ralph Castain [email protected]
70d13e5 | Make sure MPIR_Breakpoint() is compiled without CFLAGS.

In optimized builds, CFLAGS contains various optimizations such as -O3,
and is propagated by automake to all files. To work-around this,
isolate MPIR_Breakpoint() and other MPIR_* symbols into its own library
built with debugger specific CFLAGS.

To prevent CFLAGS from being polluted elsewhere in the make tree, build
this in its own tiny stand-alone makefile.

Fixes open-mpi#7757

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit 6d82003)
2974b60 | Merge pull request open-mpi#8423 from rhc54/cmr41/slm

Let Slurm know that our daemons are not MPI tasks
08b1f2d | Merge pull request open-mpi#8428 from awlauria/fix_mpir_breakpoint_v4.1.x

v4.1.x: Make sure MPIR_Breakpoint() is compiled without CFLAGS.
2823d97 | Merge pull request open-mpi#8394 from gpaulsen/topic/v4.1.x/AddFence_nmToFluxPmix

Add fence_nb to flux pmix
a4c1fd8 | op_avx: use MCA enum flags instead of integer values

MCA enums make it easier for users to see/set MCA flag values. Also
add "op_avx_capabilities" read-only MCA var that shows what is supported
on your platform, regardless of what value the user has set in
"op_avx_support".

For example, ompi_info shows all the valid values:

$ ompi_info --all --parsable | grep _avx_
mca:op:avx:param:op_avx_capabilities:value:SSE,SSE2,SSE3,SSE4.1,AVX
mca:op:avx:param:op_avx_capabilities:source:default
mca:op:avx:param:op_avx_capabilities:status:read-only
mca:op:avx:param:op_avx_capabilities:level:4
mca:op:avx:param:op_avx_capabilities:help:Level of SSE/MMX/AVX support available in the current environment
mca:op:avx:param:op_avx_capabilities:enumerator:value:1:SSE
mca:op:avx:param:op_avx_capabilities:enumerator:value:2:SSE2
mca:op:avx:param:op_avx_capabilities:enumerator:value:4:SSE3
mca:op:avx:param:op_avx_capabilities:enumerator:value:8:SSE4.1
mca:op:avx:param:op_avx_capabilities:enumerator:value:16:AVX
mca:op:avx:param:op_avx_capabilities:enumerator:value:32:AVX2
mca:op:avx:param:op_avx_capabilities:enumerator:value:256:AVX512F
mca:op:avx:param:op_avx_capabilities:enumerator:value:512:AVX512BW
mca:op:avx:param:op_avx_capabilities:deprecated:no
mca:op:avx:param:op_avx_capabilities:type:int
mca:op:avx:param:op_avx_capabilities:disabled:false
mca:op:avx:param:op_avx_support:value:SSE,SSE2,SSE3,SSE4.1,AVX
mca:op:avx:param:op_avx_support:source:default
mca:op:avx:param:op_avx_support:status:read-only
mca:op:avx:param:op_avx_support:level:4
mca:op:avx:param:op_avx_support:help:Level of SSE/MMX/AVX support to be used, capped by the local architecture capabilities
mca:op:avx:param:op_avx_support:enumerator:value:1:SSE
mca:op:avx:param:op_avx_support:enumerator:value:2:SSE2
mca:op:avx:param:op_avx_support:enumerator:value:4:SSE3
mca:op:avx:param:op_avx_support:enumerator:value:8:SSE4.1
mca:op:avx:param:op_avx_support:enumerator:value:16:AVX
mca:op:avx:param:op_avx_support:enumerator:value:32:AVX2
mca:op:avx:param:op_avx_support:enumerator:value:256:AVX512F
mca:op:avx:param:op_avx_support:enumerator:value:512:AVX512BW
mca:op:avx:param:op_avx_support:deprecated:no
mca:op:avx:param:op_avx_support:type:int
mca:op:avx:param:op_avx_support:disabled:false
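
The enumerator values in the output above are bit flags, so a value such as SSE,SSE2,SSE3,SSE4.1,AVX corresponds to the integer 31. The following is a minimal sketch of that mapping, assuming only the value/name pairs listed above; `flag_name_t` and `flags_to_string` are hypothetical helpers, not part of the MCA variable system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int value;
    const char *name;
} flag_name_t;

/* Value/name pairs copied from the ompi_info enumerator output. */
static const flag_name_t avx_flags[] = {
    {1, "SSE"},   {2, "SSE2"},  {4, "SSE3"},      {8, "SSE4.1"},
    {16, "AVX"},  {32, "AVX2"}, {256, "AVX512F"}, {512, "AVX512BW"},
};

/* Write the comma-separated names of the bits set in flags into buf.
 * Stops early if buf is too small; buf is always NUL-terminated. */
char *flags_to_string(int flags, char *buf, size_t len)
{
    size_t used = 0;
    if (len == 0) {
        return buf;
    }
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(avx_flags) / sizeof(avx_flags[0]); i++) {
        if (!(flags & avx_flags[i].value)) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "," : "", avx_flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            break; /* output truncated */
        }
        used += (size_t)n;
    }
    return buf;
}
```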

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 566412e)
032871b | Merge pull request open-mpi#8441 from jsquyres/pr/v4.1.x/op-avx-enum

v4.1.x: op_avx: use MCA enum flags instead of integer values
36f82a1 | common_ompio_file_set_view: fix handling of MPI_DISPLACEMENT_CURRENT

If MPI_MODE_SEQUENTIAL was used when opening the file, the special displacement MPI_DISPLACEMENT_CURRENT
has to be used during file_set_view. The displacement is set to the current position of the shared file pointer
in this case. It is illegal to use MPI_DISPLACEMENT_CURRENT unless amode MPI_MODE_SEQUENTIAL was used.

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit 6168dde)
5ea77d5 | fbtl_posix_progress: aio_return can indicate partial completion

aio_return returns the number of bytes written/read, and can indicate a partial completion.
This fix ensures that a partially completed aio_read/write operation is reposted correctly.
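
As a rough illustration of handling partial completion, the sketch below keeps reissuing the remainder of a transfer until all bytes are moved. It does not use real aio calls: `io_fn`, `short_copy`, and `transfer_all` are hypothetical names, and `short_copy` is a fake primitive that deliberately moves at most 3 bytes per call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* I/O primitive: returns the number of bytes moved, possibly fewer than len. */
typedef ssize_t (*io_fn)(char *dst, const char *src, size_t len);

/* Fake primitive that completes at most 3 bytes per call. */
ssize_t short_copy(char *dst, const char *src, size_t len)
{
    size_t n = len < 3 ? len : 3;
    memcpy(dst, src, n);
    return (ssize_t)n;
}

/* Reissue the operation for the remaining bytes after each partial completion.
 * Returns the total bytes moved, or -1 on error; *calls counts the attempts. */
ssize_t transfer_all(io_fn io, char *dst, const char *src, size_t len, int *calls)
{
    size_t done = 0;
    *calls = 0;
    while (done < len) {
        ssize_t r = io(dst + done, src + done, len - done);
        (*calls)++;
        if (r < 0) {
            return -1;
        }
        if (r == 0) {
            break; /* no further progress possible */
        }
        done += (size_t)r; /* advance past the part that completed */
    }
    return (ssize_t)done;
}
```

With the fake primitive, a 10-byte transfer takes four attempts (3 + 3 + 3 + 1 bytes).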
Fixes the i_bigtype test from the mpich testsuite.

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit 79561ee)
2d8c2d4 | common_ompio_file_set_view: recognize negative disp in access

You cannot access parts of a file if the file view contains a description
that leads ultimately to a negative offset. This fix ensures that
we return an error in this case (MPI_ERR_IO).

This fix was triggered by an investigation into mpich/test/mpi/io/tst_fileview testcase.
Running this test with ompio still leads to a number of 'failures' since
we return MPI_ERR_TYPE in some instances, while the testcode expects MPI_ERR_IO.
I will not fix those issues, since this is like playing whack-a-mole (fixing the error
code expected by one testsuite makes another testsuite fail). However, this commit
fixes the one case where we returned MPI_SUCCESS instead of an error code.

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit 0761c0b)
d38ed16 | Merge pull request open-mpi#8445 from edgargabriel/pr/v4.1.x-mpich3.4-tstsuite-fixes

Pr/v4.1.x mpich3.4 tstsuite fixes
e076122 | osc/rdma: ensure bml add_procs has been called for all local procs

This fixes a bug when ob1 was not selected as the pml but osc/rdma may be
selected for an MPI window. In some cases we may use btl/sm. If this is the
case we need to ensure btl/sm knows about all the local procs (not just the
ones in the communicator). This is required for btl/sm to correctly function
at this time.

In the future btl/sm should be made more resilient.

Fixes open-mpi#8434

Signed-off-by: Nathan Hjelm [email protected]
(cherry picked from commit 8040d05)
ccd71d5 | op_avx: Fix MCA enum flags

Remove accidental double registration (which resulted in a double
RELEASE/free).

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 4d4957c)
7e7862e | common/ofi: Use opal_show_help() to call out lack of locality info

opal_show_help() can dedup output across ranks when using mpirun. Print
the help text only when the OFI component's verbosity is >= 10.

Signed-off-by: Raghu Raja [email protected]
aa7f9d3 | Merge pull request open-mpi#8460 from jsquyres/pr/v4.1.x/fix-avx-enum-flags

v4.1.x: op_avx: Fix MCA enum flags
aaed1a2 | Merge pull request open-mpi#8453 from hjelmn/v4.1.x_call_add_procs_on_all_allocated_procs_if_osc_rdma_is_selected_in_case_btl_ucx_is_in_use_for_two_sided_really_i_need_to_determine_the_best_way_to_ensure_that_vader_works_in_this_case_without_slowing_it_down

osc/rdma: ensure bml add_procs has been called for all local procs
380ac96 | Merge pull request open-mpi#8447 from rajachan/package_rank_help

common/ofi: Use opal_show_help() to call out lack of locality info
2511f97 | Merge pull request open-mpi#7899 from bureddy/v4.1.x-cuda-ucx

v4.1.x: UCX: initialize cuda from ucx pml component
e6fb42b | NEWS and VERSION updates for 4.1.1rc1

Signed-off-by: Raghu Raja [email protected]
Co-authored-by: Jeff Squyres [email protected]
34075d3 | Merge pull request open-mpi#8446 from rajachan/411rc1-news

NEWS and VERSION updates for 4.1.1rc1
0e36e48 | Update PMIx to v3.2.3

Signed-off-by: Ralph Castain [email protected]
ae74eaa | Merge pull request open-mpi#8477 from rhc54/cmr41/pmx

v4.1.x: Update PMIx to v3.2.3
bf62e33 | First cut at Git commit checks as Github Actions

This will replace the old "Signed-off-by checker" and "Commit email
checker". Both of those checks are now subsumed into this Github
Action, and we also introduce a new functionality: checking the
"cherry picked from commit xyz" messages (slightly obfuscated here in
the commit message so that it does not cause the test to fail!).

  1. If a cherry picked from commit abc123 message is found in a git
    commit message, verify that that commit actually exists in the main
    Open MPI repo. If it doesn't, fail the CI test.
  2. The config file in the git repo
    .github/workflows/git-commit-checks.json indicates whether
    cherry-pick messages are required in commit messages.
    1. The contents of that file on the target branch determine
      whether cherry pick messages are required on that branch or
      not. Meaning: we'll set the contents of that file to not
      require cherry pick messages on master. When we branch for
      releases, we change that config file on the new branch to
      require cherry pick messages.
    2. When cherry pick messages are required and the PR contains
      commits that do not have cherry pick messages, fail the CI
      test.
    3. When cherry-pick messages are required, Reverts, Merge commits,
      and commits that are entirely comprised of submodule updates
      are explicitly excluded from this requirement. Example:
      1. A PR is created to a target branch with the cherry pick
        message requirement enabled.
      2. The PR branch contains commits with (cherry picked from commit ....) messages, and the commit hashes mentioned all
        exist on master.
      3. The PR branch contains a Revert commit.
      4. The PR branch contains a Merge commit.
      5. The CI test will pass.
    4. If a magic token is present in the PR description (e.g.,
      bot:notacherrypick), then the requirement for cherry pick
      messages is disabled for all commits on that PR.
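
The first step of such a check, finding the hash in a commit message, might look like the sketch below. This is an illustration only: `find_cherry_pick` is a hypothetical helper, the actual checker is not shown in this log, and verifying that the hash exists in the main repository is out of scope here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Look for "(cherry picked from commit <hex>)" in msg and copy the hash
 * into out. Returns 1 if a well-formed marker was found, 0 otherwise. */
int find_cherry_pick(const char *msg, char *out, size_t outlen)
{
    static const char marker[] = "(cherry picked from commit ";
    const char *p = strstr(msg, marker);
    size_t n = 0;

    if (p == NULL || outlen == 0) {
        return 0;
    }
    p += sizeof(marker) - 1; /* skip past the marker text */
    while (isxdigit((unsigned char)p[n]) && n + 1 < outlen) {
        n++;
    }
    if (n == 0 || p[n] != ')') {
        return 0; /* no hash, or hash not closed by ')' */
    }
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}
```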

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit f54b614)
a404cb9 | git-commit-check: fix typo

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit d4d1bcf)
2f042fb | git-commit-checker: require cherry picks on this branch

Signed-off-by: Jeff Squyres [email protected]
3230c11 | Merge pull request open-mpi#8482 from jsquyres/pr/v4.1.x/git-commit-checker-github-action

v4.1.x: git commit checker GitHub action
9dcaab4 | osc/rdma: fix errors in derived datatype handling for accumulate

This commit fixes a number of bugs in the handling of derived
datatypes when using MPI_Accumulate, MPI_Get_accumulate, and
MPI_Fetch_and_op. The following bugs are fixed:

  • Incorrect results when using MPI_OP_NO_OP with a 0 origin
    count. osc/rdma was not ignoring the source address, count, and
    datatype in the MPI_OP_NO_OP case as is required by the standard.

  • Correctly handle result_buffer=MPI_BOTTOM in
    ompi_osc_rdma_gacc_local().

  • Test result_datatype in order to figure out whether a get is
    performed.

References open-mpi#6275

Signed-off-by: Gilles Gouaillardet [email protected]
Signed-off-by: Nathan Hjelm [email protected]
(cherry picked from commit 3ccf7e3)
0addf8f | osc/rdma: rearrange accumulate code

This commit rearranges the accumulate code so that network AMOs can be
used in a larger number of potential situations. This commit adds a
new MCA variable: osc_rdma_network_max_amo. This variable controls the
maximum datatype count where AMOs will be used. The old default for
this support was count == 1. The new default is count == 32.

Signed-off-by: Nathan Hjelm [email protected]
(cherry picked from commit 2fb5f55)
dd6e673 | osc/rdma: remove extra retain on fop

This commit fixes a resource leak when using network atomics. There
was a leak when the underlying BTL did not support the attempted
atomic.

Signed-off-by: Nathan Hjelm [email protected]
(cherry picked from commit 9b2ed1e)
6d89d39 | osc/rdma: fix amo-based accumulate

This fixes an issue in osc/rdma when AMOs are used for accumulate operations
(vs get-accumulate). In this case a temporary buffer is needed to hold the
result of the operation (since fetching atomics are used). This buffer was
1) zero-sized, and 2) not freed. Both of these issues are fixed in this
commit. There was also an issue with unpacking due to using an uninitialized
convertor. The convertor is no longer passed in when no result is required.

Signed-off-by: Nathan Hjelm [email protected]
(cherry picked from commit d55504f)
041642c | OSC/RDMA: fix typo in btl selection logic

Signed-off-by: Howard Pritchard [email protected]
(cherry picked from commit 1d4f6bf)
5a66363 | osc/rdma: Tighten up concurrent memory region access.

There were some instances where the exclusive lock needed some
tightening around the region structure.

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit 7856a25)
87b6994 | Merge pull request open-mpi#8503 from awlauria/osc_fixes_backport_v4.1.x

Osc fixes backport v4.1.x
3b9bff9 | fs/lustre: Remove unneeded includes

The functionality was migrated to fs/base/fs_base_get_parent_dir.c long
ago, but the includes stayed. In lustre 2.14, however, lustre_user.h
moved the inclusion of linux/fs.h outside the __KERNEL__ guard. This
now triggers Debian bug #898743 [1], which states that including
sys/mount.h after linux/fs.h breaks compilation. Thus the include
removal also avoids this breakage.

Closes open-mpi#8508.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898743

Signed-off-by: Bert Wesarg [email protected]
(cherry picked from commit 5b525b2)
c99b815 | ucx: disable version 1.8

Signed-off-by: Yossi Itigin [email protected]
(cherry picked from commit b49cbf4)
939488c | pml/ob1: fix build issue in CUDA path

open-mpi@916c29a

Signed-off-by: Aboorva Devarajan [email protected]
(cherry picked from commit aaffafc)
a33e1fa | Merge pull request open-mpi#8517 from yosefe/topic/ucx-disable-version-1-8-v4.1.x

v4.1.x: ucx: disable version 1.8
cd13930 | Prevent the establishment of new BTL connections during matching handshake

Prevent a "deadlock" scenario, when one of the processes leaves the
matching before the ack has been sent to the peer. Such a scenario has
been described by @bwbarrett in open-mpi#8498.

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit 916c29a)
Signed-off-by: Brian Barrett [email protected]
d78c1bc | Merge pull request open-mpi#8534 from bwbarrett/backports/v4.1.x-pml-ack-on-send-btl

Prevent the establishment of new BTL connections during matching
feadc68 | Bull update of coll/han : added barrier, a 'simple' scatter, some Doxygen and some fixes

This completes and fixes current code for coll/Han:

  • a barrier is added
  • a "simple" scatter is added
  • some Doxygen documentation is added

Fix:

  • compilation and cppcheck warnings
  • corner case errors when parsing a rule file

Signed-off-by: Emmanuel Brelle [email protected]
(cherry picked from commit ca663de)
953b9ad | Merge pull request open-mpi#8521 from AboorvaDevarajan/fix_build_v4.1.x

v4.1.x: pml/ob1: fix build issue in CUDA path
0f77818 | Merge pull request open-mpi#8487 from EmmanuelBRELLE/v4.1.x_Bull_2020_update_for_Han

v4.1.x: Bull 2020 update of coll/han
3dd378c | git-commit-checks: use a better name

Github shows both the "outer" and "inner" names on the CI line in the
Github web UI, so make sure to give good names for both.

Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 73f2926)
8621352 | Merge pull request open-mpi#8541 from jsquyres/pr/v4.1.x/git-commit-checker-better-name

v4.1.x: git commit checker better name
c36d745 | ucx: check supported transports and devices for setting priority

Add "pml_ucx_tls" parameter to control the transports to include or
exclude (with a '^' prefix). UCX will be enabled only if one of the
included transports is present, or if a non-excluded transport is
present (in case of exclude-list with a '^' prefix).

Add "pml_ucx_devices" parameter to control the devices which make UCX
component set a high priority for itself. If none of the listed devices
is present, the priority will be set to 19 - lower than ob1 priority.
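
The include/exclude rule for the transport list can be sketched as follows. This is a minimal illustration assuming a comma-separated list with an optional leading '^'; `list_contains` and `ucx_enabled` are hypothetical helpers, not the pml/ucx implementation, and the transport names in any usage are examples only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name appears as a whole comma-separated entry in list. */
bool list_contains(const char *list, const char *name)
{
    size_t nlen = strlen(name);
    const char *p = list;
    while (*p != '\0') {
        size_t elen = strcspn(p, ","); /* length of the current entry */
        if (elen == nlen && strncmp(p, name, nlen) == 0) {
            return true;
        }
        p += elen;
        if (*p == ',') {
            p++;
        }
    }
    return false;
}

/* spec is "a,b" (include-list) or "^a,b" (exclude-list).
 * Enabled if an included transport is available, or, for an
 * exclude-list, if any available transport is not excluded. */
bool ucx_enabled(const char *spec, const char *const *available, size_t n)
{
    bool exclude = (spec[0] == '^');
    const char *list = exclude ? spec + 1 : spec;
    for (size_t i = 0; i < n; i++) {
        bool listed = list_contains(list, available[i]);
        if (exclude ? !listed : listed) {
            return true;
        }
    }
    return false;
}
```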

Signed-off-by: Yossi Itigin [email protected]
(cherry picked from commit 562c57c)
ebe8e76 | Merge pull request open-mpi#8549 from yosefe/topic/pml-ucx-set-priority-v4.1.x

v4.1.x: ucx: check supported transports and devices for setting priority
9d9b98a | Check for librt when building LSF support

Signed-off-by: Joshua Hursey [email protected]
Co-authored-by: Alexei Colin [email protected]
1cfce38 | Merge pull request open-mpi#8515 from edgargabriel/pr/fix-lustre-2.14-v4.1.x

v4.1.x: fs/lustre: Remove unneeded includes
9369fbb | LSF Config: Cleanup logic

Signed-off-by: Joshua Hursey [email protected]
a093431 | Merge pull request open-mpi#8564 from jjhursey/v4.1-fix-lsf-conf

v4.1: Check for librt when building LSF support
73a4949 | SPML/UCX: removed direct dependency to SPML UCX

  • added synchronise_quiet parameter to local
    context object

Signed-off-by: Sergey Oblomov [email protected]
(cherry picked from commit 01d7164)
88272fb | ompi/group: fix proc pointer comparison in groups

To avoid comparing sentinel process pointers against the original ompi_proc_t
pointers, compare the processes in the groups using process names.

Signed-off-by: Aboorva Devarajan [email protected]
(cherry picked from commit 0f2c70c)
e4be2d2 | Fix case where debuggers cannot read the MPIR proctable.

Make sure the definition of the MPIR_Proctable
is in a header file that is included in the file
orted_mpir_breakpoint.c, which is compiled with -g
and compiled without optimizations.

Otherwise, the debugger (such as gdb) won't know
the complete definition of the proctable, preventing
it from being able to read it.

Since the MPIR_proctable should be accessed from
orted_submit.c and orted_mpir_breakpoint.c, move it
to the mpir_orted.h header file.

See issue: open-mpi#8563

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit a71fbaf)
7667614 | Merge pull request open-mpi#8571 from awlauria/mpir_fix_v4.1

v4.1.x: Fix case where debuggers cannot read the MPIR proctable.
ca8b695 | Merge pull request open-mpi#8609 from AboorvaDevarajan/fix_group_mt_v4.1.x

v4.1.x: ompi/group: fix proc pointer comparison in groups
29e6467 | Merge pull request open-mpi#8584 from hoopoepg/topic/ucx-atomic-dependencies-v4.1

SPML/UCX: removed direct dependency to SPML UCX - v4.1
5e07bd4 | A new binomial scatter using packed data on intermediary processes.

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit 21e4d87)
fe163d4 | Merge pull request open-mpi#8618 from jsquyres/pr/v4.1.x/coll-base-scatter-fixes

v4.1.x: A new binomial scatter using packed data on intermediary processes.
0c02983 | gcc_builtin: fix performance regression on x86_64

In order to work around a bug in older gcc versions on x86_64,
__atomic_thread_fence (__ATOMIC_SEQ_CST)
was replaced with
__atomic_thread_fence (__ATOMIC_ACQUIRE)
based on the assumption that this did not introduce performance regressions.

It was recently found that this did introduce some performance regression,
mainly at scale on fat nodes.

So simply use an asm memory clobber to both workaround older gcc bugs
and fix the performance regression.
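
For reference, the three constructs mentioned above can be written as small helpers. This is a sketch assuming a GCC-compatible compiler; the function names are hypothetical and the single-threaded `publish_then_read` only shows where such barriers would sit, it does not demonstrate any ordering effect.

```c
#include <assert.h>

/* Compiler-only barrier: the "memory" clobber tells the compiler not to
 * reorder or cache memory accesses across this point. The asm body is empty. */
void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Acquire fence via the GCC __atomic built-in. */
void acquire_fence(void)
{
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
}

/* Sequentially consistent fence via the GCC __atomic built-in. */
void seq_cst_fence(void)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/* Write data, then a flag, with a barrier in between; read them back. */
int publish_then_read(int *data, int *flag)
{
    *data = 42;
    compiler_barrier(); /* keep the data store ahead of the flag store */
    *flag = 1;
    acquire_fence();
    return *flag ? *data : -1;
}
```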

Thanks S. Biplab Raut for bringing this issue to our attention.

Refs. open-mpi#8603

Signed-off-by: Gilles Gouaillardet [email protected]

(cherry picked from commit d7e3f87)
455a28d | Fix/Cleanup the return value documentation for mpirun

Signed-off-by: Joshua Hursey [email protected]
58a6e10 | Merge pull request open-mpi#8623 from ggouaillardet/topic/v4.1.x/gcc_builtin_workaround

v4.1.x: gcc_builtin: fix performance regression on x86_64
d3d1e18 | Merge pull request open-mpi#8627 from jjhursey/v4.1-fix-orterun-man

v4.1.x: Fix/Cleanup the return value documentation for mpirun
9f6d473 | NEWS updates for v4.1.1rc2

Signed-off-by: Raghu Raja [email protected]
da66bc1 | coll/base: reduce memory consumption in Scatter

This PR reduces memory consumption in non-root and non-leaf processes of binomial tree algorithm for Scatter operation.
(cherry picked from commit a2cd6a9)

Signed-off-by: Mikhail Kurnosov [email protected]
263fbb1 | VERSION updates for v4.1.1rc2

Signed-off-by: Raghu Raja [email protected]
4153975 | Merge pull request open-mpi#8636 from mkurnosov/v4.1.x/scatter-fix-tmpbuf

v4.1.x: coll/base: reduce memory consumption in Scatter
cf7d9d2 | Merge pull request open-mpi#8631 from rajachan/411rc2

NEWS and VERSION updates for 4.1.1rc2
94e98cb | Fix man page for MPI_Win_attach

The text seems to have been copied from MPI_Win_allocate and was
thus incorrect.

Signed-off-by: Joseph Schuchart [email protected]
(cherry picked from commit 22cccd8)
d5570af | pml/ucx: ignore request leak by default, override by mca param

Signed-off-by: Yossi Itigin [email protected]
(cherry picked from commit 6672d07)
3c7ea15 | Powerpc atomics: Force usage of powerpc assembly.

The builtins used by default on Power have been
shown to perform poorly. For the time being, force
all compilers to use the inline assembly until
atomic builtins catch-up.

This changes the defaults for all compilers sans xl, including:
gcc, clang, and pgi to use the assembly.

Previously, all of the above were using C11 or
the gcc builtins.

Bonus:
Add a configure flag to force Power machines to use
the builtins/C11, depending on what is available. This
will make future testing easier.

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit e3f3c5b)
881df53 | Merge pull request open-mpi#8645 from devreal/fix-mpi-win-attach-man-v4.1.x

Fix man page for MPI_Win_attach [4.1.x]
e092034 | Merge pull request open-mpi#8708 from awlauria/ppc_atomics_v4.1.x

v4.1.x: Powerpc atomics: Force usage of powerpc assembly.
0e20abd | Merge pull request open-mpi#8706 from yosefe/topic/pml-ucx-ignore-request-leak-by-default-v4.1.x

pml/ucx: ignore request leak by default, override by mca param
d259c51 | ofi: fix typo in macro name

This is a one-off commit for the release branch.

Signed-off-by: Gilles Gouaillardet [email protected]
95ea528 | Merge pull request open-mpi#8751 from ggouaillardet/topic/v4.1.x/pmix_package_rank

ofi: fix typo in macro name
0728e60 | Fix language text for example

Code snippet appears to be C not Fortran.

Signed-off-by: Harumi Kuno [email protected]
(cherry picked from commit 8c8b7bd)
fabf95e | Fix .so filenames

Actual file names have substring: xor_to_all

Signed-off-by: Harumi Kuno [email protected]
(cherry picked from commit aaf69b7)
d805705 | Merge pull request open-mpi#8784 from hkuno/pr/v4.1.x

v4.1.x: man pages updates
b8a8096 | atomic/gcc_builtin: only apply the workaround when required.

A performance regression was reported when using the workaround
__asm__ __volatile__("" : : : "memory");
instead of
__atomic_thread_fence(__ATOMIC_ACQUIRE);
on a large SMP with recent GCC compiler.

So only use the workaround on x86_64 when a busted GCC compiler is used.

Thanks S. Biplab Raut for reporting this issue.

Signed-off-by: Gilles Gouaillardet [email protected]

(back-ported from commit open-mpi/ompi@711c8c2)
0f312ca | Merge pull request open-mpi#8795 from ggouaillardet/topic/v4.1.x/busted_atomic_revamps

v4.1.x: atomic/gcc_builtin: only apply the workaround when required.
98ac708 | Add the userid to the vader backing file path

Fixes open-mpi#7308

Signed-off-by: Ralph Castain [email protected]
(cherry picked from commit 88be263)
887d46c | Retrieve cpuset when configured with pmix rte

When configured --with-ompi-pmix-rte, ensure the pmix_process_info_t
structure has a cpuset entry and that it gets set.

Signed-off-by: Ralph Castain [email protected]
d7797df | Merge pull request open-mpi#8804 from rhc54/cmr41/vdr

Add the userid to the vader backing file path
532998c | OSHMEM/SEGMENT-REGISTRATION: added segment filtering

  • all file-mapped segments are filtered by start-process
    segments only
  • segments from other modules are ignored

Signed-off-by: Sergey Oblomov [email protected]
(cherry picked from commit b22e275)
70eec6a | mtl/ofi: Disable CUDA convertor for specified ofi providers

This patch is only in v4.x as code in v5.x was rewritten to use FI_HMEM
and there is no plan to backport the related patches.

Refs: 8762
Signed-off-by: Goldman, Adam [email protected]
dc42f40 | Merge pull request open-mpi#8821 from acgoldma/v4.1.x-ofi-cuda-perf

[v4.1.x] mtl/ofi: Disable CUDA convertor for specified ofi providers
a9efd68 | Merge pull request open-mpi#8815 from hoopoepg/topic/oshmem-start-proc-segment-filter-v4.1

OSHMEM/SEGMENT-REGISTRATION: added segment filtering - V4.1
8151426 | Merge pull request open-mpi#8805 from rhc54/cmr41/cpus

Retrieve cpuset when configured with pmix rte
833f2e7 | dist: Prep for 4.1.1rc3

Signed-off-by: Brian Barrett [email protected]
9a4fa40 | Merge pull request open-mpi#8830 from bwbarrett/release/v4.1.1

dist: Prep for 4.1.1rc3
0b3c1d9 | pmix/pmix3x: Fix internal PMIx discovery logic.

See open-mpi#8823 for the details.

Signed-off-by: Artem Polyakov [email protected]
2210251 | pmix: Fix detection of Externally-built PMIx

See open-mpi#8823 for more details.

Signed-off-by: Artem Polyakov [email protected]
9ed670e | Always include the stddef.h header.

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit e8ebe13)
454d071 | Fix error with stricter quoting requirements of autoconf-2.70

Signed-off-by: Christoph Niethammer [email protected]
(cherry picked from commit 9901325)
4b2ad7f | Fix "variadic macros" warning.

Signed-off-by: Austen Lauria [email protected]
(cherry picked from commit ef28e8d)
2e583f4 | Reenable the heterogeneous support.

This commit fixes the support for heterogeneous environments and
specifically for external32. The root cause was that during the datatype
optimization process types that are contiguous in memory are collapsed
together in order to decrease the number of conversion (or memcpy)
function calls. The resulting type however, does not have the same
conversion rules as the types it replaced, leading to an incorrect (or
absent) conversion in some cases. This patch marks the datatypes where
types have been collapsed during the optimization process with a flag,
allowing the convertor to detect if the optimized type can be used in
heterogeneous setups.

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit 73d64cb)
06f3364 | Fixing the partial pack unpack issue.

When unpacking a partial predefined element, check the boundaries of the
description vector type and adjust the memory pointer accordingly (to
reflect not only when a single basic type was correctly unpacked, but
also when an entire blocklen has been unpacked).

Signed-off-by: George Bosilca [email protected]
(cherry picked from commit fb07960)
565d72e | Fix the Makefile to include the correct test.

Signed-off-by: George Bosilca [email protected]

bot:notacherrypick
6ce9b9d | Merge pull request open-mpi#8832 from artpol84/topic/v4.1.x/fix_pmix_detection

pmix/pmix3x: Fix internal PMIx discovery logic.
a01165f | Merge pull request open-mpi#8837 from bosilca/backport/datatype

Backport/datatype
40abb13 | dist: Update VERSION and README for v4.1.1rc4

Signed-off-by: Brian Barrett [email protected]
a8dd870 | Merge pull request open-mpi#8839 from bwbarrett/release/v4.1.1

dist: Update VERSION and README for v4.1.1rc4

📝 Please access here to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment: /check-cla to verify. Thanks.


  • If you've already signed a CLA, it's possible you're using a different email address for your gitee account. Check your existing CLA data and verify the email.
  • If you signed the CLA as an employee or a member of an organization, please contact your corporation or organization to verify you have been activated to start contributing.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please feel free to file an issue.

@CLAassistant

CLAassistant commented Nov 30, 2021

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 17 committers have signed the CLA.

✅ JKLiang9714
❌ jsquyres
❌ rajachan
❌ bosilca
❌ jjhursey
❌ mkurnosov
❌ ggouaillardet
❌ devreal
❌ hoopoepg
❌ yosefe
❌ hkuno
❌ rhc54
❌ awlauria
❌ artpol84
❌ cniethammer
❌ bwbarrett
❌ acgoldma

@ChenQiangFYQ merged commit a859eb9 into kunpengcompute:huawei on Nov 30, 2021