-
Notifications
You must be signed in to change notification settings - Fork 937
Updated ABI generation code and new libraries #13280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
8fbf968 to
29b1a2f
Compare
da515d4 to
340e7ad
Compare
|
Maybe you should somehow vendor the In short, I think it would be in everyone's convenience to use https://github.com/mpi-forum/mpi-abi-stubs as the "source of truth" for ABI-related stuff, avoiding manual synchronization of handle/constant values. |
a9b105c to
cefbde8
Compare
|
I wonder if we should make the bot not complain about unsigned commits on draft PRs. That would reduce some of the noise on PR's like this. |
cefbde8 to
d5663f7
Compare
Turns out that in commit 6bd36a7 we had a function that is not part of the MPI standard. This showed while working on ABI support - which requires us to pay attention to the truth rather than make stuff up. This commit removes our made up MPI_Session_set_info method. Turns out who ever was doing the fortran bindings knew this wasn't a method in the standard so there's no need to change the fortran bindings. Same thing applies to the man pages. Related to open-mpi#13280 Signed-off-by: Howard Pritchard <[email protected]>
65e34c9 to
bd2710f
Compare
|
@hppritcha There is some issue with out-of-source builds, i.e |
|
interesting distcheck didn't check this. |
|
jenkins ci runs make distcheck |
|
The ABI mpi.h header is missing some MPI_T_XXX types. Also, the Status f08/c converters are declared, and they should not. I did the following manual edits to the installed mpi.h header: diff -up ./mpi.h.orig ./mpi.h
--- ./mpi.h.orig 2025-08-28 18:55:49.842968779 +0300
+++ ./mpi.h 2025-08-28 19:03:07.192957305 +0300
@@ -490,6 +490,13 @@ enum {
/* C preprocessor constants and Fortran parameters */
/* $CATEGORY:C_PREPROCESSOR_CONSTANTS_FORTRAN_PARAMETERS$ */
+typedef struct MPI_T_enum_t* MPI_T_enum;
+typedef struct MPI_T_cvar_handle_t* MPI_T_cvar_handle;
+typedef struct MPI_T_pvar_handle_t* MPI_T_pvar_handle;
+typedef struct MPI_T_pvar_session_t* MPI_T_pvar_session;
+typedef struct MPI_T_event_registration_t* MPI_T_event_registration;
+typedef struct MPI_T_event_instance_t* MPI_T_event_instance;
+
/* Handles used in the MPI tool information interface */
#define MPI_T_ENUM_NULL ((MPI_T_enum) 0)
#define MPI_T_CVAR_HANDLE_NULL ((MPI_T_cvar_handle) 0)
@@ -558,20 +565,20 @@ enum {
};
/* Source event ordering guarantees in the MPI tool information interface */
-enum {
+typedef enum MPI_T_source_order {
MPI_T_SOURCE_ORDERED = 1,
MPI_T_SOURCE_UNORDERED = 2,
-};
+} MPI_T_source_order;
/*
* Callback safety requirement levels used in the MPI tool information interface
*/
-enum {
+typedef enum MPI_T_cb_safety {
MPI_T_CB_REQUIRE_NONE = 0x00,
MPI_T_CB_REQUIRE_MPI_RESTRICTED = 0x03,
MPI_T_CB_REQUIRE_THREAD_SAFE = 0x0F,
MPI_T_CB_REQUIRE_ASYNC_SIGNAL_SAFE = 0x3F,
-};
+} MPI_T_cb_safety;
/* Callback functions */
@@ -1107,13 +1114,13 @@ int MPI_Ssend_init(const void* buf, int
int MPI_Ssend_init_c(const void* buf, MPI_Count count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request* request);
int MPI_Start(MPI_Request* request);
int MPI_Startall(int count, MPI_Request array_of_requests[]);
-/* int MPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
- */int MPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
-int MPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
-/* int MPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
- *//* int MPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
- *//* int MPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
- */int MPI_Status_get_error(const MPI_Status* status, int* err);
+// /* int MPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
+// */int MPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
+// int MPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
+// /* int MPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
+// *//* int MPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
+// *//* int MPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
+// */int MPI_Status_get_error(const MPI_Status* status, int* err);
int MPI_Status_get_source(const MPI_Status* status, int* source);
int MPI_Status_get_tag(const MPI_Status* status, int* tag);
int MPI_Status_set_cancelled(MPI_Status* status, int flag);
@@ -1799,13 +1806,13 @@ int PMPI_Ssend_init(const void* buf, int
int PMPI_Ssend_init_c(const void* buf, MPI_Count count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request* request);
int PMPI_Start(MPI_Request* request);
int PMPI_Startall(int count, MPI_Request array_of_requests[]);
-/* int PMPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
- */int PMPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
-int PMPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
-/* int PMPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
- *//* int PMPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
- *//* int PMPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
- */int PMPI_Status_get_error(const MPI_Status* status, int* err);
+// /* int PMPI_Status_c2f(const MPI_Status* c_status, MPI_Fint* f_status);
+// */int PMPI_Status_c2f08(const MPI_Status* c_status, MPI_F08_status* f08_status);
+// int PMPI_Status_f082c(const MPI_F08_status* f08_status, MPI_Status* c_status);
+// /* int PMPI_Status_f082f(const MPI_F08_status* f08_status, MPI_Fint* f_status);
+// *//* int PMPI_Status_f2c(const MPI_Fint* f_status, MPI_Status* c_status);
+// *//* int PMPI_Status_f2f08(const MPI_Fint* f_status, MPI_F08_status* f08_status);
+// */int PMPI_Status_get_error(const MPI_Status* status, int* err);
int PMPI_Status_get_source(const MPI_Status* status, int* source);
int PMPI_Status_get_tag(const MPI_Status* status, int* tag);
int PMPI_Status_set_cancelled(MPI_Status* status, int flag);I'm able to compile (using |
|
@hppritcha Actually, take a look at mpi-forum/mpi-abi-stubs#63 |
5377de6 to
04ff2ff
Compare
|
The installed $ mpicc_abi helloworld.c
$ mpiexec -n 1 ./a.out
[optiplex:00000] *** An error occurred in MPI_Init_thread
[optiplex:00000] *** reported by process [3633774593,0]
[optiplex:00000] *** on a NULL communicator
[optiplex:00000] *** MPI_ERR_ARG: invalid argument of some other kind
[optiplex:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[optiplex:00000] *** and MPI will try to terminate your MPI job as well)
--------------------------------------------------------------------------
prterun has exited due to process rank 0 with PID 0 on node optiplex calling
"abort". This may have caused other processes in the application to be
terminated by signals sent by prterun (as reported here).
-------------------------------------------------------------------------- |
|
@hppritcha The following definitions in the generated /* Predefined functions */
#define MPI_COMM_NULL_COPY_FN ((MPI_Comm_copy_attr_function) 0)
#define MPI_COMM_DUP_FN ((MPI_Comm_copy_attr_function) 1)
#define MPI_COMM_NULL_DELETE_FN ((MPI_Comm_delete_attr_function) 0)
#define MPI_WIN_NULL_COPY_FN ((MPI_Win_copy_attr_function) 0)
#define MPI_WIN_DUP_FN ((MPI_Win_copy_attr_function) 1)
#define MPI_WIN_NULL_DELETE_FN ((MPI_Win_delete_attr_function) 0)
#define MPI_TYPE_NULL_COPY_FN ((MPI_Type_copy_attr_function) 0)
#define MPI_TYPE_DUP_FN ((MPI_Type_copy_attr_function) 1)
#define MPI_TYPE_NULL_DELETE_FN ((MPI_Type_delete_attr_function) 0)
#define MPI_CONVERSION_FN_NULL ((MPI_Datarep_conversion_function) 0)
#define MPI_CONVERSION_FN_NULL_C ((MPI_Datarep_conversion_function_c) 0)
/* Deprecated predefined functions */
#define MPI_NULL_COPY_FN ((MPI_Copy_function) 0)
#define MPI_DUP_FN ((MPI_Copy_function) 1)
#define MPI_NULL_DELETE_FN ((MPI_Delete_function) 0) |
|
PS: You may have a similar problems in some |
ompi/mpi/c/abi_converters.h
Outdated
|
|
||
| __opal_attribute_always_inline__ static inline int ompi_convert_abi_ts_level_intern_ts_level(int ts_level) | ||
| { | ||
| if (MPI_THREAD_SINGLE_ABI_INTERNAL == ts_level) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a switch statement be better? It may ultimately be a matter of taste (optimizing compilers may generate similar binary code). Just a mild observation.
|
@hppritcha What's your plan for stuff introduced in MPI 4.1 and MPI 5.0? The generated mpi.h header says This is a list of what I managed to detect as missing from mpi4py's configuration machinery for missing MPI stuff. |
|
Current status regarding mpi4py:
|
|
Thanks for checking this stuff out in its early state @dalcinl. some of the above functions have been implemented but are sitting in various states in PRs. some of these should be defined - like We still need to add some plumbing in the ompi internals for callbacks. that will come in as part of this PR. The NERSC folks would really like ABI working for Doudna so to the extent there's a "plan" it would be nice to get this working sooner than later. |
|
Minor nit: Do you really need/want to install the |
|
| #define MPI_ABI_VERSION 1 | ||
| #define MPI_ABI_SUBVERSION 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set these to -1 if !OMPI_ABI_SRC? Or is this file only used in ABI scenarios?
If this file is only used in ABI scenarios, do we define MPI_ABI_[SUB]VERSION to -1 in some other file that is used in non-ABI scenarios? (I didn't find one, but I could have missed it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, these should not be defined in the non-ABI header.
This is what the MPI standard says, though the wording is not great IMO.
The ABI version macros MPI_ABI_VERSION and MPI_ABI_SUBVERSION are present in
the MPI header and modules so that applications can check for consistency between the
compilation environment and the properties of the implementation at runtime.
# define MPI_ABI_VERSION 1
# define MPI_ABI_SUBVERSION 0The idea is that users can do conditional compilation
#if defined(MPI_ABI_VERSION)
// code that assumes being compiled under the MPI ABI
#endifThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But MPI-5.0 p844:23-24 says:
MPI_ABI_GET_VERSION produces the standard ABI version, if supported. Other-
wise, the values of the major and minor version are set to−1.
Shouldn't the same be true for the compile-time constants MPI_[SUB]VERSION?
I.e., your example conditional compilation should be:
#if MPI_ABI_VERSION > 0
// code that assumes being compiled under the MPI ABI
#endifThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe a more complete conditional compilation could be:
#if defined(MPI_ABI_VERSION) && MPI_ABI_VERSION > 0
// code that assumes being compiled under the MPI ABI
#endifOr:
// If MPI_VERSION is greater than 5, then MPI_ABI_VERSION must exist
#if MPI_VERSION >= 5 && MPI_ABI_VERSION > 0
// code that assumes being compiled under the MPI ABI
#endifThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the same be true for the compile-time constants
MPI_[SUB]VERSION?
Look, I think this was not the original intention. Nonetheless, you have a point, I'm not really opposed to your proposal. The only caveat is that the current MPI standard may not be absolutely clear about it, and most likely we need an errata entry to clarify this matter (whatever the final decision is).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffhammond @hzhou @jsquyres Can we reach an agreement in advance about whether the MPI_ABI_VERSION/SUBVERSION macros should or should not go to the non-ABI mpi.h header?
IIRC, I originally advocated for these macros to NOT be defined in the non-ABI header, but there is API symmetry arguments to have them always defined (with {1,0} or {-1/-1} values for ABI and non-ABI respectively), so maybe we should consider adding them for the non-ABI header. The fix in MPICH and this PR is quite trivial and mostly inconsequential.
Once we reach an agreement between implementations, we can think of publishing an errata to the MPI standard document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
abi.h.in is only used by the code in ompi/mpi/c and ompi/mpi/tool and is also used to generate ompi/mpi/c/standard_abi/mpi.h.
But i could see the use of defining them in the OMPI mpi.h.in as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay i don't think we should add this to the OMPI mpi.h till we discuss this in the forum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay discussion is happening. rather than fill this PR with mpi forum type comments please chime in on the forum issue cited above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i answered on mpi-forum/mpi-issues#1098 so i won't repost them here
I would, except that you keep commenting and throwing fear and uncertainty on what we are trying to accomplish.
Your claim is hard to disproof. You talk in circles, and never specifically state what exact rule are you following to update libtool's C/R/A tuple. At this point I have to suspect you do that to hide the fact that you are indeed breaking the rules.
You keep speaking in abstract terms, not specifying clearly how and what libtool objective I'm not following, nor what ABI's nature mean. I already stated many times: backward-compatible additions to the MPI API/ABI must bump both the current and age numbers of the c/r/a libtool tuple, and therefore shared library SONAME must not change. If this statement is incorrect, please indicate clearly and precisely how it is incorrect with proper references to published standards or documentation.
I cannot confirm that, because you have not yet stated clearly what's the alternative. Again: what are you proposing exactly? Are you arguing that every time the MPI standard add new features (handle types/values, functions) without making backward incompatible changes (like removing functions) then the MPI shared library should bump the SONAME from |
874bbd1 to
18f3b3e
Compare
48fc6b4 to
c888195
Compare
I confirmed your results. I can build MPI4PY using OMPI mpicc_abi, then switch LD_LIBRARY_PATH and PATH to use the MPICH ( at sha dcfab13774) and run both singleton and mpiexec variants of the test suite successfully. |
|
@hppritcha Did I earn a |
|
yes if i can correctly do co-authoring on commits! |
c888195 to
7e50e5d
Compare
|
there you go @dalcinl |
Two external MPI libraries are now created: libmpi.la and libmpi_abi.la. Backend code that was originally in libmpi.la has been extracted into libopen-mpi.la to be linked into both libraries. Parts of the Open MPI C interface are now being generated by a python script (abi.py) from modified source files (named with *.in). This script generates files for both the ompi ABI and the standard ABI from the same source file, also including new bigcount interfaces. To compile standard ABI code, there's a new mpicc_abi compiler wrapper. The standard ABI does not yet include all functions or symbols, so more complicated source files will not compile. ROMIO must be disabled for the code to link, since it's relying on the external MPI interface. Signed-off-by: Jake Tronge <[email protected]>
Implement lots of missing functionality in the original api.py script. The functionality is now part of the bindings generation framework used for Big Count. Number of todos still to do, in particular provide support for wrapping user supplied callback functions. Also, the sendrecv_replace, etc. code needs to be refactored to work with the bindings framework. to heed the --enable-abi-standard config option Additional initial fixes to merging ABI support into big count framework. Co-authored-by: Jake Tronge <[email protected]> Signed-off-by: Howard Pritchard <[email protected]>
This mod switches from using "synthetic" defined values and handles for the ones specified in the MPI 5.1 standard. The python infrastructure included in this PR for generating both a "canonical" abi standard compatible MPI 5.1 mpi.h using two json files: 1) mpi-standard-abi.json 2) mpi-standard-apis.json 2 is generated as part of building the MPI standard. We import that into our project for use in generating both the mpi.h as well as interface definitions in the man pages. 1 is generated using a separate script that processes the tables in Appendix A of standard. Ideally this script will be merged into the MPI standard code base at some point. This script is currently at https://github.com/Joe-Downs/mpi-standard/tree/pr/handle-constant-tool/const-tool . It is used to generate the portion of mpi.h where defined values and handles are specified. The converter functions that had been generated as part of the build out of the abi variants of the 'c' MPI interfaces are no define in persistent file. The methods defined in this file will be optimized to make use of the Huffman code characteristics of the predefined values in a subsequent PR. This commit also enables generation of the abi interfaces and header files by default. Signed-off-by: Joseph Downs <[email protected]>
7e50e5d to
ff738c4
Compare
|
hmm something on with aws ci. |
ff738c4 to
71da9a4
Compare
4110e99 to
9a585f3
Compare
Many fixes and improvements contributed by dalcinl. Add fortran support functions and some other functions missing from initial work. Return to generating converter functions as part of the build out of the abi variants. Signed-off-by: Howard Pritchard <[email protected]> Co-authored-by: Lisandro Dalcin <[email protected]>
irrespective of types of datatyps feed into non-blocking/persistent alltoallw. We'll replace this with something better in a separate PR. Signed-off-by: Howard Pritchard <[email protected]>
d7c885d to
f859c60
Compare
Implement the MPI-5 c ABI functionality.
Two external MPI libraries are now created: libmpi.so and libmpi_abi.so. Backend code that was originally in libmpi.la has been extracted into a libopen-mpi.la to be linked in to both libraries.
Almost all of the Open MPI C interface is now being generated from templates processed by the binding infrastructure.
The binding infrastructure introduced to support big count has been extended to generate code to support the MPI ABI specification. The bindings infrastructure is also used to generate the MPI ABI compliant mpi.h header file, as well as the converter functions to translate between OMPI and MPI ABI constants.
Before attempting to review this PR it would probably be good to become familiar with the MPI-5 ABI chapter and also to peruse the README ((ompi/mpi/README_ABI.md) included in this PR.
A few todos left:
todos for follow-up PRs:
MPI_Abi_set_fortran_infoandMPI_Abi_set_fortran_booleans.This PR supercedes #12033