Big MPI---point-to-point considerations (MPI_Rank only) #97
Comments
I vigorously object to this. Furthermore, you need to propose new versions of a large number of functions that have nothing to do with large-count support. Below is a partial list.
Please make the MPI_Rank part a separate ticket. If somebody builds a system that needs more than 2147483648 ranks, it is not unreasonable to expect them to move to ILP64, such that int is already 64 bits wide.
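To illustrate the ILP64 point: in an ILP64 data model, int is already 64 bits wide, so the existing int-typed rank arguments could hold values above 2^31 without any new API. A minimal sketch, assuming an ILP64 toolchain and an MPI library built with the same data model (which is the non-standard part):

```c
/* Illustrative only: under ILP64, int is 64 bits, so the existing
 * rank-typed arguments already cover more than 2^31 ranks.
 * Assumes an ILP64 toolchain and an MPI library compiled the same way. */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    /* Under ILP64, INT_MAX is 2^63 - 1 rather than 2^31 - 1. */
    static_assert(sizeof(int) * CHAR_BIT == 64, "expects an ILP64 data model");

    MPI_Init(&argc, &argv);
    int rank, size;                        /* 64-bit ints on ILP64 */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* could exceed 2^31 here */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
```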
Honestly, you really need to consider how badly you want to break every piece of MPI software in the world today, and whether the nonsensical possibility of a machine that effectively supports more than 2147483648 ranks is worth it.

MPICH:

```c
/* The order of these elements must match that in mpif.h, mpi_f08_types.f90,
   and mpi_c_interface_types.f90 */
typedef struct MPI_Status {
    int count_lo;
    int count_hi_and_cancelled;
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
} MPI_Status;
```

Open MPI:

```c
struct ompi_status_public_t {
    /* These fields are publicly defined in the MPI specification.
       User applications may freely read from these fields. */
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    /* The following two fields are internal to the Open MPI
       implementation and should not be accessed by MPI applications.
       They are subject to change at any time. These are not the
       droids you're looking for. */
    int _cancelled;
    size_t _ucount;
};
typedef struct ompi_status_public_t ompi_status_public_t;
```
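For context on why these layouts matter: MPI_SOURCE is a plain int that applications read directly out of MPI_Status, so widening the rank type changes both the public struct layouts above and the meaning of existing source code. A minimal sketch of the kind of user code that would be affected (standard MPI-3 calls only; the helper name is made up):

```c
/* Sketch of typical user code that reads the rank straight out of
 * MPI_Status; MPI_SOURCE is an int in both layouts above, so ranks
 * at or beyond 2^31 cannot be represented in this field. */
#include <stdio.h>
#include <mpi.h>

static void receive_any(void *buf, int count, MPI_Datatype type, MPI_Comm comm)
{
    MPI_Status status;
    MPI_Recv(buf, count, type, MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &status);

    int sender = status.MPI_SOURCE;   /* int today; breaks for ranks >= 2^31 */
    printf("message from rank %d with tag %d\n", sender, status.MPI_TAG);
}
```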
You will also need to revise the matching rules for how this works when users send using one version of the API and receive using the other.
Jeff, we will consider that.

Process: Martin specifically asked that we consider the MPI_Rank option. We will split the vote in a way that allows the Forum to accept or reject this change for MPI-4. It was pointed out that endpoints, GPU-like devices, and fine-grained accelerators could yield more than 2^31 ranks in a communicator.

Technical: The problem with heterogeneous use of the APIs will have to be fully considered, as you say. It seems that, to allow this, protocols will carry an extra 32 bits of rank space. (Not my favorite answer; that's my straw answer. It means a tax on current performance.)
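To make the "extra 32 bits of rank space" concrete, here is a purely hypothetical sketch of a point-to-point match header before and after widening the rank field; the field names and layout are invented for illustration and do not reflect any real implementation's wire format. The wider header would be paid on every message, which is the performance tax mentioned above.

```c
/* Hypothetical match headers, purely to illustrate the per-message cost
 * of carrying 64-bit ranks; real implementations lay this out differently. */
#include <stdint.h>

typedef struct {            /* today: 32-bit source rank */
    int32_t  source;
    int32_t  tag;
    uint32_t context_id;    /* communicator id */
    uint32_t payload_len;
} match_header_32;          /* 16 bytes */

typedef struct {            /* with a wide rank: 64-bit source rank */
    int64_t  source;
    int32_t  tag;
    uint32_t context_id;
    uint64_t payload_len;   /* widened alongside, keeping 8-byte alignment */
} match_header_64;          /* 24 bytes: carried on every message */
```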
How about if I split this ticket now: a) MPI_Count + miscellaneous; b) MPI_Rank?
There are several constants that can be passed through rank arguments -- e.g. MPI_ANY_SOURCE and MPI_PROC_NULL.
I maintain my view that we should just break backwards compatibility in MPI-4.0. Yes, this will require a period of time where MPI implementors have an MPI-3.x release and an MPI-4.x+ release, but it would be worth it to avoid having _x and _with_info versions all over the place.
@hjelmn Note that this will break not only ABI compatibility but also user code that passes int arrays of ranks (for example to MPI_Group_incl). As out-of-bounds array accesses are undefined behavior in C, your proposal not only breaks applications in practice but also causes them to violate the base language in which they are written. The only reasonable thing to do here is to expect ILP64 support if more than 2^31 ranks are required.
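To spell out the undefined-behavior concern: group constructors such as MPI_Group_incl take arrays of int ranks today. If the library were redefined to read 64-bit rank elements from those arrays, conforming MPI-3 callers like the sketch below (standard calls only; the helper name is made up) would cause reads past the end of their buffers.

```c
/* Existing, conforming user code: rank lists are arrays of int.
 * If MPI_Group_incl were redefined to take an array of a 64-bit rank
 * type, this call would make the library read 8*n bytes from a
 * 4*n-byte array -- an out-of-bounds access, i.e. UB in C. */
#include <stdlib.h>
#include <mpi.h>

static void make_even_group(MPI_Comm comm, MPI_Group *even_group)
{
    MPI_Group world_group;
    int size;
    MPI_Comm_size(comm, &size);
    MPI_Comm_group(comm, &world_group);

    int n = (size + 1) / 2;
    int *ranks = malloc((size_t)n * sizeof *ranks);  /* int, per MPI-3 */
    for (int i = 0; i < n; i++)
        ranks[i] = 2 * i;                            /* ranks 0, 2, 4, ... */

    MPI_Group_incl(world_group, n, ranks, even_group);
    free(ranks);
    MPI_Group_free(&world_group);
}
```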
Any tickets proposing to add …
MPI does not define ABI compatibility, so I am not concerned about that.
Breaking user code is the bigger issue. Please describe how you’ll address that.
The way I see it, if we break the API, users will have to modify their code for MPI-4.0. All the changes will be simple to make but will take some work. That's why I imagine a high-quality implementation will provide an MPI-3.x layer during some transition period.
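A minimal sketch of what such a transition layer could look like, assuming invented names MPI_Rank, MPIX_Send_r, and MPI_Send_compat (none of them exist in any standard or implementation): the legacy int-rank entry point simply forwards to a wide-rank call.

```c
/* Hypothetical transition shim: keep an MPI-3.x-style int-rank entry
 * point alive as a thin wrapper over an imagined wide-rank API. The
 * names MPI_Rank, MPIX_Send_r and MPI_Send_compat are invented here. */
#include <stdint.h>
#include <mpi.h>

typedef int64_t MPI_Rank;   /* hypothetical wide rank type */

/* Imagined MPI-4-style send taking a wide destination rank. */
int MPIX_Send_r(const void *buf, int count, MPI_Datatype datatype,
                MPI_Rank dest, int tag, MPI_Comm comm);

/* Legacy-style signature, preserved during the transition period. */
int MPI_Send_compat(const void *buf, int count, MPI_Datatype datatype,
                    int dest, int tag, MPI_Comm comm)
{
    /* int always fits in the wider rank type, so forwarding is lossless. */
    return MPIX_Send_r(buf, count, datatype, (MPI_Rank)dest, tag, comm);
}
```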
So you want implementations to support two complete MPI APIs in parallel? You think maintaining passive target is a burden but the int and MPI_Rank APIs aren’t?
Not at all. My preference is that we continue to support an older version for a little longer than usual and drop the MPI-3.x API in new releases. This happens all the time in the software world. MPI's API has issues. We should fix it the right way now and be done with it. None of this _foo nonsense.
2^31+1 ranks is a stupid reason to break MPI. This is a hill on which I am prepared to die.
I am not even remotely saying we should break the API for ranks. I am just saying that if we break it because of info, counts, etc., we might as well change ranks as well.
I am closing this ticket for now; it is highly controversial and will distract from the rest of the Big MPI activities.
Problem
For 64-bit-clean functionality, convenience, and symmetry, the Big MPI principles being applied in Ticket #80 to collective operations should be applied to MPI more widely. In this case, we consider the possibility that you might want more than 2^31 MPI ranks, hence the need for a new data type, MPI_Rank.
Proposal
MPI needs to be 64-bit clean throughout.
Changes to the Text
MPI_Rank will replace int for ranks, supporting more than 2^31 MPI processes in a communicator.
A separate ticket considers MPI_Count and miscellaneous concerns for point-to-point.
Impact on Implementations
No current API is impacted. New _X versions of all affected point-to-point operations will be needed.
MPI implementations will have to be 64-bit clean internally, since count*extent > 2^31 is already problematic for some implementations. New APIs will have to be added, and the internals of MPI will have to be 64-bit capable for buffers and related issues.
Impact on Users
Users who opt in to the new API will be able to use communicators with more than 2^31 processes. [MPI_Rank]
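A sketch of what opting in might look like from the user side, with MPI_Rank, MPI_Comm_rank_x, MPI_Comm_size_x, and MPI_Send_x as placeholder names for whatever the new _X bindings end up being called; nothing below is defined by this ticket's text.

```c
/* Hypothetical opt-in usage on a communicator larger than 2^31 processes.
 * MPI_Rank, MPI_Comm_rank_x, MPI_Comm_size_x and MPI_Send_x are placeholder
 * names for proposed wide-rank bindings, not existing MPI symbols. */
#include <stdint.h>
#include <mpi.h>

typedef int64_t MPI_Rank;                 /* placeholder wide rank type */

int MPI_Comm_rank_x(MPI_Comm comm, MPI_Rank *rank);
int MPI_Comm_size_x(MPI_Comm comm, MPI_Rank *size);
int MPI_Send_x(const void *buf, int count, MPI_Datatype datatype,
               MPI_Rank dest, int tag, MPI_Comm comm);

void ring_send(MPI_Comm comm, const double *buf, int count)
{
    MPI_Rank rank, size;
    MPI_Comm_rank_x(comm, &rank);
    MPI_Comm_size_x(comm, &size);         /* may exceed 2^31 */

    MPI_Rank dest = (rank + 1) % size;    /* 64-bit arithmetic, no overflow */
    MPI_Send_x(buf, count, MPI_DOUBLE, dest, 0, comm);
}
```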
References
See also Tickets #80, #98, #99, and #100.