The Embiggenment ("BigCount") #137
Comments
A pull request with specific text will be forthcoming. |
I see one immediate problem with this proposal. Currently, all MPI routines are explicitly defined as functions. This allows one to take the address of it to create a function pointer, for instance: int (*MY_Send)(const void *, int, MPI_Datatype, int, int, MPI_Comm) = &MPI_Send; A function-like macro does not support this use case, niche as it may be. There may be similar problems with the profiling interface as well, which is likely a greater concern. |
Additionally, there might be some strangeness around integer promotion rules and how they interact with |
Corollary of @omor1's comment: Can the function pointers proposal be composed as a library on top of current MPI definitions? |
@omor1 See the slides I presented this past Thursday at the MPI Forum meeting: https://github.com/mpi-forum/mpi-forum.github.io/blob/master/slides/2019/05/2019-05-30-BigCount-solutions-for-MPI-4.pdf . One direct point from there is that yes, we have to standardize all the underlying symbol names for PMPI reasons. But in short, yes, there are definite tradeoffs to every solution we have looked at. The user demand seems high to allow "big" count values, though. Hence, if the Forum wants to handle "BigCount" in C, tradeoffs will need to be made. Some points not explicitly covered in the slide (but we talked about verbally):
Are function pointers guaranteed to work? Interestingly enough, I just re-read the profiling section of MPI-3.1 and it does not guarantee that the back-end C symbols are actually the same as the names of the API functions (!). It just guarantees that name-shifted versions must be available (e.g., All that being said, FWIW, my testing with function pointers seem to work with C11 #include <mpi.h>
#include <stdio.h>
typedef int (*foo)(const void *buf, int count,
MPI_Datatype datatype,
int dest, int tag, MPI_Comm comm);
int bar(const void *buf, int count,
MPI_Datatype datatype,
int dest, int tag, MPI_Comm comm)
{
printf("I am in bar!\n");
return 0;
}
int main(int argc, char **argv)
{
char buffer[SIZE];
foo new_func = bar;
MPI_Init(NULL, NULL);
printf(">> The following functions should call MPI_Send (with int params)\n");
int i = SIZE;
MPI_Send(buffer, i, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
MPI_Send(buffer, 32, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
new_func(buffer, 32, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
// ... This compiles and runs properly on my MacOS 10.14.5 laptop (i.e., when I call |
Blah -- my original example was flawed. Here's what should be a correct one: #include <mpi.h>
#include <stdio.h>
typedef int (*foo)(const void *buf, int count,
MPI_Datatype datatype,
int dest, int tag, MPI_Comm comm);
int main(int argc, char **argv)
{
char buffer[SIZE];
foo bar = MPI_Send;
MPI_Init(NULL, NULL);
printf("Calling function pointer\n");
bar(buffer, 32, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
// ... The If we mandate the symbols for the |
Right, if you define the int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
int MPI_Send_x(const void *buf, MPI_Count count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
#define MPI_Send(buf, count, datatype, dest, tag, comm) \
_Generic((count), \
default: MPI_Send, \
int: MPI_Send, \
MPI_Count: MPI_Send_x \
)(buf, count, datatype, dest, tag, comm)
int main(int argc, char *argv[])
{
int buffer[500];
int count1 = 500;
MPI_Count count2 = 500;
MPI_Init(NULL, NULL);
// call MPI_Send
MPI_Send(buffer, count1, MPI_INT, 0, 0, MPI_COMM_WORLD);
// call MPI_Send_x
MPI_Send(buffer, count2, MPI_INT, 0, 0, MPI_COMM_WORLD);
int (*My_Send)(const void *, int, MPI_Datatype, int, int, MPI_Comm) = MPI_Send;
int (*MY_Send_x)(const void *, MPI_Count, MPI_Datatype, int, int, MPI_Comm) = MPI_Send_x;
// ...
} |
Can this be solved by a "big count" binding -- e.g. I am having a hard time imagining the need for an application to require access to both versions; therefore, exposing both versions to the user seems just more complicated than necessary. In fact, I think the implementation should just all move to |
To think about it, the function pointer solution is very similar to the binding solution. One is runtime binding, the other is static binding. I personally always preferred static solutions as they are simpler (to the user). |
Jeff... to paraphrase... So it is correct that taking the address of MPI_Send() is not in any way guaranteed; only the symbol PMPI_Send is normative ... the invocation of MPI_Send() is normative ... we have no promise to allow the example shown to work; right?
The invocation can be a macro that then dereferences a function pointer table or even just a macro to another name like You suggest ...
Is it legal for mpicc to preprocess further, or is it not?
|
Please split this into three tickets:
You will find that 3 is much harder than you think at first glance. C11 Please also reconsider your position on displacements. If you do not fix displacements when you fix counts, you will end up with a half-broken, often useless interface. Please read the BigMPI paper if necessary. |
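Also, your desire to not support MPI_Send_x and friends is misguided, because you have to define those symbols for C11 _Generic to work. Furthermore, MPI has support for many languages -- Python, Rust, D, etc. -- and does so via the C89 bindings. If you attempt to hide explicit large-count symbols, you will make large-count support in these language interfaces impossible. |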
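Re @hzhou, if the interpretation is that MPI never supported any of the non-PMPI functions as explicit symbols, then there should technically be few backwards-compatibility issues unless someone explicitly declares int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm); without including mpi.h (somehow). Any int-sized parameters should undergo an implicit integer conversion to MPI_Count. The only backwards-compatibility problem I can see is if someone were to take the address of the symbol to obtain a function pointer, which as discussed above might not technically be valid according to the standard in any case. Since MPI never defined an ABI in the first place (e.g. under an ILP64 system then current MPI allows for 'big counts') and as far as I can tell API breakage would be minimal, I see no reason why it couldn't work. |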
Jeff, I wasn’t at the meeting, but I totally don’t understand why the group decided to exclude displacements ... I would be inclined to vote against a solution that omits displacements. It seems so shortsighted.
I saw this and I am still confused. That seems like a big step backwards after a year, and it never came up before the meeting.
As for the rest of the bindings... let’s let Martin Ruefenacht, Puri Bangalore and Jeff Squyres explain the rationale ... we have all certainly been experimenting more with the C11 generics, and they more than I.
In future, our goal is not, however, to force other languages to just use the C89 bindings and so syntactic sugar ... we are planning to do C++ again, and Python in particular ... but make those first-class language interfaces for C++, Python, etc., so they do not have to just use C89 inside any MPI implementation.
I suppose one can argue that whenever there is a language that doesn’t get its specific language binding added, it still needs the older C bindings??!! That, however, is a new requirement not posed before ... why not use new C++ bindings (coming) or newer C bindings (proposed)?
Regards,
Tony
Anthony Skjellum, PhD
205-807-4968
… On Jun 2, 2019, at 11:34 PM, Jeff Hammond ***@***.***> wrote:
Also, your desire to not support MPI_Send_x and friends is misguided, because you have to define those symbols for C11 _Generic to work. Furthermore, MPI has support for many languages -- Python, Rust, D, etc. -- and does so via the C89 bindings. If you attempt to hide explicit large-count symbols, you will make large-count support in these language interfaces impossible.
|
Hi, I’d argue that programs that don’t include mpi.h are erroneous ... we don’t define behavior of erroneous programs :-)
So that discussion is not a concern to me at least.
All discussion of guaranteed symbols from an MPI interface are indeed moot ; only what the standard says ... there is no guarantee of this behavior ... and we shouldn’t go in that direction now...
You can always wrap anything you want in a function pointer interface ... in C++ we want fancy versions of these, for instance with fast delegates... a story for later with our forthcoming proposal for a language interface for Python and C++ vs. a language binding on top of the C89 bindings.
Actually: Anything that takes us there (requiring specific C symbols) now should be voted against in my opinion ... allowing even that the standard requires MPI to be a library should probably be eliminated instead as a contrariwise choice... MPI is the programmer notation ... it could be a compiled DSL or source to source translator ... for instance ... just have a smarter mpicc preprocessor plus backend library in future. That is not without controversy :-)
But, in that context, the PMPI interface probably gets a new look too to generalize and allow it to be hooked on without forcing all MPI implementations to interpose an abstraction barrier right underneath each programming language’s natural language interface ...
Also: Jeff H’s point about needing a common API in C for broad support of N other language interfaces is not the same as being able to guarantee symbol names. Supporting other languages is good and worthwhile of course ... maybe it’s orthogonal to your points on this thread .. just making sure .
Regards!
Tony
Anthony Skjellum, PhD
205-807-4968
… On Jun 3, 2019, at 12:59 AM, Omri Mor ***@***.***> wrote:
Re @hzhou, if the interpretation that MPI never supported any of the non-PMPI functions as explicit symbols, then there should technically be little backwards compatibility issues unless someone explicitly declares int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm); without including mpi.h (somehow).
Any int-sized parameters should undergo an implicit integer conversion to MPI_Count. The only backwards-compatibility problem I can see is if someone were to take the address of the symbol to obtain a function pointer—which as discussed above might not technically be valid according the standard in any case.
Since MPI never defined an ABI in the first place (e.g. under an ILP64 system then current MPI allows for 'big counts') and as far as I can tell API breakage would be minimal, I see no reason why it couldn't work.
|
A first cut of Froozle MPI, a no-op MPI implementation just to experiment with MPI Forum issue #137 (mpi-forum/mpi-issues#137). This commit includes a first cut of C and C++ APIs; Fortran is still to come. Signed-off-by: Jeff Squyres <[email protected]>
@tonyskjellum wrote:
You'll probably get plenty of critique, but if you want an extra pair of eyes on the C++ binding proposal, I'll be happy to oblige. |
Sure:-)
|
Hi, I’d argue that programs that don’t include mpi.h are erroneous ... we don’t define behavior of erroneous programs :-)
Please cite the text of MPI 3.1 that mandates using the header and disallows using legacy C semantics that assume undeclared functions return int, which is correct for almost all MPI functions.
|
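As far as I can tell, the MPI Standard never specifies a particular version of the C standard, only ever referring to "ISO C", which can variously refer to any version of ISO/IEC 9899, the most recent version (ISO/IEC 9899:2018), or the version originally ratified (ISO/IEC 9899:1990), depending on interpretation. In particular, I believe that assuming that function calls with an undeclared identifier are declared as extern int identifier(); has not been valid since C99, which removed implicit function declarations. Though of course all major compilers will support it as an extension (and warn about it). You'd still have the problem that the various MPI types (e.g. MPI_Comm at the most basic) are undeclared, so I don't know how you'd be able to use them. |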
I have created Froozle MPI, a no-op MPI implementation that shows one possible way that the proposals in this issue could be implemented. The intent for Froozle is to ground the discussions on this issue (and the eventual PR) in some code, and provide a basis for testing, validation, etc. I just released v0.5 (Froozle's first release). Comments welcome. |
>> Hi, I’d argue that programs that don’t include mpi.h are erroneous ... we don’t define behavior of erroneous programs :-)
> Please cite the text of MPI 3.1 that mandates using the header and disallows using legacy C semantics that assume undeclared functions return int, which is correct for almost all MPI functions.
Jeff, We need to add it if it’s not there ... are you arguing here because you know it is done in practice given absence of such a standard document statement ?
For instance, I am not aware of how to compile a hello world MPI program successfully without the header. Can you do that ?
|
$ gcc -std=c89 -c no-header.c && echo SUCCESS
SUCCESS
int main(int argc, char* argv[])
{
int rc;
rc = MPI_Init(&argc,&argv);
rc = MPI_Finalize();
} Obviously, I'd have to cheat if I want to compile with |
@jeffhammond Honestly, that is just a straw-man example. Any attempt to actually use MPI functions without mpi.h would require the user to declare MPI symbols. I agree with @tonyskjellum that doing so, even though the standard doesn't say it explicitly, makes the program erroneous. Example: A user could declare something like: typedef int MPI_Comm;
extern MPI_Comm MPI_COMM_WORLD;
extern int MPI_Barrier (MPI_Comm comm); and it would only work with one of the two primary implementations. This is totally erroneous, and if any user actually does that and we break their code, so what? We shouldn't be wasting our time thinking about that case. If we need to update the text to say including mpi.h is required, then let's do that. |
We should change the standard to make We had a code that declared a user communicator as an |
For those used to untyped languages, using auto is fairly natural - and it gets you one step further. It still feels like a straw-man, though. int main() {
auto myComm;
auto myRank, mySize;
MPI_Init();
MPI_Comm_get_parent(&myComm);
MPI_Comm_rank(myComm, &myRank);
MPI_Comm_size(myComm, &mySize);
if (0==myRank) {
} else {
}
MPI_Finalize();
return 0;
}
|
Yes, exactly re types, so someone please show me how to make a correct, portable MPI program in C without the header :-)
No MPI call that communicates pt2pt or collectively lacks a communicator; so either MPI_COMM_WORLD is defined or you pass in a communicator to it ...
With FORTRAN, using permissive implicit definitions and/or its integer-for-handle API, maybe you can make subprograms... can you make hello world in FORTRAN without naming MPI_COMM_WORLD?
More importantly: leaving off the header should be disallowed in MPI-4 explicitly even if we can figure a way to do it. We should make a ticket to that effect. We will have a supermajority of votes to pass it. I’ll ask Puri about this.
Regards
Tony
Anthony Skjellum, PhD
205-807-4968
… On Jun 4, 2019, at 6:30 AM, Omri Mor ***@***.***> wrote:
Please cite the text of MPI 3.1 that mandates using the header and disallows using legacy C semantics that assume undeclared functions return int, which is correct for almost all MPI functions.
As far as I can tell, the MPI Standard never specifies a particular version of the C standard, only ever referring to "ISO C", which can variously refer to any version of ISO/IEC 9899, the most recent version (ISO/IEC 9899:2018), or the version originally ratified (ISO/IEC 9899:1990), depending on interpretation. In particular, I believe that assuming that function calls with an undeclared identifier are declared as extern int identifier(); is deprecated since C99. Though of course all major compilers will support it as an extension (and warn about it).
You'd still have the problem that the various MPI types (e.g. MPI_Comm at the most basic) are undeclared, so I don't know how you'd be able to use them.
|
The second vote for this at the August, 2020 meeting was postponed to the September, 2020 meeting. |
I am going to start sending bills for anti-anxiety medication to the MPI Forum 😒 |
@jeffhammond in which case, don't ... I repeat DON'T ... look at the Slack channel right now. |
First, https://en.wikipedia.org/wiki/Streisand_effect :-) Second, I looked and am not concerned. It looks like a bunch of smart people are working hard to do straightforward things that need to be done. I did not find any showstopper issues... |
Question: does this force us to introduce an Embiggened operator, MPI_OP_C, to distinguish between MPI_OP that represents a user-defined operator function with INTEGER :: len and MPI_OP_C that represents a user-defined operator function with INTEGER(KIND=MPI_COUNT_KIND) :: len? This would mean that MPI_REDUCE would take MPI_OP whereas MPI_REDUCE_C would take MPI_OP_C. This is not needed by Fortran to distinguish MPI_REDUCE from MPI_REDUCE_C (because of the count parameter, which is TKR different) but does expressing all operators as MPI_OP cause implementation issues deep inside somewhere? I can imagine that MPI_OP might be an integer index into an internal array of operators, i.e. an array of function pointers. However, Fortran is strongly-typed so can such an array be declared correctly so that it can contain function pointers with different signatures? To be clear, I really don’t want to introduce MPI_OP_C (because built-in operators!) but I think we need a proof-of-concept implementation to reassure us that this is not made necessary by Fortran’s strong-typing rules. ** To be discussed at the virtual meeting today ** |
Hi,
I’m not sure whether I have understood the question correctly.
No, it doesn’t force an introduction of MPI_OP_C. MPI knows from the creation routine that was called (MPI_Op_create or MPI_Op_Create_l) how the user-defined C function has to be invoked. The same is true for Fortran user-defined functions if separate Fortran routines MPI_Op_create and MPI_Op_Create_l are available.
Sorry for missing the virtual meeting. But I looked yesterday into the list of virtual meetings and couldn’t find this meeting.
Hubert
|
This failed a no-no vote on 2020-09-28. |
Sept-2020 Meeting detailed list: |
This passed a second vote on 2020-09-30. https://www.mpi-forum.org/meetings/2020/09/votes The forum agreed to resolve all remaining issues as an errata vote at the December 2020 meeting. |
@wgropp To be clear: no PR should be merged in its current state in response to the successful final vote for this issue. This 2nd vote accepted commit 826f32a (https://github.com/mpi-forum/mpi-standard/pull/132/commits/826f32a47a1a15e4ace31609ab268e04bbca45b7), which is in the history of PR 132 (https://github.com/mpi-forum/mpi-standard/pull/132) but is not the HEAD of the branch behind that PR. I also created PR 294 (https://github.com/mpi-forum/mpi-standard/pull/294) to point directly to this voted-for commit, for easier reference. The proposed errata for the December meeting is represented by PR 268 (https://github.com/mpi-forum/mpi-standard/pull/268), which is rebased from a different point in the mpi-4.x history and is difficult to compare directly to PR 132. All changes in PR 132 (including those beyond the voted-for commit) are included in the errata PR 268 along with additional fixes that were the subject of the no-no vote that failed in the Sept 2020 meeting and will be the subject of the errata proposal in the Dec 2020 meeting. The plan is to merge these voted-for changes and the to-be-voted-for errata atomically by merging PR 268 once the errata vote has also passed. The intent is to iterate on the errata until it passes (or until the Forum decides to reverse the final vote for this issue). Recent comments and merge conflicts are still TBD and will be addressed on PR 268. |
@dholmes-epcc-ed-ac-uk thanks for the clear statement - I will leave this one, and will work with @wesbland as discussed at the Forum meeting. |
The Embiggening(tm) is merged into mpi-4-rc! https://files.slack.com/files-pri/TM2LH9685-F01BWJYMRE0/done.gif |
Problem
Modern applications want to pass count arguments larger than 2^31-1 (the largest value of a 32-bit signed int) to functions across the MPI Standard.
Large counts are possible, but users do not like constructing them with MPI datatypes. (#80)
Proposal
The high-level intent of this proposal is to allow users to call functions such as MPI_SEND with either "small" counts (e.g., int in C/C++ and INTEGER in Fortran) or "large" counts (e.g., MPI_Count in C/C++ and INTEGER(KIND=MPI_COUNT_KIND) in Fortran) -- regardless of whether the count parameter is IN, INOUT, or OUT -- and have MPI "do the right thing".
It will specifically be allowed that MPI_Count is actually int (e.g., for implementations or environments that do not support large counts as prescribed in this proposal).
Specifically: we do not want to expose alternate symbol names (e.g., MPI_SEND_X) to applications -- they should continue to use names such as MPI_SEND, and the language/MPI implementation will select the right back-end implementation depending on the type of the count parameter(s).
More detail
Providing both int and MPI_Count parameters for all relevant MPI functions will use different mechanisms in C, C++, and Fortran.
C: ... _Generic
C++: ... _Generic mechanism because the C++ language does not support _Generic
Fortran: ... mpi_f08 module to include MPI_Count functionality. Applications using mpif.h or the mpi modules will need to update to (perhaps piecemeal) use the mpi_f08 module to get access to MPI_Count-enabled functions.
Note that MPI-3.x supports C++ only indirectly: the C bindings are carefully crafted such that they work in both C and C++. This proposal will intentionally add a C++-specific mechanism (function overloading). Other than that, there will be a 1:1 global function set compared to the C bindings.
Note: we are electing to add the mpi.hpp header to the standard for use in C++ MPI applications. This is in recognition that C and C++ are continuing to diverge as languages, and we may need further separation between the C and C++ interfaces someday.
Also note that this proposal does not change the displacement parameters in v or w functions -- specifically because displacements and counts are separate concepts.
Changes to the Text
This will cause extensive changes across much of the existing standard. Conceptually:
... MPI_Count types.
... MPI_SEND now have two versions.
Impact on Implementations
Implementations will have to provide the mpi.hpp header for C++ usage. Additional glue code will be required to support both MPI_Count and int interface level functions.
Tools will be required to support the new MPI symbols introduced by this proposal.
Impact on Users
The impact on users is kept to the minimum:
... mpi.hpp instead of mpi.h
... MPI_Send)
... MPI_Send). However, if C++ applications recompile with a _Generic-enabled mpi.h, they will need to update their source code to include mpi.hpp, and then they will be switched to use C++ symbols (i.e., a C++-compiler-munged version of MPI_Send)
... mpi_f08 module).
... MPI_Count for count parameters, but they will need to update their source code.
References
Other tickets referring to this issue in some way or another: #80, #97, #98, #99, #100, #105, #108
Pull request: https://github.com/mpi-forum/mpi-standard/pull/268
(old pull request: https://github.com/mpi-forum/mpi-standard/pull/132 )
PDF file that passed first vote: mpi40-report-826f32a-ticket137-14jun20.pdf
The latest PDF file: mpi40-report-ticket137-cfed9a5-14Sep2020.pdf
PDF file with changes since first vote highlighted: mpi40-report-ticket137-cfed9a5-14Sep2020-highlighted.pdf
Please also see Froozle MPI, a no-op MPI implementation that shows one possible way that the proposals in this issue could be implemented.