deprecate or fix MPI_COMM_JOIN #13

Open
jeffhammond opened this issue Nov 9, 2015 · 8 comments
Labels
mpi-5 For inclusion in the MPI 5.0 standard wg-collectives Collectives Working Group

Comments

@jeffhammond
Member

This was https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/301. Since there is no obvious working group for this, I am putting it here.

Background

At the October 2011 meeting (10/26/2011), Bill said that interaction with an explicit externally specified environment like POSIX was totally inappropriate for the MPI standard. On this basis, and the lack of demonstrated necessity for such a routine, we should deprecate MPI_COMM_JOIN.

MPI_COMM_JOIN explicitly refers to the externally specified protocol known as Berkeley Sockets. It is not compatible with other implementations of sockets due to the choice of type for the socket file descriptor (integer); for example, MPI_COMM_JOIN is incompatible with Windows Sockets (source: Fab Tillier). I do not see how a *nix-specific routine is any more appropriate for the MPI standard than a POSIX-oriented MPI_FILE_STAT one.

Here is the relevant excerpt from http://www.lam-mpi.org/MailArchives/lam/2001/09/3315.php as to why this routine is probably superfluous (see paragraph 2):

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2001-09-21 08:24:50

> I am trying to understand the command MPI_Comm_join.

This is a fairly specialized function that was added to MPI-2 for whacky 
configurations that the MPI Forum couldn't predict. It is intended to 
take a file descriptor as an argument that represents another MPI process 
(where that file descriptor can be a pipe or a socket or some other IPC 
mechanism), and create a communicator between the two processes. You will 
need to create this file descriptor yourself -- it is intended to *only* 
be used by the MPI_Comm_join call. 

What do you need to use MPI_Comm_join for? You may wish to explore 
MPI_Comm_connect and MPI_Comm_accept instead -- they may provide an easier 
way to connect to previously unrelated MPI programs since there's no need 
for anything outside of the scope of MPI (i.e., a file descriptor). 
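
To make "create this file descriptor yourself" concrete, here is a minimal C sketch, not taken from the ticket or from any MPI implementation. It creates a connected stream-socket pair with POSIX `socketpair` and checks that bytes pass between the two ends. The helper names `make_stream_pair`, `pair_round_trip`, and `join_over_fd`, and the `USE_MPI` build flag, are hypothetical. Both ends live in one process here purely for demonstration; in real use each end would belong to a different MPI process, and the socket must be idle when `MPI_Comm_join` is called.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifdef USE_MPI   /* hypothetical flag: build with an MPI compiler wrapper and -DUSE_MPI */
#include <mpi.h>

/* Hand an already-connected stream socket to MPI. Both peers must make
 * this call, each on its own end of the connection. */
static int join_over_fd(int fd, MPI_Comm *intercomm)
{
    return MPI_Comm_join(fd, intercomm);
}
#endif

/* Create a connected pair of stream sockets; returns 0 on success. */
static int make_stream_pair(int sv[2])
{
    return socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
}

/* Write a short message on one end and read it back on the other.
 * Returns 1 if the bytes arrive intact, 0 otherwise. */
static int pair_round_trip(void)
{
    int sv[2];
    const char msg[] = "ping";
    char buf[sizeof msg] = {0};
    int ok = 0;

    if (make_stream_pair(sv) != 0)
        return 0;
    if (write(sv[0], msg, sizeof msg) == (ssize_t)sizeof msg &&
        read(sv[1], buf, sizeof buf) == (ssize_t)sizeof msg)
        ok = (memcmp(msg, buf, sizeof msg) == 0);
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Note that the descriptor is created entirely outside MPI, which is the dependency the email suggests avoiding by using MPI_Comm_connect and MPI_Comm_accept instead.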

Proposal

We propose to redefine the current interface so that it does not break backwards compatibility on platforms that currently support this function, while enabling platforms that currently cannot support it to do so in the future. Backwards compatibility on currently supporting platforms requires MPI_Socket to be defined as an integer file descriptor. On systems such as Windows that currently do not support this function, MPI_Socket can be defined to be the appropriate object without introducing a regression.

MPI_COMM_JOIN(fd, intercomm) 
   IN     fd                    socket file descriptor
   OUT    intercomm             new intercommunicator (handle)

int MPI_Comm_join(MPI_Socket fd, MPI_Comm *intercomm)

MPI_Comm_join(fd, intercomm, ierror) BIND(C)
    TYPE(MPI_Socket), INTENT(IN) ::  fd
    TYPE(MPI_Comm), INTENT(OUT) ::  intercomm
    INTEGER, OPTIONAL, INTENT(OUT) ::  ierror

MPI_COMM_JOIN(FD, INTERCOMM, IERROR)
    INTEGER FD, INTERCOMM, IERROR

Advice to implementers: In order to preserve backwards compatibility, 
MPI_Socket should be an integer file descriptor on platforms that currently 
support this function.  On platforms that do not support this function, there 
is no backwards compatibility issue, and MPI_Socket can be defined to be anything.
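
As a rough illustration of the advice above, an implementation's header might define the type along these lines. This is a sketch under assumptions, not text from the proposal or from any MPI implementation: the use of Winsock's `SOCKET` on Windows is one plausible choice of "appropriate object", and the helper `mpi_socket_is_signed_int` is hypothetical, included only to show what compatibility with existing `int fd` callers would mean.

```c
/* Hypothetical header fragment: how MPI_Socket might be defined per platform. */
#if defined(_WIN32)
#include <winsock2.h>
typedef SOCKET MPI_Socket;   /* Winsock handle: an unsigned integer type */
#else
typedef int MPI_Socket;      /* POSIX file descriptor: existing callers are unaffected */
#endif

/* Returns 1 when MPI_Socket has the size and signedness of int, i.e. when
 * code that already passes an int descriptor needs no changes. */
static int mpi_socket_is_signed_int(void)
{
    return sizeof(MPI_Socket) == sizeof(int) && (MPI_Socket)-1 < 0;
}
```

On a POSIX platform the helper returns 1, so the C binding `int MPI_Comm_join(MPI_Socket fd, MPI_Comm *intercomm)` stays source-compatible with the existing one.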
@schulzm

schulzm commented Jun 13, 2018

June 2018 Meeting in Austin: forum decided to take this up again, for reading in BCN

@jeffhammond jeffhammond changed the title deprecate MPI_COMM_JOIN deprecate or fix MPI_COMM_JOIN Jun 13, 2018
@tonyskjellum tonyskjellum added scheduled reading Reading is scheduled for the next meeting and removed not ready labels Jun 14, 2018
@tonyskjellum

We've decided to review, update, and read this ticket at the Barcelona meeting.

@jeffhammond
Member Author

@tonyskjellum In favor of deprecation or fixing? I think fixing is the better path, since the only implementation that needs to do anything besides typedef int MPI_Socket is MS-MPI.

@tonyskjellum

Hi, I'd like to get this for reading in December...

@schulzm : what is needed for a deprecation request and reading?

@dholmes-epcc-ed-ac-uk
Member

Thought (not fully formed): is MPI_Socket actually a port? In the sense of MPI_OPEN_PORT. That would harmonise connect/accept with join. It also suggests we should discover and clearly state what the differences are between these semantics, if any.

The implementation could still use a socket internally but just present a string name to the user.

The MPI_OPEN_PORT function takes an INFO object, so the user could hint that they would like the port to be backed by a socket, and even provide an IP address for the intended target. OTOH, the user could use the INFO to assert that they will use the port in MPI_COMM_JOIN_X(port, intercomm). MPI can respond to that by failing to provide a port if join is not supported, or by creating a socket (if that is needed by its chosen implementation of join), or by preparing any other communication mechanism (if it can support join in a different/better way than via a socket).

Benefits: remove direct reliance on Berkeley Sockets from the MPI Standard; harmonise the API for connect/accept and join; permit different/better implementations of join.

Problems: more work for implementors (many of whom are not convinced of the need for join in the first place); does not fix the existing API (creates a new API to avoid breaking all the non-existent things that use the current join API).

Difference to connect/accept: join implicitly uses MPI_COMM_SELF as the local group, whereas connect/accept use the communicator provided as an argument.

@tonyskjellum

tonyskjellum commented Sep 26, 2018 via email

@tonyskjellum

tonyskjellum commented Sep 28, 2018 via email

@hppritcha

This topic was again discussed briefly in the 9/28/20 (v) MPI forum while reading mpi-forum/mpi-standard#269

@wesbland wesbland added mpi-5 For inclusion in the MPI 5.0 standard and removed mpi <next> labels Jul 21, 2021
@wesbland wesbland removed the scheduled reading Reading is scheduled for the next meeting label Nov 6, 2023
Projects
Status: To Do