add large-count datatype constructors, etc. (Trac #468) #1

Open
jeffhammond opened this issue Oct 22, 2015 · 0 comments

History

Working group feedback on Trac #423 motivates a new ticket to make large-count equivalents of "all" the datatype constructors, etc.

Overview

Since we don't support counts larger than INT_MAX, at least in the C interface, it would be nice to make it easy for users to work around this. And, lest there be any doubt, users are hitting this issue in real applications, e.g. NWChem.

It was originally proposed to only add a large-count contiguous type, but that is unappealing on many counts, because it breaks an obvious symmetry and looks like a bandage.

Usage

For example:

#ifdef NO_LARGE_COUNT_SUPPORT
    MPI_Bcast(stuff, not_large_count /* int */, MPI_BYTE, 0, MPI_COMM_WORLD);
#else // LARGE_COUNT_SUPPORT
    MPI_Datatype large_type;
    MPI_Type_contiguous_x(large_count /* MPI_Count */, MPI_BYTE, &large_type);
    MPI_Type_commit(&large_type);
    MPI_Bcast(stuff, 1, large_type, 0, MPI_COMM_WORLD);
    MPI_Type_free(&large_type);
#endif

I argue that most users would find this type of modification to their code relatively non-intrusive and a reasonable workaround for the large-count case.

Motivation

Unfortunately, using the existing MPI functions, constructing a contiguous datatype for large counts is not very simple. Below is what I believe is the most efficient implementation of a general, contiguous large-count datatype. It requires six MPI function calls to do something that is trivial. I'm not arguing that function call overhead is a performance issue compared to shipping more than INT_MAX elements; rather, it is a productivity overhead to ask users to duplicate the following effort to be fully general and avoid overflow hazards.

#include <limits.h>
#include <mpi.h>
/*
 * Synopsis
 *
 * int MPIX_Type_contiguous_x(MPI_Count count,
 *                            MPI_Datatype   oldtype,
 *                            MPI_Datatype * newtype)
 *
 *  Input Parameters
 *
 *   count             replication count (nonnegative integer)
 *   oldtype           old datatype (handle)
 *
 * Output Parameter
 *
 *   newtype           new datatype (handle)
 *
 */
int MPIX_Type_contiguous_x(MPI_Count count, 
                           MPI_Datatype oldtype, 
                           MPI_Datatype * newtype)
{
    MPI_Count c = count/INT_MAX;
    MPI_Count r = count%INT_MAX;

    MPI_Datatype chunks, remainder;
    MPI_Type_vector(c, INT_MAX, INT_MAX, oldtype, &chunks);
    MPI_Type_contiguous(r, oldtype, &remainder);

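    /* Use the extent, not the size: the vector stride above is measured in
       extents of oldtype, and the two differ for types with padding. */
    MPI_Aint lb /* unused */, extent;
    MPI_Type_get_extent(oldtype, &lb, &extent);

    MPI_Aint remdisp          = (MPI_Aint)c*INT_MAX*extent; /* must explicit-cast to avoid overflow */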
    int blocklengths[2]       = {1,1};
    MPI_Aint displacements[2] = {0,remdisp};
    MPI_Datatype types[2]     = {chunks,remainder};
    MPI_Type_create_struct(2, blocklengths, displacements, types, newtype);

    MPI_Type_free(&chunks);
    MPI_Type_free(&remainder);

    return MPI_SUCCESS;
}
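
The arithmetic behind the function above can be checked without MPI. The following is a minimal sketch, assuming a 32-bit int and using a hypothetical count_t (an int64_t) as a stand-in for MPI_Count; split_count and remainder_displacement are illustrative names, not part of MPI or of the proposal.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Count so the sketch builds without MPI. */
typedef int64_t count_t;

/* Split count into c full chunks of INT_MAX elements plus r leftover
 * elements, so that count == c * INT_MAX + r and 0 <= r < INT_MAX. */
static void split_count(count_t count, count_t *c, count_t *r)
{
    *c = count / INT_MAX;
    *r = count % INT_MAX;
}

/* Byte displacement of the remainder block. The first operand is widened
 * to 64 bits before multiplying, so the product is never computed in int. */
static int64_t remainder_displacement(count_t c, int extent)
{
    return (int64_t)c * INT_MAX * extent;
}
```

For instance, calling split_count with 5000000000 elements yields two full chunks plus a remainder below INT_MAX, and the remainder displacement for an 8-byte element is already far beyond what an int can hold, which is why the cast matters.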

See type_contiguous_x.c for the implementation that avoids the use of MPI_Type_create_struct whenever an appropriate factorization can be found.
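
To illustrate what "an appropriate factorization" might mean here, the sketch below searches for a pair a, b with count == a * b and both factors no larger than INT_MAX. It is an assumption about the general idea, not the code in type_contiguous_x.c; factorize_count and its max_tries cap are hypothetical. When such a pair exists, one MPI_Type_contiguous of a elements nested inside another of b copies would cover the full count without a struct type.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: try to write count as a * b with 1 <= a, b <= INT_MAX.
 * Returns 1 on success and 0 if no pair is found among max_tries candidates.
 */
static int factorize_count(int64_t count, int max_tries, int *a, int *b)
{
    if (count <= 0) return 0;
    if (count <= INT_MAX) { *a = (int)count; *b = 1; return 1; }

    /* Smallest b for which count / b still fits in an int. */
    int64_t lo = count / INT_MAX + (count % INT_MAX != 0);

    for (int64_t cand = lo; cand <= INT_MAX && cand < lo + max_tries; cand++) {
        if (count % cand == 0) {
            *a = (int)(count / cand);   /* <= INT_MAX because cand >= lo */
            *b = (int)cand;
            return 1;
        }
    }
    return 0;   /* e.g. count is prime or has no divisor in the searched range */
}
```

The search is capped because a count with no small divisor (a large prime, for example) would otherwise force a long scan; in that case a caller would fall back to the chunks-plus-remainder construction shown earlier.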

I propose to add the equivalent of the aforementioned function to the standard so that it can be implemented properly and save users the pain of having to use the preceding type of approach in their codes. While it can certainly be argued that users can just figure out how to solve the problem on their own, I don't believe it is appropriate to put this burden on the user.

Implementation status

MPICH

An MPICH implementation is complete except for testing. It was essentially trivial except for adding all of the interface magic.

Here is my prototype implementation in MPICH. The essential commits to inspect are

Additional references

BigMPI is my project to support large-count interfaces for all of the MPI communication functions for which they make sense.

Query Functions

In order to query the datatype created by e.g. MPI_TYPE_CONTIGUOUS_X, we need the following functions.

int MPI_Type_get_envelope_x(MPI_Datatype datatype, int *num_integers, int *num_counts,
             int *num_addresses, int *num_datatypes, int *combiner);

int MPI_Type_get_contents_x(MPI_Datatype datatype, int max_integers, int max_counts,
             int max_addresses, int max_datatypes, int array_of_integers[],
             MPI_Count array_of_counts[], MPI_Aint array_of_addresses[],
             MPI_Datatype array_of_datatypes[]);

The combiners to be added follow the naming structure MPI_COMBINER_CONTIGUOUS_X for MPI_COMBINER_CONTIGUOUS and so forth.

@jeffhammond jeffhammond changed the title add large-count datatype constructors, etc. (was Trac ticket 468) add large-count datatype constructors, etc. (Trac #468) Dec 3, 2015