coll: internally use MPI_Aint for count parameters #5044
1465ff5 to 5c7c6be
test:mpich/ch3/most
@raffenet I am going to make the counts array argument (e.g. MPI_Gatherv) internally to use
Let's review the arrays separately. This one LGTM.
As I continued to work on the counts arrays, I realized that I had missed a ton of collective functions inside ch4, since we have entire sets of coll netmod/shmmod APIs as well. So this PR is incomplete even for just the scalar counts, and it reveals that without a strict compiler check (which we can't use because main does not build cleanly) there is no easy way to tell whether the PR is complete. Let me continue and do the largification of the counts arrays as well. That will help me identify all the places. Then if we decide to only do scalar counts, we can easily just drop the counts array commit.
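To illustrate why a missed function is hard to spot, here is a minimal sketch. It is not MPICH code: `aint_t` is a hypothetical stand-in for `MPI_Aint` so the snippet builds without MPI, and `legacy_bytes`, `large_bytes`, and `fits_in_int` are made-up names. A caller holding a large count can pass it to the unconverted `int` routine without any diagnostic by default; GCC and Clang only report the implicit narrowing when something like `-Wconversion` is enabled.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Aint so the sketch builds without MPI. */
typedef int64_t aint_t;

/* A leaf routine that was missed in the conversion: count is still int.
 * Passing an aint_t here compiles silently unless narrowing warnings are on. */
int64_t legacy_bytes(int count, int64_t elem_size)
{
    return (int64_t) count * elem_size;
}

/* The converted routine: the count keeps its full width end to end. */
int64_t large_bytes(aint_t count, int64_t elem_size)
{
    assert(count >= 0 && elem_size >= 0);
    return count * elem_size;
}

/* Returns 1 if the count would survive a trip through an int parameter. */
int fits_in_int(aint_t count)
{
    return count >= INT_MIN && count <= INT_MAX;
}
```

A count of `(aint_t) INT_MAX + 1` is handled correctly by `large_bytes` but fails `fits_in_int`, so any remaining `int` parameter on the call path would change its value.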
df18568 to 639cce8
test:mpich/ch3/most
a7fce7e to d5d8a68
test:mpich/ch3/most
d5d8a68 to acf0f86
test:mpich/ch3/most
acf0f86 to 3b1ac74
test:mpich/ch4/most
test:mpich/ch3/most
Still prefer evaluating the v-collective changes separately. The rest of the PR looks pretty good.
src/mpid/ch4/netmod/ucx/ucx_win.c
@@ -68,9 +68,14 @@ static int win_allgather(MPIR_Win * win, size_t length, uint32_t disp_unit, void
MPIDI_UCX_CHK_STATUS(status);
}

MPI_Datatype aint_type;
I think we can just use MPI_AINT.
Aside from that, I don't really understand what changed here. rkey_sizes is the data. It shouldn't have anything to do with large count interfaces.
Oh, I see it now. It is for the followup collective using Allgatherv. This makes sense now, but we should still just use MPI_AINT directly.
And I suppose all this is tied to the v-collective promotion, since that is the commit that actually promotes the rkey_sizes type.
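For readers following along, the connection is that the gathered sizes are then used as the per-rank counts of the followup Allgatherv, so promoting the counts array promotes their type too. A rough sketch of that counts-to-displacements step, assuming a hypothetical `aint_t` stand-in for `MPI_Aint` and a made-up helper name `compute_displs` (neither is taken from the MPICH source):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Aint so the sketch builds without MPI. */
typedef int64_t aint_t;

/* Given each rank's size, fill displs[i] with the offset of rank i's block
 * in the gathered buffer and return the total size of that buffer. */
aint_t compute_displs(const aint_t *sizes, aint_t *displs, int nranks)
{
    assert(nranks >= 0);
    aint_t total = 0;
    for (int i = 0; i < nranks; i++) {
        displs[i] = total;      /* rank i starts where the previous ranks end */
        total += sizes[i];
    }
    return total;
}
```

Because both arrays share the promoted type, the running total is accumulated at full width and the sizes can be handed to the v-collective as counts without a conversion loop.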
> Oh, I see it now. It is for the followup collective using Allgatherv. This makes sense now, but we should still just use MPI_AINT directly.

We are. The problem is that MPI_Aint is not a datatype, so we need to create an MPI_Datatype for it.
MPI_AINT is a datatype.
Line 262 in e149d18
#define MPI_AINT ((MPI_Datatype)@MPI_AINT_DATATYPE@)
Great! Yeah, I can use MPI_AINT directly. I had a feeling I was missing something.
I can do that.
7c1f947 to 8fc06c9
test:mpich/ch3/most
Jenkins on mac went away. Retest: test:mpich/ch3/sock
test:mpich/ch3/sock
@raffenet This PR now only contains the scalar count conversions. I'll push a separate PR with v-counts. Could you review this PR again?
The blocking neighbor collectives use slightly different MPIR_Xxx conventions than the other blocking collectives. This seems to be an oversight rather than a necessity. To facilitate large count changes, let's generate them in python autogen. TODO: make neighbor collectives behave the same way as other collectives.
Both are automatically generated by the python scripts now. We may replace the count parameters with large int types. It is much easier to generate them than to update them manually.
The op functions are not using large count yet. TODO: fix this.
8fc06c9 to 0801f66
Reference #4880
Pull Request Description
Make collectives internally use MPI_Aint for count parameters. [skip warnings]