v4.1.x: Using package_rank to select between NIC of equal distance from the process #8176
rajachan merged 2 commits into open-mpi:v4.1.x from
|
I had a few issues applying this patch to v4.1.x. Namely, process_info_t differs between ompi_process_info and opal_process_info: opal_process_info doesn't contain the pid, which I was using as a fallback for the case where package_rank cannot be calculated. In this case, I ended up using my_local_rank (which was the previous fallback). I'm also not sure what the protocol is when a cherry-pick cannot be applied cleanly. I just did 2 commits, but I can change this if a single working commit is preferred. |
|
Thanks for keeping the commits separate for review purposes. Now that you have at least one approval, can you squash them so we don't break the ability to bisect commits and do functional checks? |
a53e69d to fccfefe
|
oops picked up some commits, will push another |
fccfefe to d9242bf
|
@rhc54 I noticed while testing that ompi_process_info.cpuset is not populated. I get the current process's locality string from that, and it's causing some odd behavior. Can I normally expect this to be populated, or is this something that I should check for and get from a PMIx call if it's not there? |
|
@dancejic How are you testing it? Looking at the code, it all looks to me like it is correct. The cpuset should be getting populated in orte_init when it asks HWLOC to fill in that field. |
|
@rhc54 Sorry for the late response, I'm trying to print out the cpuset in the mtl ofi: |
|
Just checked, and this is also happening on the v4.1.x branch without this PR on my instance. I think if the cpuset is expected to be filled in the usual case and this is just an issue with the instance I'm running on, this is good to merge and I'll just continue trying to figure out what's going on from my end. |
|
@dancejic I just checked on head of the v4.1.x branch and cpuset is indeed set when launching with The "bound to" message is created using the |
that looks like it's working for me, so I'm not sure why ompi_process_info.cpuset prints out null in the mtl ofi |
this is with the same print statement as earlier: |
|
Okay, let me investigate. I'm guessing that we fail to transfer it across to the ompi_process_info struct. |
|
I'm at a loss - everything appears to be set just fine. I can find no problems with the code and no place where the cpuset gets eliminated. Let me check with your branch (may take me a bit to build). |
|
I'm having trouble figuring this out too, do you have a hint where I should be looking? I checked in orte_init.c, but it looks like at that point it's already empty |
|
It gets set in orte_init - around line 270:
    /* initialize the RTE for this environment */
    if (ORTE_SUCCESS != (ret = orte_ess.init())) {
        error = "orte_ess_init";
        goto error;
    }
The MPI layer defines
Like I said, I find that the fields are correctly filled out - I can't find any problem. Unfortunately, I can't build your MTL to see what might be going on there, but everything checks out fine outside of that component. |
|
Okay, I am finally able to reproduce this: Let me see what happened. |
|
@dancejic I pushed a couple of commits that I believe will resolve the problem. Please take a look at them (you can ignore the first commit - there was a file added in the PMIx v3.2.1 update that didn't get ignored as it should). In brief, the problem was a combination of a couple of issues:
Hope that helps |
|
@dancejic Feel free to squash and modify as necessary/desired! |
|
Thank you Ralph! I appreciate the help on this. I think I was using cpuset because in the master branch I believe it holds the locality string. I'm not sure whether that is the intended value, or whether it should match its name and be the actual cpuset. From ompi_rte.c: |
|
I'll have to check the master branch as that is clearly incorrect. I'll also take a gander in v4.1.x to ensure other users properly interpreted it as the cpuset and not the locality - I believe they do (based on my earlier scan), but I'll double-check. |
|
Okay, I checked both v4.1.x and v4.0.x and we are okay. The only places actually using cpuset are in fact looking for the cpuset and not the locality string. I'll fix master. Thanks for bringing it to my attention! |
…om the process. If PMIX_PACKAGE_RANK is available, use this value to select between multiple NICs of equal distance from the current process. If this value is not available, try to calculate it by getting the locality string from each local process and assigning a package_rank. If everything fails, fall back to using process_id.rank to select the NIC. This last case is not ideal, but it has a small chance of occurring, and it causes a message to be displayed to notify the user that it is occurring.
Some of the information in master branch is not available for the multi-NIC patch, such as myprocinfo.rank. This info is used to select between multiple NICs of equal distance to the process. This adapts the previous commit to work with the v4.1.x branch.
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 8017f12)
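The selection order described in this commit message can be illustrated with a small standalone sketch. It is not the code from the PR: the function names, the `PACKAGE_RANK_UNKNOWN` sentinel, the modulo-based NIC choice, and the assumed locality-string layout (colon-separated fields with the package field prefixed by `SK`) are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sentinel: the package rank could not be determined. */
#define PACKAGE_RANK_UNKNOWN (-1)

/* Copy the package field (assumed to start with "SK") out of a
 * colon-separated locality string such as "NM0:SK1:L31:CR4:HT8".
 * Returns 0 on success, -1 if the field is missing or does not fit. */
static int package_field(const char *locality, char *out, size_t outlen)
{
    const char *p = locality;

    if (NULL == locality) {
        return -1;
    }
    while ('\0' != *p) {
        const char *end = strchr(p, ':');
        size_t len = (NULL != end) ? (size_t)(end - p) : strlen(p);

        if (len >= 2 && 0 == strncmp(p, "SK", 2)) {
            if (len >= outlen) {
                return -1;
            }
            memcpy(out, p, len);
            out[len] = '\0';
            return 0;
        }
        if (NULL == end) {
            break;
        }
        p = end + 1;
    }
    return -1;
}

/* Package rank = number of local peers with a lower local rank that
 * sit on the same package. localities[] is indexed by local rank. */
static int compute_package_rank(const char *const *localities,
                                int nlocal, int my_local_rank)
{
    char mine[64], peer[64];
    int rank = 0;

    if (my_local_rank < 0 || my_local_rank >= nlocal ||
        0 != package_field(localities[my_local_rank], mine, sizeof(mine))) {
        return PACKAGE_RANK_UNKNOWN;
    }
    for (int i = 0; i < my_local_rank; i++) {
        if (0 != package_field(localities[i], peer, sizeof(peer))) {
            return PACKAGE_RANK_UNKNOWN;
        }
        if (0 == strcmp(mine, peer)) {
            rank++;
        }
    }
    return rank;
}

/* Pick one of num_nics equally distant NICs. Prefer the package rank;
 * otherwise warn and fall back to another rank (for example the local rank). */
static int select_nic(int num_nics, int package_rank, int fallback_rank)
{
    if (num_nics <= 0) {
        return -1;
    }
    if (PACKAGE_RANK_UNKNOWN == package_rank) {
        if (fallback_rank < 0) {
            fallback_rank = 0;
        }
        fprintf(stderr, "package rank unavailable, falling back to rank %d\n",
                fallback_rank);
        return fallback_rank % num_nics;
    }
    return package_rank % num_nics;
}
```

With four local processes alternating between two packages, the two processes on each package would get package ranks 0 and 1, so two equally distant NICs would each be picked by one process per package rather than all processes landing on the same NIC.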
Ensure we always pass the cpuset as well as the locality string for each proc. Correct the mtl/ofi component's computation of relative locality as the function being called expects to be given the locality string of each proc, not the cpuset. If the locality string of the current proc isn't available, then use the cpuset if available and compute the locality before trying to compute relative localities of our peers.
Signed-off-by: Ralph Castain <rhc@pmix.org>
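The fallback described in this commit message (prefer the locality string, otherwise derive it from the cpuset) can be sketched as follows. This is a standalone illustration under stated assumptions, not the code from the commit: `resolve_locality`, `derive_locality_fn`, and `stub_derive_locality` are hypothetical names, and the stub only stands in for whatever hwloc-based helper turns a cpuset into a locality string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: turns a cpuset string into a newly
 * allocated locality string, or returns NULL on failure. */
typedef char *(*derive_locality_fn)(const char *cpuset);

/* Small portable replacement for strdup(). */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (NULL != copy) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Stand-in for the real cpuset-to-locality computation; it ignores
 * its input and returns a fixed value purely for demonstration. */
static char *stub_derive_locality(const char *cpuset)
{
    (void)cpuset;
    return dup_string("SK0");
}

/* Return a heap-allocated locality string for the current process:
 * 1. use the locality string if one was provided;
 * 2. otherwise derive it from the cpuset, if that is available;
 * 3. otherwise return NULL so the caller can skip locality-based logic.
 * The caller frees the result. */
static char *resolve_locality(const char *locality, const char *cpuset,
                              derive_locality_fn derive)
{
    if (NULL != locality && '\0' != locality[0]) {
        return dup_string(locality);
    }
    if (NULL != cpuset && '\0' != cpuset[0] && NULL != derive) {
        return derive(cpuset);
    }
    return NULL;
}
```

Keeping the two strings distinct matters because a relative-locality comparison that expects locality strings would produce meaningless results if handed raw cpusets, which matches the mismatch this commit corrects.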
c3cb4d7 to ec35893
|
squashed it down to 2 commits, my original version with no locality check and your changes with the fixes to cpuset and getting the locality string. |
commit a53e69d (HEAD -> multi-v4.1.x)
Author: Nikola Dancejic <dancejic@amazon.com>
Date: Tue Nov 3 11:35:47 2020 -0800
commit cee4592
Author: Nikola Dancejic <dancejic@amazon.com>
Date: Thu Oct 22 19:18:28 2020 -0700