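Hi all, I'm trying dumpi2otf on the example found at http://mpi.deino.net/mpi_functions/MPI_Intercomm_create.html, and the conversion crashes with the message "Error: trace::get_commentry: Invalid communicator index."
I'm wondering how to avoid this issue. Maybe this is a bug in libundumpi.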
My analysis is the following:
"dumpi2otf -f dumpi-2017.11.22.12.28.10.meta -o TRACE.otf" prints the following lines (irrelevant lines are omitted):
MPI_Comm_split entering at walltime 79071.672523400, cputime 0.019312436 seconds in thread 0.
MPI_Comm oldcomm=2 (MPI_COMM_WORLD)
int color=0
int key=0
MPI_Comm newcomm=4 (user-defined-comm)
MPI_Comm_split returning at walltime 79071.672922008, cputime 0.019510788 seconds in thread 0.
MPI_Intercomm_create entering at walltime 79071.672924778, cputime 0.019513582 seconds in thread 0.
MPI_Comm localcomm=4 (user-defined-comm)
int localleader=0
MPI_Comm remotecomm=2 (MPI_COMM_WORLD)
int remoteleader=1
int tag=52
MPI_Comm newcomm=5 (user-defined-comm)
MPI_Intercomm_create returning at walltime 79071.766963076, cputime 0.064398238 seconds in thread 0.
MPI_Comm_free entering at walltime 79071.767282900, cputime 0.064599472 seconds in thread 0.
MPI_Comm comm=4 (user-defined-comm)
MPI_Comm_free returning at walltime 79071.767291600, cputime 0.064608163 seconds in thread 0.
MPI_Comm_free entering at walltime 79071.767293619, cputime 0.064610199 seconds in thread 0.
MPI_Comm comm=5 (user-defined-comm)
MPI_Comm_free returning at walltime 79071.767298083, cputime 0.064614646 seconds in thread 0.
MPI_Finalize entering at walltime 79071.767299452, cputime 0.064616031 seconds in thread 0.
MPI_Finalize returning at walltime 79071.771893553, cputime 0.066521156 seconds in thread 0.
After debugging, I think the problem is that the MPI_Comm_free handler cannot find the communicator created by MPI_Intercomm_create (comm=5 in the trace above). dumpi2otf crashes in this function:
trace::commentry& trace::get_commentry(int commhandle) {
    std::pair<commmap_t::iterator, commmap_t::iterator> it =
        this->comms_.equal_range(commhandle);
    if (it.first == it.second) // if no match was found.
        throw "trace::get_commentry: Invalid communicator index."; /* OUR EXCEPTION */
    commmap_t::iterator ele = it.second;
    --ele;
    return ele->second;
}
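To illustrate the failure mode I suspect, here is a minimal standalone sketch. It is not dumpi2otf's code: TraceModel, CommEntry, register_comm and replay are hypothetical names, and the idea that the converter never records the handle returned by MPI_Intercomm_create is only my assumption. The lookup has the same shape as get_commentry above (equal_range on a multimap, then the last element of the range).

```cpp
#include <initializer_list>
#include <iterator>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the converter's communicator record.
struct CommEntry { std::string origin; };

class TraceModel {
public:
    // Handle 2 is MPI_COMM_WORLD in the trace above.
    TraceModel() { comms_.emplace(2, CommEntry{"MPI_COMM_WORLD"}); }

    // Records a handle produced by a communicator-creating call.
    void register_comm(int handle, const std::string& origin) {
        comms_.emplace(handle, CommEntry{origin});
    }

    // Same lookup shape as get_commentry: newest entry for the handle, or throw.
    CommEntry& get_commentry(int handle) {
        auto range = comms_.equal_range(handle);
        if (range.first == range.second)
            throw std::runtime_error("Invalid communicator index.");
        return std::prev(range.second)->second;
    }

    // Removes the newest entry for the handle; throws if the handle is unknown.
    void handle_comm_free(int handle) {
        get_commentry(handle);
        comms_.erase(std::prev(comms_.equal_range(handle).second));
    }

private:
    std::multimap<int, CommEntry> comms_;
};

// Replays the communicator calls from the trace.
// Returns the handle whose free failed, or -1 if every free succeeded.
int replay(bool register_intercomm) {
    TraceModel model;
    model.register_comm(4, "MPI_Comm_split");
    if (register_intercomm)  // assumption: this step is what the converter skips
        model.register_comm(5, "MPI_Intercomm_create");
    for (int handle : {4, 5}) {
        try {
            model.handle_comm_free(handle);
        } catch (const std::runtime_error&) {
            return handle;
        }
    }
    return -1;
}
```

In this model, replay(false) fails on handle 5, matching the order of events in the trace, while replay(true) frees both communicators.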
If we run the command "dumpi2veft -i dumpi-2017.11.22.12.28.10.meta -o TRACE.otf" the output is:
DEBUG: trace::handle_comm_free comm=4
DEBUG: success
DEBUG: trace::handle_comm_free comm=5
Error: trace::get_commentry: Invalid communicator index.
So comm 4 is freed successfully, whereas freeing comm 5 fails.
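As a possible stopgap, the free handler could skip unknown handles instead of throwing. The sketch below only shows the idea on a standalone multimap; CommEntry, CommMap, find_commentry and free_all are hypothetical names, not part of dumpi2otf or libundumpi, and I have not checked whether skipping the entry leaves the rest of the conversion consistent.

```cpp
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical communicator record; the real commentry type is not shown here.
struct CommEntry { std::string label; };
using CommMap = std::multimap<int, CommEntry>;

// Returns the newest entry for the handle, or nullptr if it was never registered.
CommEntry* find_commentry(CommMap& comms, int handle) {
    auto range = comms.equal_range(handle);
    if (range.first == range.second)
        return nullptr;
    return &std::prev(range.second)->second;
}

// Frees each handle in order and collects the unknown ones instead of aborting.
std::vector<int> free_all(CommMap& comms, const std::vector<int>& handles) {
    std::vector<int> unknown;
    for (int handle : handles) {
        if (find_commentry(comms, handle) == nullptr) {
            unknown.push_back(handle);  // report it later, keep converting
            continue;
        }
        comms.erase(std::prev(comms.equal_range(handle).second));
    }
    return unknown;
}
```

With a map that holds only handles 2 and 4, calling free_all(comms, {4, 5}) removes handle 4 and returns a list containing just 5, which could then be logged as a warning.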
Does anyone know what may be happening? Any solutions or suggestions?