
MPI annotation option does not output any MPI information #3

Open
sheltongeosx opened this issue Jan 10, 2020 · 0 comments

Dear Nvprof developers:

I want to use nvprof to profile my CUDA+MPI application, but a small test shows that the option --annotate-mpi openmpi does not produce any information about the MPI interface, contrary to what the nvprof documentation describes. The following is the information for the test:

Sample Test:
From Link: http://geco.mines.edu/tesla/cuda_tutorial_mio/
Source Files: mpi_hello_gpu.cu, vecadd.cu
OpenMPI Version: 4.0.2
CUDA Version: 10.1
Command: $ mpirun -np 2 nvprof --annotate-mpi openmpi ./mpi_cuda
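For context, here is a minimal sketch of the kind of MPI+CUDA test involved. This is not the actual mpi_hello_gpu.cu/vecadd.cu from the tutorial link; all names and sizes are illustrative, chosen only so that the run contains both a kernel launch and an MPI call that --annotate-mpi should report:

```cuda
// Illustrative MPI+CUDA sketch (not the tutorial sources).
// Build:  nvcc -ccbin mpicxx mpi_cuda_sketch.cu -o mpi_cuda
// Run:    mpirun -np 2 nvprof --annotate-mpi openmpi ./mpi_cuda
#include <mpi.h>
#include <cstdio>

__global__ void vecAdd(const float *a, const float *b, float *c) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    c[i] = a[i] + b[i];
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // One GPU per rank, as in the reported output (rank 0 -> gpu 0, rank 1 -> gpu 1).
    cudaSetDevice(rank);

    // A trivial kernel launch so nvprof records GPU activity.
    const int n = 256;
    float *a, *b, *c;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMalloc(&c, n * sizeof(float));
    vecAdd<<<1, n>>>(a, b, c);
    cudaDeviceSynchronize();

    // An MPI call that --annotate-mpi is expected to annotate in the profile.
    int bcastme = rank;
    MPI_Bcast(&bcastme, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("rank %d of %d received bcastme=%d\n", rank, size, bcastme);

    cudaFree(a); cudaFree(b); cudaFree(c);
    MPI_Finalize();
    return 0;
}
```

With annotation working, the profile would be expected to list the MPI calls (e.g. MPI_Bcast) as NVTX-annotated ranges; in the output below, no such entries appear.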

Output (using 2 MPI processes):
rank 0 of 2 on p3dev02 received bcastme[3]=3 [gpu 0]
rank 1 of 2 on p3dev02 received bcastme[3]=3 [gpu 1]
==70253== NVPROF is profiling process 70253, command: ./mpi_cuda
==70254== NVPROF is profiling process 70254, command: ./mpi_cuda
rank 0: cudaGetDevice()=0
rank 1: cudaGetDevice()=1
rank 1: C[0]=0.000000
ranksum= 1
==70253== Profiling application: ./mpi_cuda
==70253== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
GPU activities: 62.58% 3.1040us 2 1.5520us 1.3440us 1.7600us [CUDA memcpy HtoD]
37.42% 1.8560us 1 1.8560us 1.8560us 1.8560us [CUDA memcpy DtoH]
API calls: 86.74% 352.44ms 3 117.48ms 10.267us 352.42ms cudaMalloc
5.39% 21.910ms 582 37.645us 258ns 2.0794ms cuDeviceGetAttribute
4.75% 19.303ms 50000 386ns 303ns 102.73us cudaLaunchKernel
2.07% 8.3917ms 6 1.3986ms 1.1406ms 1.4661ms cuDeviceTotalMem
0.68% 2.7607ms 1 2.7607ms 2.7607ms 2.7607ms cudaGetDeviceProperties
0.34% 1.3713ms 6 228.55us 215.41us 247.59us cuDeviceGetName
0.02% 66.319us 3 22.106us 14.092us 30.931us cudaMemcpy
0.01% 20.708us 3 6.9020us 1.8690us 16.755us cudaFree
0.00% 12.278us 6 2.0460us 1.3700us 4.3850us cuDeviceGetPCIBusId
0.00% 7.5770us 12 631ns 375ns 973ns cuDeviceGet
0.00% 6.6190us 1 6.6190us 6.6190us 6.6190us cudaSetDevice
0.00% 6.2070us 4 1.5510us 867ns 2.3670us cuPointerGetAttributes
0.00% 2.3390us 6 389ns 354ns 461ns cuDeviceGetUuid
0.00% 1.8280us 3 609ns 437ns 780ns cuDeviceGetCount
0.00% 1.5210us 1 1.5210us 1.5210us 1.5210us cudaGetDevice
0.00% 1.2300us 1 1.2300us 1.2300us 1.2300us cudaGetDeviceCount
==70254== Profiling application: ./mpi_cuda
==70254== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
GPU activities: 100.00% 179.83ms 50000 3.5960us 3.5510us 4.0640us vecAdd(float*, float*, float*)
0.00% 3.0400us 2 1.5200us 1.3440us 1.6960us [CUDA memcpy HtoD]
0.00% 2.0480us 1 2.0480us 2.0480us 2.0480us [CUDA memcpy DtoH]
API calls: 68.49% 884.64ms 50000 17.692us 16.647us 1.4335ms cudaLaunchKernel
28.85% 372.61ms 3 124.20ms 15.212us 372.57ms cudaMalloc
1.55% 20.003ms 582 34.368us 453ns 1.2518ms cuDeviceGetAttribute
0.76% 9.7675ms 6 1.6279ms 1.6077ms 1.6602ms cuDeviceTotalMem
0.25% 3.2029ms 1 3.2029ms 3.2029ms 3.2029ms cudaGetDeviceProperties
0.10% 1.2356ms 6 205.93us 135.78us 224.53us cuDeviceGetName
0.01% 103.42us 3 34.473us 19.464us 60.273us cudaMemcpy
0.00% 60.895us 3 20.298us 4.2420us 51.665us cudaFree
0.00% 16.364us 4 4.0910us 2.0370us 9.1220us cuPointerGetAttributes
0.00% 14.154us 6 2.3590us 1.9510us 3.1620us cuDeviceGetPCIBusId
0.00% 11.338us 12 944ns 580ns 1.5080us cuDeviceGet
0.00% 7.3840us 1 7.3840us 7.3840us 7.3840us cudaSetDevice
0.00% 3.8410us 6 640ns 592ns 673ns cuDeviceGetUuid
0.00% 2.7020us 3 900ns 699ns 1.0970us cuDeviceGetCount
0.00% 1.9360us 1 1.9360us 1.9360us 1.9360us cudaGetDevice
0.00% 1.2750us 1 1.2750us 1.2750us 1.2750us cudaGetDeviceCount

Hope you can reproduce the issue.

Best,
Shelton

@sheltongeosx sheltongeosx changed the title MPI annotion option does not output any MPI information MPI annotation option does not output any MPI information Jan 10, 2020