[CUDA] Support IPC for allocations created by cuMemCreate and cudaMallocAsync #7110

Comments
@vchuravy Is staging cuMemCreate/MallocAsync allocations through cuMemAlloc/cudaMalloc memory not an option? Is it true that JuliaGPU/CUDA.jl#1053 strictly needs to use cuMemCreate/MallocAsync?
Two notes:

- From the perspective of CUDA.jl, we currently do not expose the different allocators to the user; the only choice the user has is whether the memory pool is managed by CUDA.jl or by the driver.
- We currently have a workaround for users who want to use UCX or MPI: disabling the use of the memory pool.
- From my perspective as a user of MPI or UCX, I would like to see IPC support for pooled allocations.

There seem to be two relevant pointer attributes:
This remains an issue (https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/20?u=vchuravy) and we have to tell users to explicitly disable CUDA's memory pool support.
See the "Interprocess communication support" section here: https://developer.nvidia.com/blog/using-cuda-stream-ordered-memory-allocator-part-2/
From the discussions I had with @Akshay-Venkatesh, it seems using an explicit pool handle for CUDA IPC may not be possible in UCX at the moment, but that will probably become possible in protov2. Meanwhile, support for
Describe the bug
CUDA 10.2 introduced a new set of memory allocation routines (https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__VA.html#group__CUDA__VA) which allow for pooled allocation and stream-based allocation. These allocations do not support cuIpcGetMemHandle, as noted in https://developer.nvidia.com/blog/introducing-low-level-gpu-virtual-memory-management/. It seems that cudaMallocAsync, introduced in CUDA 11.2, is using this new interface under the hood, as the memory-pool creation API (https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__POOLS.html#group__CUDART__MEMORY__POOLS_1g8158cc4b2c0d2c2c771f9d1af3cf386e) takes a HandleType (https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gabde707dfb8a602b917e0b177f77f365).

Steps to Reproduce
See JuliaGPU/CUDA.jl#1053 for an application failure caused by this.
The error encountered is: