Allocation functions, memory transfers and context #53
Labels: `memory` (Memory allocations/transfers/operations), `pi` (DPC++ PI requirement), `specification` (Changes or additions to the specification)
We've been investigating changing the PI interface for memory allocations, and to some extent for memory transfers, which in turn also changes some of the meaning of the PI context. A lot of the reasoning for these changes is based on how the DPC++ SYCL runtime currently works, but it would be good to consider them for the Unified Runtime.
The changes are:
- A `pi_device` argument to the buffer and image allocation entry points (`piMemBufferCreate`, `piMemImageCreate`). This doesn't necessarily mean that the allocation will only be usable on that device, but it helps backends that don't natively support context-style allocations. For the DPC++ SYCL runtime this makes a lot of sense because we already do lazy allocation, so by the time we call these functions we always know the exact device targeted and not just the context (the SYCL `context_bound` property is not currently implemented in DPC++). A rough sketch of the resulting signature is shown after this list.
- A new `piextGetMemoryConnection` entry point that takes two `(pi_device, pi_context)` pairs and returns information on how memory can or should be handled between the two pairs. It currently has three options:
  - `PI_MEMORY_CONNECTION_NONE`: memory in the first `(context, device)` pair cannot be used or migrated by the plugin into the second `(context, device)` pair; copies through host are necessary.
  - `PI_MEMORY_CONNECTION_MIGRATABLE`: memory in the first `(context, device)` pair cannot be used directly by the second `(context, device)` pair, but the plugin can handle migrating data between the two (`piEnqueueMemBufferCopy`).
  - `PI_MEMORY_CONNECTION_UNIFIED`: memory in the first `(context, device)` pair is directly usable in the second pair.
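To make the proposal more concrete, here is a rough header-level sketch of the two changes. The parameter order, the `pi_memory_connection` type name, and the exact enumerator spellings are illustrative assumptions based on the description above, not a finalized interface; the other PI types are the ones already declared by the PI header.

```c
#include <stddef.h>

// Sketch only: names and parameter order are illustrative, not final.
// pi_result, pi_context, pi_device, pi_mem_flags, pi_mem and
// pi_mem_properties are the existing PI types.

// How memory allocated for one (context, device) pair relates to another.
typedef enum {
  PI_MEMORY_CONNECTION_NONE,        // not usable, not migratable; copy through host
  PI_MEMORY_CONNECTION_MIGRATABLE,  // plugin can migrate it (piEnqueueMemBufferCopy)
  PI_MEMORY_CONNECTION_UNIFIED      // directly usable in the second pair
} pi_memory_connection;

// Change 1: the allocation entry points also receive the device the
// allocation is initially targeted at (piMemImageCreate would gain the
// same argument).
pi_result piMemBufferCreate(pi_context context, pi_device device,
                            pi_mem_flags flags, size_t size, void *host_ptr,
                            pi_mem *ret_mem,
                            const pi_mem_properties *properties);

// Change 2: query how memory from (context1, device1) can be used or
// moved to (context2, device2).
pi_result piextGetMemoryConnection(pi_context context1, pi_device device1,
                                   pi_context context2, pi_device device2,
                                   pi_memory_connection *result);
```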
With these two changes, a backend that doesn't natively support context-style allocations no longer has to emulate them: it can simply allocate for a specific device and report that the memory still needs to be migrated between devices in the same context. A backend that does support context-style allocations can ignore the `pi_device` passed to the allocation functions and simply report `PI_MEMORY_CONNECTION_UNIFIED` when the contexts are identical and `PI_MEMORY_CONNECTION_NONE` when they differ. In addition, plugins can tell us when they are able to optimize memory copies between different contexts by reporting `PI_MEMORY_CONNECTION_MIGRATABLE`, which means that `piEnqueueMemBufferCopy` is supported between the two contexts and may be more efficient than doing a copy through host.
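As an illustration, a plugin for a backend that only has per-device allocations might implement the query along these lines. This is purely hypothetical code: `backend_can_copy_between` is a placeholder for whatever mechanism the backend actually provides (for example a peer-access check), and a backend with true context-style allocations would instead just compare the two contexts and return `PI_MEMORY_CONNECTION_UNIFIED` or `PI_MEMORY_CONNECTION_NONE`.

```c
// Hypothetical sketch for a backend with device-only allocations.
pi_result piextGetMemoryConnection(pi_context context1, pi_device device1,
                                   pi_context context2, pi_device device2,
                                   pi_memory_connection *result) {
  if (device1 == device2) {
    // Same device: the allocation is directly usable.
    *result = PI_MEMORY_CONNECTION_UNIFIED;
  } else if (context1 == context2 ||
             backend_can_copy_between(device1, device2)) {
    // Different devices in the same context, or a cross-context pair the
    // backend can copy between itself: the plugin can migrate the data.
    *result = PI_MEMORY_CONNECTION_MIGRATABLE;
  } else {
    // Otherwise the runtime has to stage the copy through host memory.
    *result = PI_MEMORY_CONNECTION_NONE;
  }
  return PI_SUCCESS;
}
```

On the runtime side, the memory manager would consult this query before a cross-device or cross-context transfer and either enqueue `piEnqueueMemBufferCopy` (for `MIGRATABLE`) or fall back to staging the data through host (for `NONE`).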
To circle back to the initial motivation: CUDA doesn't have context-style memory allocations like OpenCL or PI, so to support having multiple CUDA devices in the same `pi_context` we would have to roll our own memory manager in the CUDA plugin (which I believe the Level Zero plugin also does). Since the SYCL runtime already has a memory manager, these PI plugin changes allow us to simply defer the management of memory allocations within the same context to the SYCL runtime for the CUDA plugin.

You can see more discussion and initial implementations of this in the following PR: