Allocation functions, memory transfers and context #53

Open
npmiller opened this issue Nov 25, 2022 · 2 comments
Labels
memory Memory allocations/transfers/operations pi DPC++ PI requirement specification Changes or additions to the specification

Comments

@npmiller
Contributor

We've been investigating changes to the PI interface for memory allocations and, to some extent, for memory transfers, which in turn also changes some of the meaning of the PI context. Much of the reasoning behind these changes is based on how the SYCL DPC++ runtime currently works, but it would be good to consider them for the Unified Runtime.

The changes are:

  1. Add a pi_device argument to the buffer and image allocation entry points (piMemBufferCreate, piMemImageCreate). This doesn't necessarily mean that the allocation will only be usable on that device, but it helps backends that don't natively support context-style allocations. It fits the DPC++ SYCL runtime well: allocation is already lazy, so by the time we call these functions we always know the exact target device and not just the context (the SYCL context_bound property is not currently implemented in DPC++).
  2. Add a new query piextGetMemoryConnection that takes two pairs of (pi_device, pi_context), and returns information on how the memory can or should be handled between the two pairs. It currently has three options:
    • PI_MEMORY_CONNECTION_NONE: memory in the first (context, device) pair cannot be used in, or migrated by the plugin into, the second (context, device) pair; copies through host are necessary.
    • PI_MEMORY_CONNECTION_MIGRATABLE: memory in the first (context, device) pair cannot be used directly by the second (context, device) pair, but the plugin can handle migrating data between the two (piEnqueueMemBufferCopy).
    • PI_MEMORY_CONNECTION_UNIFIED: memory in the first (context, device) pair is usable in the second pair.
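
As a rough illustration of how a runtime might act on the three values above, here is a minimal sketch. It does not use the real PI headers: the `runtime_sketch` namespace, the `MemoryConnection` and `TransferPlan` enums, and `planTransfer` are all hypothetical names invented for this example, and the actual signature of piextGetMemoryConnection is not reproduced here.

```cpp
// Hypothetical sketch only: none of these names come from pi.h.
namespace runtime_sketch {

// Mirrors the three PI_MEMORY_CONNECTION_* options described above.
enum class MemoryConnection { None, Migratable, Unified };

// What the runtime would do to make memory from the first
// (context, device) pair available to the second pair.
enum class TransferPlan {
  UseInPlace,      // no copy needed, the memory is usable directly
  PluginCopy,      // ask the plugin to migrate, e.g. piEnqueueMemBufferCopy
  CopyThroughHost  // read back to host, then write to the destination
};

inline TransferPlan planTransfer(MemoryConnection connection) {
  switch (connection) {
  case MemoryConnection::Unified:
    return TransferPlan::UseInPlace;
  case MemoryConnection::Migratable:
    return TransferPlan::PluginCopy;
  case MemoryConnection::None:
    return TransferPlan::CopyThroughHost;
  }
  // Unknown value: fall back to the most conservative option.
  return TransferPlan::CopyThroughHost;
}

} // namespace runtime_sketch
```

The fallback to a host copy for unknown values is a design assumption of this sketch, chosen because a host copy is the one path that works regardless of what the plugin supports.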

With these two changes, a backend that doesn't natively support context-style allocations no longer has to emulate them: it can simply allocate for a specific device and report that the memory still needs to be migrated between devices in the same context. A backend that does support context-style allocations can ignore the pi_device passed to the allocation functions, then report PI_MEMORY_CONNECTION_UNIFIED when the contexts are identical and PI_MEMORY_CONNECTION_NONE when they differ. Plugins can also tell us when they can optimize memory copies between different contexts by reporting PI_MEMORY_CONNECTION_MIGRATABLE, which means that piEnqueueMemBufferCopy is supported between the two contexts and may be more efficient than a copy through host.
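
The two kinds of backend described here could answer the query roughly as follows. This is a sketch under stated assumptions, not plugin code: `plugin_sketch`, `Endpoint`, `contextStyleConnection`, and `perDeviceConnection` are hypothetical names, and plain integer IDs stand in for the pi_context and pi_device handles. The per-device variant also assumes that memory on the same device in the same context is reported as unified and that cross-context copies are not optimized; a real plugin could report MIGRATABLE across contexts instead.

```cpp
// Hypothetical sketch only: none of these names come from pi.h.
namespace plugin_sketch {

enum class MemoryConnection { None, Migratable, Unified };

// One (context, device) pair; integer IDs stand in for opaque handles.
struct Endpoint {
  int context;
  int device;
};

// Backend with native context-style allocations: the device is ignored
// and only the contexts are compared.
inline MemoryConnection contextStyleConnection(Endpoint a, Endpoint b) {
  return a.context == b.context ? MemoryConnection::Unified
                                : MemoryConnection::None;
}

// Backend that allocates per device (CUDA-like): devices in the same
// context still need a plugin-driven migration.
inline MemoryConnection perDeviceConnection(Endpoint a, Endpoint b) {
  if (a.context != b.context) {
    // Assumption: this backend does not optimize cross-context copies.
    return MemoryConnection::None;
  }
  // Assumption: same context and same device means directly usable.
  return a.device == b.device ? MemoryConnection::Unified
                              : MemoryConnection::Migratable;
}

} // namespace plugin_sketch
```

For two devices in one context, `contextStyleConnection({1, 0}, {1, 1})` yields `Unified`, while `perDeviceConnection({1, 0}, {1, 1})` yields `Migratable`, which is the case where the runtime would take over memory management within the context.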

To circle back to the initial motivation: CUDA doesn't have context-style memory allocations like OpenCL or PI, so to support multiple CUDA devices in the same pi_context we would have to write our own memory manager in the CUDA plugin (which I believe the LevelZero plugin also does). Since the SYCL runtime already has a memory manager, these PI plugin changes let the CUDA plugin simply defer the management of memory allocations within the same context to the SYCL runtime.

You can see more discussions and initial implementations of this on the following PR:

@kbenzie
Contributor

kbenzie commented Nov 25, 2022

Thanks @npmiller. I had scanned intel/llvm#6446 before; this is good additional context.

@pbalcer this is relevant to the work your team is doing.

@pbalcer
Contributor

pbalcer commented Nov 25, 2022

Yes - definitely. I wasn't aware someone was already working on this, thanks!
@igchor @vinser52

@kbenzie kbenzie added the needs-discussion This needs further discussion label Nov 29, 2022
@kbenzie kbenzie added pi DPC++ PI requirement memory Memory allocations/transfers/operations labels Dec 5, 2022
@kbenzie kbenzie added the specification Changes or additions to the specification label Feb 9, 2023
@kbenzie kbenzie removed the needs-discussion This needs further discussion label Jan 10, 2024