Expose Low-Level Memory Management API (with Virtual Memory reservation) #357
Is this a gap for the current PI or a nice-to-have? In general, I agree that this makes sense, but it would require us to add a substantial amount of new APIs to manage pages. As Igor said, we are already working on those as part of UMA, but that won't be ready for the first release. So, if this isn't necessary for current work, my preference would be to wait until we are ready with the UMA abstractions and expose this functionality through that layer directly (which would probably be the next release).
Thanks @pbalcer. That's a good approach.
Another use case for this would be to support a deferred allocation scheme in SYCL. For example, the Command Graph Proposal (intel/llvm#5626) could benefit from such a feature. Because the user expresses the command graph up front, deferring the actual allocation and only reserving virtual memory would create optimization opportunities.
I agree with @pbalcer, but we can use this issue to track and capture related use cases.
Discussion in the WG call noted that a SYCL extension is being designed by @steffenlarsen to support this use case. It will introduce changes to PI, and UR will also need to incorporate these.
From the CUDA link above: "For now, the only supported type of memory is pinned device memory on the current device but there are more properties to come in future CUDA releases." Do I understand correctly that this means it's not possible (at this moment) to use this API for shared/host allocations? @jandres742 is the same true for Level Zero?
Correct, @igchor. The same applies to L0. In L0 the feature is actually called "Reserved Device Allocations".
@jandres742 Thanks for the clarification. Is there any plan to extend this API to host/shared memory? Is this even possible? I'm wondering what we should support at the UR level.
At least from the L0 point of view, I don't see why we couldn't support reserved host and shared allocations. If we added host and shared support in UR, it likely wouldn't be used at this moment, but it would be ready for whenever that support becomes available in the future. One way of doing this is to define it generically in UR for "USM allocations", with a parameter that indicates the type of allocation being targeted. CUDA even has that already: the flags are reserved for future use, so we could have HOST, DEVICE, SHARED here.

cuMemCreate(CUmemGenericAllocationHandle* handle, size_t size, const CUmemAllocationProp* prop, unsigned long long flags)

Creates a CUDA memory handle representing a memory allocation of a given size described by the given properties.

Parameters:
- handle: Value of handle returned. All operations on this allocation are to be performed using this handle.
- size: Size of the allocation requested.
- prop: Properties of the allocation to create.
- flags: Flags for future use; must be zero now.
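For context, here is a minimal sketch (untested) of the full CUDA virtual memory management flow that cuMemCreate belongs to: reserve a VA range, create a physical allocation, map it, and enable access. The device index and the single-page size are illustrative choices, not part of the discussion above.

```c
#include <cuda.h>
#include <stdio.h>

#define CHECK(call)                                                       \
    do {                                                                  \
        CUresult r = (call);                                              \
        if (r != CUDA_SUCCESS) {                                          \
            fprintf(stderr, "%s failed: %d\n", #call, (int)r);            \
            return 1;                                                     \
        }                                                                 \
    } while (0)

int main(void) {
    CHECK(cuInit(0));
    CUdevice dev;
    CHECK(cuDeviceGet(&dev, 0));
    CUcontext ctx;
    CHECK(cuCtxCreate(&ctx, 0, dev));

    /* Describe the physical allocation: pinned device memory on device 0.
       Per the docs quoted above, this is currently the only supported type;
       host/shared variants would presumably be expressed via these fields
       or the reserved flags. */
    CUmemAllocationProp prop = {0};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;

    /* Sizes must be a multiple of the allocation granularity. */
    size_t granularity = 0;
    CHECK(cuMemGetAllocationGranularity(&granularity, &prop,
                                        CU_MEM_ALLOC_GRANULARITY_MINIMUM));
    size_t size = granularity;

    /* 1. Reserve a virtual address range (no physical backing yet). */
    CUdeviceptr va = 0;
    CHECK(cuMemAddressReserve(&va, size, 0 /*alignment*/, 0 /*fixed addr*/, 0));

    /* 2. Create the physical allocation. */
    CUmemGenericAllocationHandle handle;
    CHECK(cuMemCreate(&handle, size, &prop, 0 /*flags: must be zero*/));

    /* 3. Map the physical memory into the reserved range. */
    CHECK(cuMemMap(va, size, 0 /*offset*/, handle, 0));

    /* 4. Grant the device read/write access to the mapping. */
    CUmemAccessDesc access = {0};
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    CHECK(cuMemSetAccess(va, size, &access, 1));

    /* ... use (void*)va in kernels ... */

    /* Teardown: unmap, release the physical handle, free the VA range. */
    CHECK(cuMemUnmap(va, size));
    CHECK(cuMemRelease(handle));
    CHECK(cuMemAddressFree(va, size));
    CHECK(cuCtxDestroy(ctx));
    return 0;
}
```

Note how the VA reservation and the physical allocation have independent lifetimes: memory can be unmapped and released while the reservation is kept, which is what makes growable/shrinkable data structures possible.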
@igchor @jandres742, what is the reason to support it for Host and Shared allocations? For Device it is clear: we need memory coloring to manually distribute physical pages among tiles.
@vinser52 that is one use case, right. But the way I see these interfaces is to give the layers above finer control of memory management. With these interfaces, a specific VA can be reserved, memory can be allocated/deallocated while still keeping the VA, mapped/unmapped as needed, etc. All of that applies to any type of memory as I see it. The main use is definitely device (which is why it is currently defined and supported there), but I can see usage in host and shared (and I have received the same requests from workload owners). But definitely, if only device support is added, it would cover most of the bases.
@jandres742 Yeah, I am asking because I want to make sure we are use-case driven. It probably makes sense to postpone adding such a low-level API until we have a strong justification/use case. In general, we (@pbalcer, @igchor and I) have been discussing support for VA reservation and physical page allocation/deallocation in UMA, and I hope we will introduce it. But I am not sure that we need to expose it in the UR API (at least now). My point is that such a low-level API is required only when the underlying infrastructure cannot implicitly do what the user wants. E.g., on the CPU, the OS does implicit page placement/migration.
Agreed, this is not an urgent topic to address. And yes, you are correct about the CPU support available.
A SYCL extension is being proposed in https://github.com/intel/llvm/pull/8954/files
Fixes oneapi-src#357 by introducing a port of the PI changes from intel/llvm#8954, which add the ability to reserve virtual memory regions separately from physical memory backing allocations.
* [x] Define interfaces in the spec & header
* [ ] Add conformance tests exercising interfaces - tracked in oneapi-src#525
Rationale (from L0 doc):
If an application needs finer grained control of physical memory consumption for device allocations then it can reserve a range of the virtual address space and map this to physical memory as needed. This provides flexibility for applications to manage large dynamic data structures which can grow and shrink over time while maintaining optimal physical memory usage.
Such an API is present in CUDA and Level Zero:
https://developer.nvidia.com/blog/introducing-low-level-gpu-virtual-memory-management/
https://spec.oneapi.io/level-zero/latest/core/PROG.html#reserved-device-allocations
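For reference, a minimal sketch (untested) of the reserved-device-allocation flow from the Level Zero spec linked above. Error handling is omitted for brevity, and the context/device handles are assumed to come from the usual zeInit/zeContextCreate setup.

```c
#include <level_zero/ze_api.h>
#include <stddef.h>

void reserve_and_map(ze_context_handle_t hContext, ze_device_handle_t hDevice,
                     size_t requested) {
    /* Query the page size the driver would use for an allocation of this
       size, and round the request up to a page multiple. */
    size_t pageSize = 0;
    zeVirtualMemQueryPageSize(hContext, hDevice, requested, &pageSize);
    size_t size = ((requested + pageSize - 1) / pageSize) * pageSize;

    /* 1. Reserve a virtual address range with no physical backing. */
    void* ptr = NULL;
    zeVirtualMemReserve(hContext, NULL /*pStart: let the driver pick*/, size, &ptr);

    /* 2. Create a physical memory object on the device. */
    ze_physical_mem_desc_t desc = {0};
    desc.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC;
    desc.size = size;
    ze_physical_mem_handle_t hPhysical = NULL;
    zePhysicalMemCreate(hContext, hDevice, &desc, &hPhysical);

    /* 3. Map the physical memory into the reserved range with
       read/write access. */
    zeVirtualMemMap(hContext, ptr, size, hPhysical, 0 /*offset*/,
                    ZE_MEMORY_ACCESS_ATTRIBUTE_READWRITE);

    /* ... use ptr; to shrink, unmap pages while keeping the VA range ... */

    /* Teardown: unmap, destroy the physical object, free the reservation. */
    zeVirtualMemUnmap(hContext, ptr, size);
    zePhysicalMemDestroy(hContext, hPhysical);
    zeVirtualMemFree(hContext, ptr, size);
}
```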
One option is to implement an API similar to the ones in CUDA/Level Zero; a hypothetical shape for it is sketched below. Alternatively, we could leverage the UMA memory provider abstraction; the memory provider is expected to support virtual memory reservations and physical allocations anyway.
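To make the "generic over allocation type" idea from the discussion concrete, one hypothetical shape such a UR-level API could take is shown below. All names and signatures here are illustrative assumptions for this issue, not an existing interface; the allocation-type enum follows @jandres742's suggestion of defining the API generically over USM allocation kinds.

```c
/* Hypothetical sketch only: these entry points do not exist in UR today. */

typedef enum {
    UR_VIRTUAL_MEM_TYPE_HOST,
    UR_VIRTUAL_MEM_TYPE_DEVICE, /* the only kind adapters could back today */
    UR_VIRTUAL_MEM_TYPE_SHARED
} ur_virtual_mem_type_t;

/* Reserve a virtual address range without any physical backing. */
ur_result_t urVirtualMemReserve(ur_context_handle_t hContext,
                                const void* pStart, size_t size,
                                void** ppStart);

/* Create a physical memory object of the given kind. */
ur_result_t urPhysicalMemCreate(ur_context_handle_t hContext,
                                ur_device_handle_t hDevice,
                                ur_virtual_mem_type_t type, size_t size,
                                ur_physical_mem_handle_t* phPhysicalMem);

/* Map physical memory into (part of) a previously reserved range. */
ur_result_t urVirtualMemMap(ur_context_handle_t hContext, const void* pStart,
                            size_t size, ur_physical_mem_handle_t hPhysicalMem,
                            size_t offset);
```

With this shape, adapters could initially reject HOST/SHARED with an unsupported-feature error and light them up later without an interface change.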