Support plugging in custom user-defined allocators for sharing between sessions #8059
@hariharans29 Thanks. CC @alonre24
I will document it after this is checked in, but right now I am thinking we just need to turn this on: The other configs that you see are used to define the behavior of ORT's internal arena, which doesn't apply in your case because you control the behavior of your custom allocator through its implementation. Does that make sense?
Is there some consolidation we could/should do with the custom CUDA allocator support @codemzs added in #6745? FWIW, that PR has a completely different set of limitations: it does allow different allocators on a session-by-session basis (since it's an EP-level property), but it only works for the CUDA EP and not for any other EPs. However, I don't think we're relying on being able to set different allocators for different sessions, if that's hard to support.
Hey @hariharans29,
We are discussing it. Even if that is done, it may come as a separate change.
We do not want to expose this "mix and match" capability for an external device allocator to use the BFCArena. BFCArena is an internal component, its implementation details are bound to change without warning, and we do not want to keep maintaining compatibility layers. That being said, since the source code of BFCArena is available, it should be fairly easy for you to re-use it in your allocator implementation and even adapt it for your use case if necessary.
* \param shape Shape of the tensor
* \param p_data A preallocated buffer. Can be NULL if the shape is empty.
* Tensor does not own the data and will not delete it
* Tensor will own the memory and will delete it when the tensor instance is destructed.
Comment was wrong - fixing it
p_data = static_cast<IArenaAllocator*>(alloc.get())->Reserve(mem_size);
else
p_data = alloc->Alloc(mem_size);
p_data = alloc->Reserve(mem_size);
Can just call Reserve() now as it is part of the IAllocator interface (and calls Alloc() by default). BFCArena provides a specialized override for Reserve()
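The pattern described in the comment above can be sketched as follows. This is a minimal illustration, not ORT's actual headers: `AllocatorSketch`, `PlainAllocator`, `ArenaLikeAllocator`, and `AllocateInitializerBuffer` are hypothetical names, and `std::malloc` stands in for whatever the real allocators do. It only shows why the call site no longer needs a downcast once `Reserve()` lives on the base interface with a default that forwards to `Alloc()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an allocator base interface (not ORT's real class).
class AllocatorSketch {
 public:
  virtual ~AllocatorSketch() = default;
  virtual void* Alloc(std::size_t size) = 0;
  virtual void Free(void* p) = 0;
  // Default behavior: reserving is the same as allocating.
  virtual void* Reserve(std::size_t size) { return Alloc(size); }
};

// An allocator without arena logic: inherits the default Reserve().
class PlainAllocator : public AllocatorSketch {
 public:
  void* Alloc(std::size_t size) override {
    ++alloc_calls;
    return std::malloc(size);
  }
  void Free(void* p) override { std::free(p); }
  int alloc_calls = 0;
};

// An arena-style allocator: provides its own Reserve() code path.
// The body is a placeholder; a real arena would bypass its chunk pool here.
class ArenaLikeAllocator : public AllocatorSketch {
 public:
  void* Alloc(std::size_t size) override {
    ++alloc_calls;
    return std::malloc(size);
  }
  void Free(void* p) override { std::free(p); }
  void* Reserve(std::size_t size) override {
    ++reserve_calls;
    return std::malloc(size);
  }
  int alloc_calls = 0;
  int reserve_calls = 0;
};

// Call site: no static_cast and no if/else on the allocator kind.
void* AllocateInitializerBuffer(AllocatorSketch& alloc, std::size_t size) {
  return alloc.Reserve(size);
}
```

With this shape, virtual dispatch picks the specialized path for the arena-style allocator and the `Alloc()` fallback for everything else.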
Description:
This change adds support for the use case in which an external user wants to plug in their own CPU device allocator implementation, shared across all sessions, for ORT-internal code to use.
This change also cleans up some related allocator internals in ORT. The most important refactoring is the removal of the confusing IArenaAllocator interface, which in theory is not needed at all. All allocators can simply implement IAllocator whilst hosting arena logic in their allocation code path.
Known limitations:
Can't plug in different "custom" allocators for individual sessions yet, i.e. the user can't choose to plug in Allocator_1 to be used only for Session_1, Allocator_2 only for Session_2, and so on. (Not sure if there is a use case for this yet.) The user can still use ORT's allocator and their custom allocator in tandem across sessions, though, because use of the custom allocator for a session is controlled by a session option.
Doesn't support plugging in non-CPU device allocators yet.
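The sharing model and its first limitation can be pictured with a small self-contained sketch. Every name here (`EnvironmentSketch`, `SessionOptionsSketch`, `SessionSketch`, `use_env_allocator`) is hypothetical and does not correspond to ORT's real API; the sketch only assumes that one custom CPU allocator is registered at the environment level and that a per-session option decides whether a session uses it or falls back to a default allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical allocator interface (not ORT's real class).
struct AllocatorSketch {
  virtual ~AllocatorSketch() = default;
  virtual void* Alloc(std::size_t size) = 0;
  virtual void Free(void* p) = 0;
};

// Stand-in for the library's built-in CPU allocator.
struct DefaultCpuAllocator : AllocatorSketch {
  void* Alloc(std::size_t size) override { return std::malloc(size); }
  void Free(void* p) override { std::free(p); }
};

// Stand-in for a user-supplied CPU allocator.
struct UserCpuAllocator : AllocatorSketch {
  void* Alloc(std::size_t size) override {
    ++alloc_calls;
    return std::malloc(size);
  }
  void Free(void* p) override { std::free(p); }
  int alloc_calls = 0;
};

// One slot per environment: there is no per-session registration,
// which mirrors the first known limitation.
struct EnvironmentSketch {
  std::shared_ptr<AllocatorSketch> shared_cpu_allocator;
};

// Hypothetical per-session switch.
struct SessionOptionsSketch {
  bool use_env_allocator = false;
};

class SessionSketch {
 public:
  SessionSketch(const EnvironmentSketch& env, const SessionOptionsSketch& opts)
      : allocator_(Pick(env, opts)) {}

  AllocatorSketch* allocator() const { return allocator_.get(); }

 private:
  static std::shared_ptr<AllocatorSketch> Pick(const EnvironmentSketch& env,
                                               const SessionOptionsSketch& opts) {
    if (opts.use_env_allocator && env.shared_cpu_allocator) {
      return env.shared_cpu_allocator;  // shared across opted-in sessions
    }
    return std::make_shared<DefaultCpuAllocator>();  // per-session default
  }

  std::shared_ptr<AllocatorSketch> allocator_;
};
```

In this model, sessions that opt in all see the same allocator instance, while sessions that do not opt in keep a default one, so the two kinds of allocator can coexist in a single process.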
Motivation and Context
#6689
#6143