[Feature Request] Mapping User Buffers to NPU address space using RegisterBufferAttribute #23457
Labels
ep:QNN
issues related to QNN execution provider
feature request
request for unsupported feature or enhancement
Describe the feature request
While analyzing performance on the QNN EP, we noticed that every inference performs a memcpy for the input buffer and another for the output buffer:
INPUT: ORT Buffer copied into RPC Buffer.
OUTPUT: RPC Buffer copied into ORT Buffer.
Currently, the only way to avoid these copies is for the application to allocate its memory through the QNN-EP Allocator, which uses RpcMemAlloc, but that might not always be possible.
The ask is to avoid the copy between ORT Buffer and RPC Buffer by mapping the ORT Buffer into the NPU address space using RegisterBufferAttribute API.
Describe scenario use case
An application can malloc input and output buffers and then call CreateTensorWithDataAsOrtValue() on said buffers.
QNN-EP can then call RegisterBufferAttribute on these CPU buffers to map them to the NPU address space.
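To make the difference concrete, here is a rough, self-contained C++ sketch of the two paths. It does not use ONNX Runtime or QNN at all: `MappedBufferRegistry`, `RunWithCopy`, and `RunWithMapping` are hypothetical names invented for illustration, and the registry only stands in for wherever the EP would call RegisterBufferAttribute. The assumption is that a user buffer only needs to be registered once and can then be reused across inferences.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the EP-side table of user buffers that have been
// mapped into the NPU address space. Not an ORT or QNN type.
class MappedBufferRegistry {
 public:
  // Returns a handle for `data`, registering the buffer on first use.
  // In a real EP, this is where the buffer registration call would be made.
  std::uint32_t GetOrRegister(const void* data, std::size_t size_bytes) {
    auto it = entries_.find(data);
    if (it != entries_.end() && it->second.size_bytes == size_bytes) {
      return it->second.handle;  // already mapped, nothing to do
    }
    Entry entry{next_handle_++, size_bytes};
    entries_[data] = entry;
    ++registrations_;
    return entry.handle;
  }

  // Must be called before the application frees the buffer.
  void Unregister(const void* data) { entries_.erase(data); }

  std::size_t registrations() const { return registrations_; }

 private:
  struct Entry {
    std::uint32_t handle;
    std::size_t size_bytes;
  };
  std::unordered_map<const void*, Entry> entries_;
  std::uint32_t next_handle_ = 1;
  std::size_t registrations_ = 0;
};

// Behaviour described in this issue: the user (ORT) buffer is copied into a
// separate RPC staging buffer on every inference. Returns bytes copied.
std::size_t RunWithCopy(const std::vector<float>& user_input,
                        std::vector<float>& rpc_staging) {
  const std::size_t size_bytes = user_input.size() * sizeof(float);
  rpc_staging.resize(user_input.size());
  std::memcpy(rpc_staging.data(), user_input.data(), size_bytes);
  return size_bytes;
}

// Requested behaviour: the user buffer is registered once and then used in
// place, so no bytes are copied per inference. Returns bytes copied.
std::size_t RunWithMapping(const std::vector<float>& user_input,
                           MappedBufferRegistry& registry) {
  registry.GetOrRegister(user_input.data(), user_input.size() * sizeof(float));
  return 0;
}
```

With a 1024-float input, `RunWithCopy` moves 4096 bytes on every call, while repeated calls to `RunWithMapping` on the same buffer register it once and copy nothing. The same would apply to the output buffer in the other direction. Buffer lifetime is the part this sketch glosses over: the mapping would have to be released before the application frees the memory.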