@michal-shalev

THIS PR IS PENDING openucx/ucx#10945

What?

Replace the two-step GPU transfer request creation with a single createGpuXferReq API that takes the descriptor lists directly and returns both the transfer request handle and the GPU request handle.

Why?

Simplify the API by reducing GPU transfer request creation from two separate function calls to a single call.

How?

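Changed createGpuXferReq to take the descriptor lists directly, so one call replaces the previous sequence of createXferReq followed by createGpuXferReq. Removed the old createGpuXferReq method, which took an existing transfer request handle, and updated the tests accordingly.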
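To illustrate the change in call pattern, here is a minimal, self-contained sketch. It does not use the real NIXL types: Status, Desc, DescList, XferReq, GpuXferReq, and the three function names are hypothetical stand-ins, and the bodies only mimic the shape of the flow described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in types: hypothetical simplifications, not the real NIXL definitions.
enum class Status { Success, InvalidParam };
struct Desc { std::size_t addr; std::size_t len; };
using DescList = std::vector<Desc>;
struct XferReq { DescList local, remote; };
struct GpuXferReq { std::size_t entries = 0; };

// Old flow, step 1 (hypothetical name): build a transfer request from the lists.
Status legacyCreateXferReq(const DescList &local, const DescList &remote, XferReq &req) {
    if (local.empty() || remote.empty()) return Status::InvalidParam;
    req = XferReq{local, remote};
    return Status::Success;
}

// Old flow, step 2 (hypothetical name): derive a GPU handle from the request.
Status legacyCreateGpuXferReq(const XferReq &req, GpuXferReq &gpu_req) {
    gpu_req.entries = req.remote.size();
    return Status::Success;
}

// New flow (hypothetical name): one call takes the descriptor lists and
// fills in both the transfer request and the GPU request handle.
Status createGpuXferReqSingleStep(const DescList &local, const DescList &remote,
                                  XferReq &req, GpuXferReq &gpu_req) {
    const Status st = legacyCreateXferReq(local, remote, req);
    if (st != Status::Success) return st;
    const Status gpu_st = legacyCreateGpuXferReq(req, gpu_req);
    // Postcondition of this sketch: one GPU entry per remote descriptor.
    assert(gpu_st != Status::Success || gpu_req.entries == req.remote.size());
    return gpu_st;
}
```

With this shape, a caller passes the two descriptor lists once and checks a single status, instead of keeping an intermediate request handle alive between two calls.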

@github-actions

👋 Hi michal-shalev! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

@michal-shalev changed the title from "DEVICE/API: Simplify createGpuXferReq to single-step API" to "DEVICE/API: Simplify createGpuXferReq to single-step API - WIP" on Oct 12, 2025
 nixl_status_t
-createGpuXferReq(const nixlXferReqH &req_hndl, nixlGpuXferReqH &gpu_req_hndl) const;
+createGpuXferReq(const nixl_xfer_dlist_t &local_descs,
+                 const nixl_xfer_dlist_t &remote_descs,

I think it's confusing to allow local_descs and remote_descs to have different lengths, since that implies the last len(remote_descs) - len(local_descs) remote descriptors are inline-only.

Instead, I think we should enforce len(local_descs) == len(remote_descs) and introduce a third parameter, inline_descs, which describes remote memory that has no corresponding source buffer (and can therefore only be written to with inline data).
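A minimal sketch of what this suggestion could look like, assuming simplified stand-in types rather than the real NIXL ones. Status, Desc, DescList, SplitRemote, validateGpuXferDescs, and splitImplicitRemote are all hypothetical names introduced for illustration; only local_descs, remote_descs, and inline_descs come from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in types: hypothetical simplifications, not the real NIXL definitions.
enum class Status { Success, InvalidParam };
struct Desc { std::size_t addr; std::size_t len; };
using DescList = std::vector<Desc>;

// Proposed rule: local and remote lists pair up one-to-one, and
// inline-only remote memory is passed in its own list.
Status validateGpuXferDescs(const DescList &local_descs,
                            const DescList &remote_descs,
                            const DescList &inline_descs) {
    if (local_descs.size() != remote_descs.size()) return Status::InvalidParam;
    (void)inline_descs; // no pairing constraint: these have no source buffer
    return Status::Success;
}

struct SplitRemote { DescList paired; DescList inline_only; };

// Converts the implicit convention (trailing remote descriptors are
// inline-only) into the explicit paired and inline-only lists.
SplitRemote splitImplicitRemote(std::size_t local_count, const DescList &remote) {
    assert(remote.size() >= local_count);
    const auto mid = remote.begin() + static_cast<std::ptrdiff_t>(local_count);
    return SplitRemote{DescList(remote.begin(), mid), DescList(mid, remote.end())};
}
```

Making the inline-only descriptors a separate argument lets the length check reject mismatched lists outright, rather than silently treating the surplus remote descriptors as inline targets.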

Signed-off-by: Michal Shalev <[email protected]>

2 participants