e-ago (Contributor) commented Oct 29, 2025

The GPUNetIO plugin runs a persistent kernel on a dedicated stream.
nixlbench frees VRAM buffers with the synchronous memory operation cudaFree, which has to synchronize the whole GPU device before proceeding.

Because the GPUNetIO persistent kernel keeps running, that device-wide synchronization never completes, so nixlbench hangs on cudaFree when run with the GPUNetIO plugin.

This PR resolves the issue by replacing cudaFree with cudaFreeAsync.

@e-ago e-ago requested review from a team, aranadive, brminich and ovidiusm as code owners October 29, 2025 11:48

copy-pr-bot bot commented Oct 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

👋 Hi e-ago! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

e-ago (Contributor, Author) commented Oct 29, 2025

/build

@e-ago e-ago requested a review from a team as a code owner October 29, 2025 13:48
@pull-request-size pull-request-size bot added size/M and removed size/S labels Oct 29, 2025
e-ago (Contributor, Author) commented Oct 29, 2025

/build

@e-ago e-ago force-pushed the fix_gpunetio_bench branch from 1caba49 to 25503bd October 30, 2025 12:26
@pull-request-size pull-request-size bot added size/S and removed size/M labels Oct 30, 2025
@e-ago e-ago force-pushed the fix_gpunetio_bench branch from 25503bd to f9ca79d October 30, 2025 12:28
@e-ago e-ago changed the title from "nixlbench: fix cudaFree problem, enable GPUNetIO in CI" to "nixlbench: fix cudaFree problem" Oct 30, 2025
Signed-off-by: eagostini <[email protected]>
@ovidiusm ovidiusm requested a review from rakhmets October 30, 2025 17:02
e-ago (Contributor, Author) commented Nov 3, 2025

/build

@aranadive (Contributor)

/build

@aranadive (Contributor)

/ok to test 268bf6a

@aranadive aranadive merged commit 38e2585 into ai-dynamo:main Nov 4, 2025
21 checks passed