Added nixl_ep installation option to Dockerfile #1077

Merged: ovidiusm merged 2 commits into ai-dynamo:main from dfarge:nixl_ep_dockerfile on Dec 9, 2025

Conversation

dfarge (Contributor) commented Dec 3, 2025

What?

Added an option to build nixl_ep (not yet merged at the time of writing) by passing the `--build-nixl-ep` flag to the build-container shell script.

Why?

To make the nixl_ep example easier to use and develop.
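
For illustration only, a flag like this could be wired into a container build roughly as follows. This is a minimal sketch, not the PR's actual diff: the function name `parse_build_flags` and the build-arg name `BUILD_NIXL_EP` are assumptions made for the example; only the `--build-nixl-ep` flag itself comes from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: translate a --build-nixl-ep command-line flag into a
# Docker build argument. BUILD_NIXL_EP is an assumed build-arg name and is
# not taken from the PR.

parse_build_flags() {
    # nixl_ep is treated as opt-in here, so the default is "false".
    local build_nixl_ep="false"

    while [ "$#" -gt 0 ]; do
        case "$1" in
            --build-nixl-ep)
                build_nixl_ep="true"
                ;;
            *)
                # Reject anything this sketch does not know about.
                printf 'unknown option: %s\n' "$1" >&2
                return 1
                ;;
        esac
        shift
    done

    # Print the argument string that a docker build call could consume.
    printf '%s\n' "--build-arg BUILD_NIXL_EP=${build_nixl_ep}"
}
```

A build script written this way might then run something like `docker build $(parse_build_flags "$@") .`, and the Dockerfile could declare a matching `ARG` to decide whether the nixl_ep install step runs.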

copy-pr-bot (bot) commented Dec 3, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions (bot) commented Dec 3, 2025

👋 Hi dfarge! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

eranrs (Contributor) commented Dec 7, 2025

I have no comments on this. @dfarge, once #1043 is approved and merged, make sure to test this again before merging.

ovidiusm (Contributor) commented Dec 8, 2025

Dependency was merged: #1043

Please post this for review so that we can have it merged by code freeze (Friday December 12th). Thank you!

michal-shalev marked this pull request as ready for review December 9, 2025 07:25
michal-shalev requested a review from a team as a code owner December 9, 2025 07:25

michal-shalev (Contributor) commented

/ok to test 3adcfeb

michal-shalev (Contributor) commented

/ok to test 947eea9

michal-shalev (Contributor) commented

/build

dfarge (Contributor, Author) commented Dec 9, 2025

I made some adjustments to match the new location of nixl_ep.
I built a container with CUDA 12.9 using the new paths and am sanity-checking it. I'll push when done.

Additionally, I see an issue when building the Dockerfile with nixl_ep and CUDA 13.0. @michal-shalev, is this a known issue?
gpunetio isn't being built with CUDA 13.0, which causes the nixl_ep installation to fail.

dfarge force-pushed the nixl_ep_dockerfile branch 2 times, most recently from 6fb627e to 6be5ee8 December 9, 2025 12:38

dfarge (Contributor, Author) commented Dec 9, 2025

/ok to test 6be5ee8

dfarge force-pushed the nixl_ep_dockerfile branch from 6be5ee8 to 1b7909c December 9, 2025 15:34

dfarge (Contributor, Author) commented Dec 9, 2025

Added a commit to fix the gpunetio CUDA 13.0 error we were getting.
Did a sanity check on EOS, and elastic.py runs well on 1 and 2 nodes.

ovidiusm (Contributor) commented Dec 9, 2025

/build

ovidiusm (Contributor) commented Dec 9, 2025

/ok to test 1b7909c

ovidiusm enabled auto-merge (squash) December 9, 2025 21:32
ovidiusm merged commit 8da5986 into ai-dynamo:main Dec 9, 2025
20 of 21 checks passed

5 participants