Add UCX Plugin Bundling for NIXL install script#793
Add UCX Plugin Bundling for NIXL install script#793pi314ever wants to merge 7 commits intovllm-project:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds UCX plugin bundling support to the NIXL installation script, enabling the inclusion of gaudi_gdr and other UCX plugins in the NIXL wheel package. The changes focus on extending the build process to bundle UCX plugins and making the UCX commit configurable.
Key Changes:
- Added a new step (4/4) to bundle UCX plugins into the repaired wheel using a helper script
- Made UCX commit configurable via command-line argument with a default commit hash
- Updated UCX build configuration to enable Gaudi support and examples
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Patch the helper script to skip NIXL plugins (since we only want UCX) | ||
| # This prevents it from failing when it can't find system NIXL plugins | ||
| sed_expr = 's/add_plugins(wheel_path, args.nixl_plugins_dir, "nixl")/# &/' | ||
| run_command(["sed", "-i", sed_expr, helper_script]) | ||
|
|
||
| print(f"--> Adding plugins from {ucx_plugins_src}", flush=True) | ||
| bundle_cmd = [ | ||
| sys.executable, | ||
| helper_script, |
There was a problem hiding this comment.
The sed command modifies a Python helper script by commenting out a line. This is fragile and could break if the helper script's code changes. Consider adding a command-line flag to the helper script to skip NIXL plugins instead of modifying the script with sed, or check if the helper script already supports such a flag.
| # Patch the helper script to skip NIXL plugins (since we only want UCX) | |
| # This prevents it from failing when it can't find system NIXL plugins | |
| sed_expr = 's/add_plugins(wheel_path, args.nixl_plugins_dir, "nixl")/# &/' | |
| run_command(["sed", "-i", sed_expr, helper_script]) | |
| print(f"--> Adding plugins from {ucx_plugins_src}", flush=True) | |
| bundle_cmd = [ | |
| sys.executable, | |
| helper_script, | |
| # Patch the helper script to skip NIXL plugins (since we only want UCX). | |
| # This prevents it from failing when it can't find system NIXL plugins. | |
| helper_to_use = helper_script | |
| try: | |
| with open(helper_script, "r", encoding="utf-8") as f: | |
| helper_source = f.read() | |
| target_call = 'add_plugins(wheel_path, args.nixl_plugins_dir, "nixl")' | |
| if target_call in helper_source: | |
| patched_source = helper_source.replace( | |
| target_call, | |
| f"# {target_call} # disabled by install_nixl.py: NIXL plugins not bundled", | |
| ) | |
| patched_helper = os.path.join( | |
| temp_wheel_dir, "wheel_add_ucx_plugins_patched.py" | |
| ) | |
| with open(patched_helper, "w", encoding="utf-8") as f: | |
| f.write(patched_source) | |
| helper_to_use = patched_helper | |
| else: | |
| print( | |
| "--> Warning: Could not find NIXL plugins call in helper script; " | |
| "running helper script unmodified.", | |
| flush=True, | |
| ) | |
| except OSError as e: | |
| print( | |
| f"--> Warning: Failed to prepare patched helper script ({e}); " | |
| "running helper script unmodified.", | |
| flush=True, | |
| ) | |
| print(f"--> Adding plugins from {ucx_plugins_src}", flush=True) | |
| bundle_cmd = [ | |
| sys.executable, | |
| helper_to_use, |
| UCX_INSTALL_DIR = os.path.join("/tmp", "ucx_install") | ||
| UCX_REPO_URL = "https://github.com/openucx/ucx.git" | ||
| NIXL_REPO_URL = "https://github.com/ai-dynamo/nixl.git" | ||
|
|
There was a problem hiding this comment.
The UCX commit hash lacks context explaining why this specific commit was chosen. Consider adding a comment explaining the significance of this commit (e.g., 'UCX commit with Gaudi GDR support' or a reference to the UCX version/tag it corresponds to).
| # Known-good UCX commit tested with the current NIXL version; update only after validation. |
🚧 CI BlockedThe main CI workflow was not started for the following reason:
|
2 similar comments
🚧 CI BlockedThe main CI workflow was not started for the following reason:
|
🚧 CI BlockedThe main CI workflow was not started for the following reason:
|
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
99148d6 to
3fd271c
Compare
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
✅ CI PassedAll checks passed successfully against the following vllm commit: |
|
Combined with #711 for complete integration with CI |
This adds support for ucx plugin bundling into
install_nixl.py, which is needed for gaudi_gdr and other ucx plugins in nixl.