
Added torchaudio installation to setup.py#1010

Draft
tzielinski-habana wants to merge 3 commits into vllm-project:main from tzielinski-habana:torchaudio

Conversation

@tzielinski-habana
Collaborator

This pull request introduces changes to the setup.py installation logic, specifically addressing the handling of the torchaudio dependency to avoid inadvertently installing CUDA-enabled PyTorch when only CPU support is desired. The changes ensure a safer and more predictable installation process for users.

Dependency management improvements:

  • Excluded torchaudio from install_requires in setup.py, as its installation requires special handling to avoid pulling CUDA torch dependencies.
  • Added logic to install torchaudio separately using pip install --no-deps with the correct version matching the installed torch, and only if not running in metadata generation mode (dist_info or egg_info).

We need torchaudio because it's imported from upstream vllm as a result of this PR:
vllm-project/vllm#33247
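Condensed into helper form (the function names below are hypothetical; the PR inlines this logic directly in setup.py), the flow looks roughly like:

```python
import re
import sys


def filter_requirements(requirements):
    """Drop torchaudio lines from install_requires; it must be installed
    with --no-deps, which install_requires cannot express."""
    return [r for r in requirements if not r.strip().startswith("torchaudio")]


def torchaudio_install_cmd(torch_version):
    """Build the pip command installing a torchaudio matching torch.

    Extracts the stable x.y.z component from versions like '2.10.0a0+git...'.
    """
    ver = re.match(r"(\d+\.\d+\.\d+)", torch_version).group(1)
    return [
        sys.executable, "-m", "pip", "install", "--no-deps",
        "--extra-index-url", "https://download.pytorch.org/whl/cpu",
        f"torchaudio=={ver}",
    ]


# In setup.py, after setup(), guarded against metadata-only invocations:
# if "dist_info" not in sys.argv and "egg_info" not in sys.argv:
#     import torch
#     subprocess.check_call(torchaudio_install_cmd(torch.__version__))
```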

Signed-off-by: tzielinski-habana <tomasz.zielinski@intel.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR adjusts setup.py dependency handling to avoid pulling CUDA-enabled PyTorch when adding torchaudio, by removing torchaudio from install_requires and attempting to install it separately with pip --no-deps using a version derived from the installed torch.

Changes:

  • Filter torchaudio out of install_requires derived from requirements.txt.
  • Add post-setup() logic that imports torch, derives a torchaudio==x.y.z pin from torch.__version__, and runs pip install --no-deps against the PyTorch CPU wheel index.


Comment thread setup.py Outdated
Comment on lines +84 to +85
# Skipped during metadata generation (dist_info / egg_info).
if "dist_info" not in sys.argv and "egg_info" not in sys.argv:

Copilot AI Feb 23, 2026


The dist_info/egg_info argv guard is not sufficient to prevent this from running during packaging operations: wheel builds typically invoke bdist_wheel (and possibly sdist/build), so this block will still run while producing artifacts. If this logic remains, it should be gated so it only runs for the final install into the target environment (not during build/metadata/wheel creation).

Suggested change
# Skipped during metadata generation (dist_info / egg_info).
if "dist_info" not in sys.argv and "egg_info" not in sys.argv:
# Skipped during metadata generation and build/package creation.
if not any(cmd in sys.argv for cmd in ("dist_info", "egg_info", "bdist_wheel", "sdist", "build")):

Comment thread setup.py
"and add --no-build-isolation to pip install\n"
"********************************************************************************\n") from None
# Extract stable x.y.z from versions like 2.10.0a0+git...
ver = re.match(r"(\d+\.\d+\.\d+)", torch.__version__).group(1)

Copilot AI Feb 23, 2026


re.match(...).group(1) will raise an AttributeError if torch.__version__ doesn’t match the expected x.y.z format (e.g., unusual local builds). This would fail the package install even if torchaudio is otherwise optional. Handle the no-match case explicitly (raise a clear error or fall back to a safer version parsing approach, e.g., via packaging.version).
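One way to do the packaging.version fallback this comment mentions (a sketch assuming the third-party `packaging` library, which setuptools itself depends on, is available in the build environment):

```python
from packaging.version import InvalidVersion, Version


def stable_torch_version(raw: str) -> str:
    """Reduce a torch version like '2.10.0a0+git1234' to its stable
    x.y.z base, raising a clear error for unparseable versions."""
    try:
        return Version(raw).base_version
    except InvalidVersion:
        raise RuntimeError(
            f"Unable to parse torch version '{raw}' "
            "to determine the matching torchaudio version.") from None
```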

Suggested change
ver = re.match(r"(\d+\.\d+\.\d+)", torch.__version__).group(1)
version_match = re.match(r"(\d+\.\d+\.\d+)", torch.__version__)
if version_match is None:
raise RuntimeError(
f"Unable to parse torch version from '{torch.__version__}' "
"to determine the matching torchaudio version."
)
ver = version_match.group(1)

Comment thread setup.py Outdated
Comment on lines +85 to +97
if "dist_info" not in sys.argv and "egg_info" not in sys.argv:
try:
import torch
except ImportError:
raise RuntimeError(
"torch is not importable - this is needed for torchaudio installation.\n\n"
"********************************************************************************\n"
"Make sure torch is installed before installing vllm-gaudi\n"
"and add --no-build-isolation to pip install\n"
"********************************************************************************\n") from None
# Extract stable x.y.z from versions like 2.10.0a0+git...
ver = re.match(r"(\d+\.\d+\.\d+)", torch.__version__).group(1)
subprocess.check_call([

Copilot AI Feb 23, 2026


The torchaudio install runs unconditionally and will re-run on every invocation, potentially downgrading/upgrading an existing torchaudio install. Consider first checking whether torchaudio is already installed and whether its version matches the desired torch version before calling pip.
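A minimal idempotency check along those lines, using the standard-library `importlib.metadata` (`needs_install` is a hypothetical helper, not part of the PR):

```python
from importlib import metadata


def needs_install(package: str, desired_version: str) -> bool:
    """Return True if `package` is absent or installed at a version
    other than `desired_version`."""
    try:
        return metadata.version(package) != desired_version
    except metadata.PackageNotFoundError:
        return True


# In setup.py this would gate the pip call, e.g.:
# if needs_install("torchaudio", ver):
#     subprocess.check_call([...])
```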

Comment thread setup.py Outdated
Comment on lines +47 to +50
# Exclude torchaudio from install_requires — it needs --no-deps to
# avoid pulling CUDA torch, which install_requires cannot express.
requirements = [r for r in requirements if not r.strip().startswith("torchaudio")]


Copilot AI Feb 23, 2026


Filtering requirements via startswith("torchaudio") can accidentally exclude unrelated packages (e.g., torchaudio-foo) and won’t handle some valid requirement formats reliably. If torchaudio ever appears in requirements files, consider parsing each line as a PEP 508 requirement and filtering by normalized project name instead of a string prefix.
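A name-based filter along the lines this comment suggests, sketched with the third-party `packaging` library (a setuptools dependency); `drop_requirement` is a hypothetical helper name:

```python
from packaging.requirements import InvalidRequirement, Requirement
from packaging.utils import canonicalize_name


def drop_requirement(requirements, name):
    """Filter out a requirement by normalized PEP 503 project name,
    so 'torchaudio-foo' is not mistaken for 'torchaudio'."""
    target = canonicalize_name(name)
    keep = []
    for line in requirements:
        try:
            req_name = canonicalize_name(Requirement(line).name)
        except InvalidRequirement:
            keep.append(line)  # keep lines that are not PEP 508 requirements
            continue
        if req_name != target:
            keep.append(line)
    return keep
```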

Comment thread setup.py
Comment on lines +97 to +106
subprocess.check_call([
sys.executable,
"-m",
"pip",
"install",
"--no-deps",
"--extra-index-url",
"https://download.pytorch.org/whl/cpu",
f"torchaudio=={ver}",
])

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running pip install via subprocess.check_call inside setup.py introduces build-time side effects (network access, mutating the build env) under the PEP 517 setuptools.build_meta backend. This will execute during wheel builds (e.g., bdist_wheel) and can break reproducible/offline builds and CI. Prefer moving this to an explicit install step (docs or a dedicated installer script/extra) rather than performing dependency installation from setup.py.

Suggested change
subprocess.check_call([
sys.executable,
"-m",
"pip",
"install",
"--no-deps",
"--extra-index-url",
"https://download.pytorch.org/whl/cpu",
f"torchaudio=={ver}",
])
# Building/installing this package should not perform network operations or
# mutate the environment (e.g., by running `pip install`). Instead, tell the
# user how to install a matching torchaudio version explicitly.
cmd = (
f"{sys.executable} -m pip install --no-deps --extra-index-url "
"https://download.pytorch.org/whl/cpu "
f"torchaudio=={ver}"
)
raise RuntimeError(
"torchaudio is required but is not installed.\n\n"
"To install a CPU-only torchaudio build matching your torch version, run:\n\n"
f" {cmd}\n\n"
"Note: This command must be run explicitly by the user; it is not executed\n"
"automatically during the build to keep builds reproducible and side-effect free."
)

Signed-off-by: tzielinski-habana <tomasz.zielinski@intel.com>
Signed-off-by: tzielinski-habana <tomasz.zielinski@intel.com>
@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@tzielinski-habana tzielinski-habana marked this pull request as draft February 23, 2026 19:35
gezhen1024 added a commit to gezhen1024/vllm-gaudi that referenced this pull request Apr 9, 2026
- Add VLLM_WARMUP_TIMEOUT and VLLM_WARMUP_DEBUG environment variables
- Add detailed logging in warmup_graphs to track bucket processing
- Add timing information for each warmup bucket
- Add debug logging in _prepare_dummy_scenario and _execute_dummy_scenario
- Add diagnostic scripts for bucket analysis
- Add WARMUP_DEBUG_GUIDE.md with troubleshooting steps

This helps diagnose warmup hangs in large models like Qwen3.5-122B.

Signed-off-by: Gezhen <gezhen@company.com>