[Doc] Fix multimodal torch.compile troubleshooting to not use removed VLLM_TORCH_COMPILE_LEVEL #44378

Merged: DarkLight1337 merged 3 commits into vllm-project:main from DaoyuanLi2816:fix/doc-dead-torch-compile-level-env on Jun 7, 2026

@DaoyuanLi2816

Contributor

Purpose

The "Compilation Errors" troubleshooting section in docs/design/torch_compile_multimodal.md tells users to disable compilation like this:

VLLM_TORCH_COMPILE_LEVEL=0 vllm serve <model> --compilation-config='{"compile_mm_encoder":"false"}'

VLLM_TORCH_COMPILE_LEVEL no longer exists anywhere in the codebase — torch.compile control was moved to CompilationConfig / the -O optimization levels. The env var is silently ignored, so following this step does not actually disable compilation: the model still runs compiled, defeating the "verify the model works without compilation" instruction.
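
Because a removed variable produces no error, a stale VLLM_TORCH_COMPILE_LEVEL in a launch script can go unnoticed. A minimal sketch of a pre-launch guard follows. The function name check_removed_env is hypothetical and not part of vLLM, and it only inspects the current shell environment.

```shell
# Hypothetical helper (not part of vLLM): warn when the removed variable is
# still set, since it is no longer read and would be silently ignored.
check_removed_env() {
    # ${VAR+x} expands to "x" when VAR is set, even if it is set to "".
    if [ -n "${VLLM_TORCH_COMPILE_LEVEL+x}" ]; then
        echo "warning: VLLM_TORCH_COMPILE_LEVEL is set but has no effect" >&2
        return 1
    fi
    return 0
}
```

Calling check_removed_env before vllm serve lets a wrapper script stop early instead of running with compilation still enabled.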

Change

Use the documented way to turn off torch.compile + CUDA graphs:

vllm serve <model> --enforce-eager --compilation-config='{"compile_mm_encoder":"false"}'

Per docs/design/debug_vllm_compile.md, --enforce-eager (which sets enforce_eager=True) is documented as "Turn off torch.compile and CUDAGraphs". I kept the explicit compile_mm_encoder: false to preserve the original intent of also skipping compilation of the MM encoder.

One-line doc change.

Not a duplicate

Per AGENTS.md duplicate-work checks (on main @ 53b88d1d):

grep -rn VLLM_TORCH_COMPILE_LEVEL docs/ vllm/   # only the doc line above; no code definition
gh issue list --repo vllm-project/vllm --state all --search "VLLM_TORCH_COMPILE_LEVEL"  # nothing about this doc
gh pr list    --repo vllm-project/vllm --state all --search "VLLM_TORCH_COMPILE_LEVEL"  # none

The only open PR that touches torch_compile_multimodal.md is #30549 (WhisperEncoder), and it does not modify this troubleshooting section.

Test plan

# Confirm the env var is dead (no consumers anywhere):
grep -rn "VLLM_TORCH_COMPILE_LEVEL" . --include='*.py'   # (excluding the doc) -> no matches

pre-commit run --files docs/design/torch_compile_multimodal.md

typos, markdownlint-cli2, and the other applicable hooks pass on the changed file. (The update-dockerfile-graph hook fails with "Executable /bin/bash not found", a Windows-host limitation unrelated to this doc; it runs on Linux CI.)
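
The "no consumers" check in the test plan relies on grep's exit status: grep exits 0 when it finds a match and 1 when it finds none. A small sketch of turning that into a pass/fail result is shown here, run against a throwaway directory rather than the vLLM tree. The file name and contents are made up for illustration.

```shell
# Throwaway tree standing in for a source checkout (contents are illustrative).
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/pkg"
printf 'import os\nvalue = os.environ.get("SOME_LIVE_VAR")\n' > "$tmpdir/pkg/envs.py"

# grep exits 0 on a match and 1 on no match, so the branch taken tells us
# whether any Python file still references the variable.
if grep -rn --include='*.py' "VLLM_TORCH_COMPILE_LEVEL" "$tmpdir" > /dev/null; then
    status="still referenced"
else
    status="no consumers"
fi

rm -rf "$tmpdir"
echo "$status"
```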


AI-assisted (Claude Code); reviewed end-to-end by the submitter.

… VLLM_TORCH_COMPILE_LEVEL

The "Compilation Errors" troubleshooting step in the multimodal
torch.compile design doc tells users to disable compilation with:

    VLLM_TORCH_COMPILE_LEVEL=0 vllm serve <model> ...

`VLLM_TORCH_COMPILE_LEVEL` no longer exists anywhere in the codebase —
torch.compile control was moved to `CompilationConfig` / the `-O`
optimization levels, so this env var is silently ignored and the model
still runs with compilation enabled (defeating the troubleshooting step).

Use the documented way to disable torch.compile + CUDA graphs instead:
`--enforce-eager` (see docs/design/debug_vllm_compile.md, where
`--enforce-eager` maps to `enforce_eager=True` = "Turn off torch.compile
and CUDAGraphs").

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
@mergify

mergify Bot commented Jun 3, 2026

Contributor

Documentation preview: https://vllm--44378.org.readthedocs.build/en/44378/

mergify Bot added the documentation label (Improvements or additions to documentation) Jun 3, 2026
@DaoyuanLi2816

Contributor Author

Hi @hmellor — could you take a look when you have a moment? This is a small doc-correctness fix: the multimodal torch.compile troubleshooting section tells users to run VLLM_TORCH_COMPILE_LEVEL=0 ..., but that env var no longer exists, so the documented "disable compilation" step silently does nothing. The PR swaps it for --enforce-eager (the documented way to turn off torch.compile + CUDA graphs).

It's been open ~3 days with no reviewer auto-assigned — looks like docs/design/*.md isn't under a CODEOWNERS rule. Thanks!

@DaoyuanLi2816

Contributor Author

@DarkLight1337 — would you have a moment for this small doc fix? It's a 1-line swap: the multimodal torch.compile troubleshooting section tells users to run VLLM_TORCH_COMPILE_LEVEL=0 vllm serve ..., but that env var was removed, so the documented "disable compilation" step silently does nothing. The PR replaces it with --enforce-eager (the documented way).

Same shape as #44128 (which sfeng33 helped merge). CI is green except gates that need a ready label. Thanks!

@DarkLight1337

DarkLight1337 commented Jun 7, 2026

Member

I think we should use the direct replacement, --compilation-config.mode=0. It differs slightly from --enforce-eager in that it doesn't disable CUDA graphs.

Per review feedback from @DarkLight1337: mode=0 (CompilationMode.NONE) is
the faithful replacement for the removed VLLM_TORCH_COMPILE_LEVEL=0 — it
disables torch.compile while keeping CUDA graphs, whereas --enforce-eager
also disables CUDA graphs (a behavior change). Fold mode into the existing
--compilation-config JSON.

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
@DaoyuanLi2816

Contributor Author

Thanks @DarkLight1337, updated. Since the command already passed --compilation-config='{"compile_mm_encoder":"false"}', I folded mode into the same JSON to keep it a single argument:

vllm serve <model> --compilation-config='{"mode":0,"compile_mm_encoder":"false"}'

As you noted, mode=0 (CompilationMode.NONE) is the faithful replacement for the removed VLLM_TORCH_COMPILE_LEVEL=0 since it leaves CUDA graphs intact, unlike --enforce-eager. Happy to switch to the --compilation-config.mode=0 dotted form instead if you prefer that style.
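
The two ways of disabling compilation discussed in this thread differ only in whether CUDA graphs stay on. As a rough summary, the sketch below maps each goal to the flags quoted above. The function name disable_flags and the goal labels are hypothetical, and the helper only prints flags rather than launching vllm serve.

```shell
# Hypothetical helper: print the flags from this thread for a given goal.
disable_flags() {
    case "$1" in
        keep-cudagraphs)
            # mode 0 (CompilationMode.NONE): torch.compile off, CUDA graphs kept.
            cfg='{"mode":0,"compile_mm_encoder":"false"}'
            printf '%s\n' "--compilation-config='$cfg'"
            ;;
        no-cudagraphs)
            # --enforce-eager: torch.compile and CUDA graphs both off.
            cfg='{"compile_mm_encoder":"false"}'
            printf '%s\n' "--enforce-eager --compilation-config='$cfg'"
            ;;
        *)
            echo "unknown goal: $1" >&2
            return 1
            ;;
    esac
}

disable_flags keep-cudagraphs
```

The keep-cudagraphs branch corresponds to the merged doc change, and the no-cudagraphs branch to the first revision of this PR.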

DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jun 7, 2026
DarkLight1337 enabled auto-merge (squash) June 7, 2026 07:12
DarkLight1337 merged commit 15652a6 into vllm-project:main Jun 7, 2026
9 checks passed
DaoyuanLi2816 deleted the fix/doc-dead-torch-compile-level-env branch June 7, 2026 07:37
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
… VLLM_TORCH_COMPILE_LEVEL (vllm-project#44378)

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
ekagra-ranjan pushed a commit to ekagra-ranjan/vllm that referenced this pull request Jun 9, 2026
… VLLM_TORCH_COMPILE_LEVEL (vllm-project#44378)

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
waqahmed-amd-fi pushed a commit to waqahmed-amd-fi/vllm that referenced this pull request Jun 10, 2026
… VLLM_TORCH_COMPILE_LEVEL (vllm-project#44378)

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Signed-off-by: Waqar Ahmed <waqar.ahmed@amd.com>