[Doc] Fix multimodal torch.compile troubleshooting to not use removed VLLM_TORCH_COMPILE_LEVEL by DaoyuanLi2816 · Pull Request #44378 · vllm-project/vllm

DaoyuanLi2816 · 2026-06-03T03:48:59Z

Purpose

The "Compilation Errors" troubleshooting section in docs/design/torch_compile_multimodal.md tells users to disable compilation like this:

VLLM_TORCH_COMPILE_LEVEL=0 vllm serve <model> --compilation-config='{"compile_mm_encoder":"false"}'

VLLM_TORCH_COMPILE_LEVEL no longer exists anywhere in the codebase — torch.compile control was moved to CompilationConfig / the -O optimization levels. The env var is silently ignored, so following this step does not actually disable compilation: the model still runs compiled, defeating the "verify the model works without compilation" instruction.

Change

Use the documented way to turn off torch.compile + CUDA graphs:

vllm serve <model> --enforce-eager --compilation-config='{"compile_mm_encoder":"false"}'

Per docs/design/debug_vllm_compile.md, --enforce-eager (which sets enforce_eager=True) is the option described as "Turn off torch.compile and CUDAGraphs". I kept the explicit compile_mm_encoder: false to preserve the original intent of also skipping the MM encoder.

One-line doc change.

Not a duplicate

Per AGENTS.md duplicate-work checks (on main @ 53b88d1d):

grep -rn VLLM_TORCH_COMPILE_LEVEL docs/ vllm/   # only the doc line above; no code definition
gh issue list --repo vllm-project/vllm --state all --search "VLLM_TORCH_COMPILE_LEVEL"  # nothing about this doc
gh pr list    --repo vllm-project/vllm --state all --search "VLLM_TORCH_COMPILE_LEVEL"  # none

The only open PR that touches torch_compile_multimodal.md is #30549 (WhisperEncoder), and it does not modify this troubleshooting section.

Test plan

# Confirm the env var is dead (no consumers anywhere):
grep -rn "VLLM_TORCH_COMPILE_LEVEL" . --include='*.py'   # (excluding the doc) -> no matches

pre-commit run --files docs/design/torch_compile_multimodal.md

typos, markdownlint-cli2, and the other applicable hooks pass on the changed file. (update-dockerfile-graph errors with Executable /bin/bash not found — a Windows-host limitation unrelated to this doc; it runs on Linux CI.)
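
For readers without grep (for example on a Windows host), the "no consumers" check in the test plan can be approximated in Python. This is a minimal sketch, not part of the PR: the `find_references` helper is hypothetical, and the throwaway directory tree below is made up to stand in for a real checkout.

```python
import tempfile
from pathlib import Path


def find_references(root, needle, suffixes=(".py",)):
    """Return (relative_path, line_number, line) for each line containing needle.

    Hypothetical helper: a rough stand-in for `grep -rn NEEDLE . --include='*.py'`.
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path.relative_to(root)), number, line.strip()))
    return hits


# Tiny made-up tree: one doc that mentions the variable, one module that does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "pkg").mkdir()
    (root / "docs" / "guide.md").write_text(
        "VLLM_TORCH_COMPILE_LEVEL=0 vllm serve <model>\n", encoding="utf-8"
    )
    (root / "pkg" / "config.py").write_text("mode = 0\n", encoding="utf-8")

    py_hits = find_references(root, "VLLM_TORCH_COMPILE_LEVEL")
    doc_hits = find_references(root, "VLLM_TORCH_COMPILE_LEVEL", suffixes=(".md",))

print("Python consumers:", py_hits)
print("Doc mentions:", doc_hits)
```

Pointing `find_references` at a real checkout instead of the temporary tree should reproduce the situation described above: mentions in the docs, none in Python sources.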

AI-assisted (Claude Code); reviewed end-to-end by the submitter.

… VLLM_TORCH_COMPILE_LEVEL

The "Compilation Errors" troubleshooting step in the multimodal torch.compile design doc tells users to disable compilation with:

VLLM_TORCH_COMPILE_LEVEL=0 vllm serve <model> ...

`VLLM_TORCH_COMPILE_LEVEL` no longer exists anywhere in the codebase: torch.compile control was moved to `CompilationConfig` / the `-O` optimization levels, so this env var is silently ignored and the model still runs with compilation enabled (defeating the troubleshooting step).

Use the documented way to disable torch.compile + CUDA graphs instead: `--enforce-eager` (see docs/design/debug_vllm_compile.md, where `--enforce-eager` maps to `enforce_eager=True` = "Turn off torch.compile and CUDAGraphs").

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>

mergify · 2026-06-03T03:50:02Z

Documentation preview: https://vllm--44378.org.readthedocs.build/en/44378/

DaoyuanLi2816 · 2026-06-05T16:25:25Z

Hi @hmellor — could you take a look when you have a moment? This is a small doc-correctness fix: the multimodal torch.compile troubleshooting section tells users to run VLLM_TORCH_COMPILE_LEVEL=0 ..., but that env var no longer exists, so the documented "disable compilation" step silently does nothing. The PR swaps it for --enforce-eager (the documented way to turn off torch.compile + CUDA graphs).

It's been open ~3 days with no reviewer auto-assigned — looks like docs/design/*.md isn't under a CODEOWNERS rule. Thanks!

DaoyuanLi2816 · 2026-06-06T20:54:02Z

@DarkLight1337 — would you have a moment for this small doc fix? It's a 1-line swap: the multimodal torch.compile troubleshooting section tells users to run VLLM_TORCH_COMPILE_LEVEL=0 vllm serve ..., but that env var was removed, so the documented "disable compilation" step silently does nothing. The PR replaces it with --enforce-eager (the documented way).

Same shape as #44128 (which sfeng33 helped merge). CI is green except gates that need a ready label. Thanks!

DarkLight1337 · 2026-06-07T03:44:46Z

I think we should use the direct replacement, --compilation-config.mode=0. It differs slightly from --enforce-eager in that it doesn't disable CUDA graphs.

@DarkLight1337

Per review feedback from @DarkLight1337: mode=0 (CompilationMode.NONE) is the faithful replacement for the removed VLLM_TORCH_COMPILE_LEVEL=0. It disables torch.compile while keeping CUDA graphs, whereas --enforce-eager also disables CUDA graphs (a behavior change). Fold mode into the existing --compilation-config JSON.

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>

DaoyuanLi2816 · 2026-06-07T06:21:02Z

Thanks @DarkLight1337, updated. Since the command already passed --compilation-config='{"compile_mm_encoder":"false"}', I folded mode into the same JSON to keep it a single argument:

vllm serve <model> --compilation-config='{"mode":0,"compile_mm_encoder":"false"}'

As you noted, mode=0 (CompilationMode.NONE) is the faithful replacement for the removed VLLM_TORCH_COMPILE_LEVEL=0 since it leaves CUDA graphs intact, unlike --enforce-eager. Happy to switch to the --compilation-config.mode=0 dotted form instead if you prefer that style.
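
When the server is launched from a script, the single JSON argument above can be assembled programmatically instead of hand-quoted. The following is a minimal sketch under stated assumptions: `compilation_config_arg` is a hypothetical helper (not a vLLM API), and the string value "false" simply mirrors the command in this thread rather than asserting the type vLLM expects.

```python
import json
import shlex


def compilation_config_arg(mode, compile_mm_encoder="false"):
    """Build the --compilation-config argument in the JSON form used in this thread.

    Hypothetical helper. The default "false" string copies the documented command.
    """
    payload = {"mode": mode, "compile_mm_encoder": compile_mm_encoder}
    # Compact separators avoid spaces inside the JSON, matching the documented form.
    return "--compilation-config=" + json.dumps(payload, separators=(",", ":"))


arg = compilation_config_arg(0)

# "<model>" is the same placeholder as in the thread; substitute a real model name.
command = ["vllm", "serve", "<model>", arg]

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

Passing the list form of `command` to `subprocess.run` would sidestep shell quoting entirely; the printed line is only for copy and paste.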

… VLLM_TORCH_COMPILE_LEVEL (vllm-project#44378) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>

… VLLM_TORCH_COMPILE_LEVEL (vllm-project#44378) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

… VLLM_TORCH_COMPILE_LEVEL (vllm-project#44378) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Signed-off-by: Waqar Ahmed <waqar.ahmed@amd.com>

mergify Bot added the "documentation" label (Improvements or additions to documentation) Jun 3, 2026

DarkLight1337 approved these changes Jun 7, 2026

DarkLight1337 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Jun 7, 2026

Merge branch 'main' into fix/doc-dead-torch-compile-level-env

edf3ef9

DarkLight1337 enabled auto-merge (squash) June 7, 2026 07:12

DarkLight1337 merged commit 15652a6 into vllm-project:main Jun 7, 2026
9 checks passed

DaoyuanLi2816 deleted the fix/doc-dead-torch-compile-level-env branch June 7, 2026 07:37

DarkLight1337 merged 3 commits into vllm-project:main from DaoyuanLi2816:fix/doc-dead-torch-compile-level-env
