
[Misc] Remove dead VLLM_RPC_TIMEOUT env var and fix profiling doc that references it #44128

Merged
sfeng33 merged 4 commits into vllm-project:main from DaoyuanLi2816:fix/remove-dead-vllm-rpc-timeout
Jun 3, 2026

Conversation

@DaoyuanLi2816

Purpose

VLLM_RPC_TIMEOUT is documented in vllm/envs.py ("Time in ms for the zmq client to wait for a response from the backend server for simple data operations", default 10000), but it has no consumers anywhere in the tree — it is a V0 leftover.

In V1, the engine-core client waits on utility RPCs without any timeout:

  • SyncMPClient.call_utility → future.result() (no timeout arg)
  • AsyncMPClient._call_utility_async → await future (no timeout)

(both in vllm/v1/engine/core_client.py). So there is no client-side RPC timeout for this env var to control.
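
To illustrate the difference between waiting with and without a timeout, here is a minimal sketch using only the standard library. It is not vLLM code: `slow_reply`, `call_without_timeout`, and `call_with_timeout` are hypothetical names, and a background thread stands in for the backend that eventually answers.

```python
import concurrent.futures
import threading
import time


def slow_reply(future, delay_s):
    """Stand-in for a backend that answers after delay_s seconds."""
    time.sleep(delay_s)
    future.set_result("flushed")


def call_without_timeout(delay_s):
    """Block until the reply arrives, however long that takes."""
    future = concurrent.futures.Future()
    threading.Thread(target=slow_reply, args=(future, delay_s), daemon=True).start()
    return future.result()  # no timeout argument: waits indefinitely


def call_with_timeout(delay_s, timeout_s):
    """Give up if the reply has not arrived within timeout_s seconds."""
    future = concurrent.futures.Future()
    threading.Thread(target=slow_reply, args=(future, delay_s), daemon=True).start()
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return "timed out"
```

With the first form there is nothing for a timeout setting to configure, which matches the description above of a variable that no code path consults.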

The problem this surfaces for users: the profiling guide actively tells people to set it —

docs/contributing/profiling.md
To stop the profiler ... Set the env variable VLLM_RPC_TIMEOUT to a big number before you start the server. export VLLM_RPC_TIMEOUT=1800000

Following that instruction has no effect, since nothing reads the variable. (In V1 the profiler-stop flush is awaited without a timeout anyway, so the original failure mode the tip was guarding against no longer exists.)

Changes

  • docs/contributing/profiling.md: replace the obsolete VLLM_RPC_TIMEOUT instruction with an accurate note — the engine client waits for the trace flush to complete without timing out.
  • vllm/envs.py: remove the orphaned VLLM_RPC_TIMEOUT entry (type stub + environment_variables lambda).

Net diff is +1 / -6.
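
As a rough sketch of why deleting a registry entry is self-contained, the snippet below models a lazy environment-variable lookup backed by a dict of lambdas. It is an illustration of the general pattern, not a copy of vllm/envs.py: `LazyEnvs` and `DEMO_KEEP_ALIVE_MS` are hypothetical names.

```python
import os
from typing import Any, Callable

# Hypothetical registry: each entry maps a variable name to a lambda that
# reads and parses the environment at access time.
environment_variables: dict[str, Callable[[], Any]] = {
    "DEMO_KEEP_ALIVE_MS": lambda: int(os.getenv("DEMO_KEEP_ALIVE_MS", "10000")),
}


class LazyEnvs:
    """Resolve attribute access through the registry above."""

    def __getattr__(self, name: str) -> Any:
        if name in environment_variables:
            return environment_variables[name]()
        raise AttributeError(f"no environment variable {name!r} is registered")


envs = LazyEnvs()
```

In this model, removing an entry from the dict makes `envs.<NAME>` raise `AttributeError`. That only matters if some caller still reads the attribute, which is why the consumer check in the test plan comes before the removal.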

Not a duplicate

Per AGENTS.md duplicate-work checks:

gh issue list --repo vllm-project/vllm --state all --search "VLLM_RPC_TIMEOUT"   # no issue about it being dead
gh pr list    --repo vllm-project/vllm --state all --search "VLLM_RPC_TIMEOUT"   # no PR touching it
gh pr list    --repo vllm-project/vllm --state open --search "profiling.md in:title"  # none

No existing issue or PR addresses the dead env var or the obsolete profiling instruction.

Test plan

Verified there are zero consumers before removing (static and dynamic access), on main @ 6bdabbad:

# whole tree, excluding the definition file — no matches:
grep -rn "VLLM_RPC_TIMEOUT" . --include='*.py' --include='*.sh' \
  --include='Dockerfile*' --include='*.cmake' --include='CMakeLists.txt' \
  --include='*.yaml' | grep -v vllm/envs.py
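
The same check can be sketched in Python. The helper below, `find_references`, is a hypothetical name and not part of the PR. It walks a tree, skips the definition file, and reports each file and line that mentions the variable. Its default suffix list is an assumption chosen for illustration; it also covers `.json` and `.md`, which the `--include` patterns above do not.

```python
from pathlib import Path


def find_references(root, needle, exclude=(),
                    suffixes=(".py", ".sh", ".yaml", ".json", ".md")):
    """Return (relative_path, line_number) pairs for lines containing needle.

    exclude holds paths relative to root that should be skipped,
    for example the file that defines the variable.
    """
    root = Path(root)
    excluded = {Path(p) for p in exclude}
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root)
        if rel in excluded:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((rel.as_posix(), lineno))
    return hits
```

A call such as `find_references(".", "VLLM_RPC_TIMEOUT", exclude=["vllm/envs.py"])` returns an empty list when nothing outside the definition file mentions the name.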

pre-commit on the changed files (Windows host, .venv Python 3.12.13):

pre-commit run --files vllm/envs.py docs/contributing/profiling.md

ruff check, ruff format, typos, markdownlint-cli2, mypy, Check SPDX headers, Validate configuration has default values and that each field has a docstring, and the remaining hooks pass on the changed files. (update-dockerfile-graph errors with Executable /bin/bash not found — a Windows-host limitation unrelated to these files; it runs on Linux CI.)

Risk

Removing the env var is safe: nothing reads it, so no runtime path changes. A user who still exports VLLM_RPC_TIMEOUT in their shell is simply left with an unused variable (same as today, since it was already ignored). If maintainers would instead prefer to restore a client-side RPC timeout in V1 and wire this variable back in, happy to take that direction instead.


AI-assisted (Claude Code); reviewed end-to-end by the submitter.

`VLLM_RPC_TIMEOUT` (default 10000ms, "Time in ms for the zmq client to
wait for a response from the backend server for simple data operations")
is a V0 leftover with no consumers anywhere in the tree. In V1, the
engine-core client waits on utility RPCs via `Future.result()` (sync) and
`await future` (async) in `vllm/v1/engine/core_client.py` with no timeout
argument, so there is no client-side RPC timeout for the env var to
control.

The profiling guide still instructs users to set it:

> Set the env variable VLLM_RPC_TIMEOUT to a big number before you start
> the server. `export VLLM_RPC_TIMEOUT=1800000`

Following that advice has no effect. Remove the obsolete instruction and
replace it with an accurate note (the engine client waits for the trace
flush to complete without timing out), and drop the dead env var from
`envs.py`.

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
@mergify

mergify Bot commented May 31, 2026


Documentation preview: https://vllm--44128.org.readthedocs.build/en/44128/

@mergify mergify Bot added the documentation label May 31, 2026

@njhill njhill left a comment


@njhill njhill added the ready label Jun 1, 2026
@DaoyuanLi2816


@njhill friendly nudge — this has been ready + MERGEABLE/CLEAN for ~24h. Happy to address anything if needed.

@DaoyuanLi2816


Looping in @sfeng33 — this is the same dead-code-cleanup shape as #44279.

Approved + ready since 2026-06-01T02:30Z (~40h), CI green, MERGEABLE/CLEAN. Happy to address any further changes.

@sfeng33 sfeng33 left a comment


It seems there are still benchmark configs that set VLLM_RPC_TIMEOUT, e.g. serving-tests-cpu.json. It would be great to clean those up as well.

Per @sfeng33's review, these CPU benchmark configs set
VLLM_RPC_TIMEOUT=100000 but the env var has no consumers, so the
setting was already a no-op. Drop it everywhere it appears.

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
@DaoyuanLi2816


Good catch — thanks @sfeng33! Pushed 36f99a3 removing VLLM_RPC_TIMEOUT from all 9 CPU benchmark configs in .buildkite/performance-benchmarks/tests/ (serving / latency / throughput, x86 and arm64). Each is just a single-line deletion since the value was already a no-op.
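
For a bulk cleanup across many JSON configs, a recursive key removal is one option. The sketch below is an assumption-laden illustration rather than what the commit did (the commit deleted single lines by hand): `drop_key` is a hypothetical helper, and `example_config` is an assumed layout, not copied from the repository.

```python
import json


def drop_key(node, key):
    """Return a copy of node with every dict entry named key removed."""
    if isinstance(node, dict):
        return {k: drop_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [drop_key(item, key) for item in node]
    return node


# Hypothetical config shape, for illustration only.
example_config = [
    {
        "test_name": "demo_serving_test",
        "server_environment_variables": {
            "VLLM_RPC_TIMEOUT": 100000,
            "DEMO_OTHER_SETTING": 1,
        },
    }
]

cleaned = drop_key(example_config, "VLLM_RPC_TIMEOUT")
print(json.dumps(cleaned, indent=2))
```

Re-serializing with `json.dumps` can change whitespace and key layout, so hand-editing single lines keeps the diff smaller when only a few files are involved.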

@sfeng33 sfeng33 left a comment


Thanks!

@sfeng33 sfeng33 enabled auto-merge (squash) June 2, 2026 20:42
@mergify mergify Bot added the ci/build, performance, and cpu labels Jun 2, 2026
@sfeng33 sfeng33 merged commit bd98e97 into vllm-project:main Jun 3, 2026
58 of 61 checks passed
vrdn-23 added a commit to vrdn-23/vllm that referenced this pull request Jun 5, 2026
Resolves the recurring envs.py merge conflict per
docs/superpowers/specs/2026-05-14-envs-merge-conflict-resolution-design.md.

The legacy `if TYPE_CHECKING:` block and `environment_variables: dict[str,
Callable]` runtime mapping were dropped on the branch in favor of pydantic
`*Settings(BaseSettings)` subclasses. Every main-side edit to either
location therefore conflicts mechanically; structural resolution is
`--ours` for vllm/envs.py, then port the semantic delta as new `Field(...)`
declarations on the appropriate sub-model.

Main-side commits since merge base afcb580, with port disposition:

- c73b0d0 (vllm-project#44669) — adds VLLM_RAY_DP_PLACEMENT_NODE_IPS (str=""). Ported
  to DistributedSettings.ray_dp_placement_node_ips.
- 165b786 (vllm-project#40426) — adds VLLM_ROCM_USE_AITER_LINEAR_HIPBMM (bool=False).
  Ported to RocmSettings.rocm_use_aiter_linear_hipbmm. Native pydantic bool
  parsing replaces the `.lower() in ("true","1")` lambda.
- 38fd240 (vllm-project#41980) — adds VLLM_DISTRIBUTED_USE_SPLIT_GROUP (bool=False).
  Ported to DistributedSettings.distributed_use_split_group. Native
  pydantic bool parsing replaces the `bool(int(...))` lambda.
- a618356 (vllm-project#43447) — adds VLLM_PREFIX_CACHE_RETENTION_INTERVAL
  (int|None=None, tri-state). Ported to
  ServerSettings.prefix_cache_retention_interval; pydantic's
  unset-vs-explicit-zero handling matches the original
  `"X" in os.environ` guard.
- bd98e97 (vllm-project#44128) — removes dead VLLM_RPC_TIMEOUT. Mirrored on the
  branch by deleting ServerSettings.rpc_timeout.

Verification: vllm.envs imports cleanly; all four new vars read defaults
and parse env-set values (incl. tri-state INTERVAL=0); VLLM_RPC_TIMEOUT
correctly raises AttributeError; pre-commit passes ruff/format/mypy.

Signed-off-by: Vinay Damodaran <vrdn@hey.com>

Labels

ci/build, cpu, documentation, performance, ready
