[Misc] Remove dead VLLM_RPC_TIMEOUT env var and fix profiling doc that references it #44128
Merged
sfeng33 merged 4 commits on Jun 3, 2026
`VLLM_RPC_TIMEOUT` (default 10000 ms, "Time in ms for the zmq client to wait for a response from the backend server for simple data operations") is a V0 leftover with no consumers anywhere in the tree. In V1, the engine-core client waits on utility RPCs via `Future.result()` (sync) and `await future` (async) in `vllm/v1/engine/core_client.py` with no timeout argument, so there is no client-side RPC timeout for the env var to control.

The profiling guide still instructs users to set it:

> Set the env variable VLLM_RPC_TIMEOUT to a big number before you start
> the server. `export VLLM_RPC_TIMEOUT=1800000`

Following that advice has no effect. Remove the obsolete instruction and replace it with an accurate note (the engine client waits for the trace flush to complete without timing out), and drop the dead env var from `envs.py`.

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
Contributor
Documentation preview: https://vllm--44128.org.readthedocs.build/en/44128/
Contributor
Author
@njhill friendly nudge — this has been |
Contributor
Author
sfeng33
reviewed
Jun 2, 2026
sfeng33
left a comment
Collaborator
It seems there are still benchmark configs that set `VLLM_RPC_TIMEOUT`, e.g. `serving-tests-cpu.json`; it would be great to clean those up as well.
Per @sfeng33's review, these CPU benchmark configs set `VLLM_RPC_TIMEOUT=100000`, but the env var has no consumers, so the setting was already a no-op. Drop it everywhere it appears.

Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
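The cleanup described in this commit could be scripted roughly as follows. This is a minimal sketch, not the change that was actually pushed: the config layout, the `server_environment_variables` key, the test name, and `OTHER_SETTING` are all assumptions made for illustration, since the real structure of `serving-tests-cpu.json` is not shown here.

```python
import json
from typing import Any

DEAD_KEY = "VLLM_RPC_TIMEOUT"


def strip_key(node: Any, key: str = DEAD_KEY) -> Any:
    """Return a copy of a parsed JSON document with every `key` entry removed."""
    if isinstance(node, dict):
        # Drop the dead key at this level and recurse into the remaining values.
        return {k: strip_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_key(item, key) for item in node]
    return node  # scalars are returned unchanged


# Hypothetical config shape, for illustration only.
sample = json.loads("""
[
  {
    "test_name": "example_cpu_serving_test",
    "server_environment_variables": {
      "VLLM_RPC_TIMEOUT": 100000,
      "OTHER_SETTING": 1
    }
  }
]
""")

cleaned = strip_key(sample)
print(json.dumps(cleaned, indent=2))
```

Because the function walks the whole document, it removes the key wherever it appears, regardless of which section of a config happens to carry it.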
Contributor
Author
Good catch — thanks @sfeng33! Pushed |
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request on Jun 4, 2026
…t references it (vllm-project#44128) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com> Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
andakai pushed a commit to andakai/vllm that referenced this pull request on Jun 4, 2026
…t references it (vllm-project#44128) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
JisoLya pushed a commit to JisoLya/vllm that referenced this pull request on Jun 5, 2026
…t references it (vllm-project#44128) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com> Signed-off-by: JisoLya <523420504@qq.com>
vrdn-23 added a commit to vrdn-23/vllm that referenced this pull request on Jun 5, 2026
Resolves the recurring envs.py merge conflict per docs/superpowers/specs/2026-05-14-envs-merge-conflict-resolution-design.md.

The legacy `if TYPE_CHECKING:` block and `environment_variables: dict[str, Callable]` runtime mapping were dropped on the branch in favor of pydantic `*Settings(BaseSettings)` subclasses. Every main-side edit to either location therefore conflicts mechanically; structural resolution is `--ours` for vllm/envs.py, then port the semantic delta as new `Field(...)` declarations on the appropriate sub-model.

Main-side commits since merge base afcb580, with port disposition:

- c73b0d0 (vllm-project#44669): adds VLLM_RAY_DP_PLACEMENT_NODE_IPS (str=""). Ported to DistributedSettings.ray_dp_placement_node_ips.
- 165b786 (vllm-project#40426): adds VLLM_ROCM_USE_AITER_LINEAR_HIPBMM (bool=False). Ported to RocmSettings.rocm_use_aiter_linear_hipbmm. Native pydantic bool parsing replaces the `.lower() in ("true","1")` lambda.
- 38fd240 (vllm-project#41980): adds VLLM_DISTRIBUTED_USE_SPLIT_GROUP (bool=False). Ported to DistributedSettings.distributed_use_split_group. Native pydantic bool parsing replaces the `bool(int(...))` lambda.
- a618356 (vllm-project#43447): adds VLLM_PREFIX_CACHE_RETENTION_INTERVAL (int|None=None, tri-state). Ported to ServerSettings.prefix_cache_retention_interval; pydantic's unset-vs-explicit-zero handling matches the original `"X" in os.environ` guard.
- bd98e97 (vllm-project#44128): removes dead VLLM_RPC_TIMEOUT. Mirrored on the branch by deleting ServerSettings.rpc_timeout.

Verification: vllm.envs imports cleanly; all four new vars read defaults and parse env-set values (incl. tri-state INTERVAL=0); VLLM_RPC_TIMEOUT correctly raises AttributeError; pre-commit passes ruff/format/mypy.

Signed-off-by: Vinay Damodaran <vrdn@hey.com>
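For readers unfamiliar with the legacy parsing idioms this commit message refers to, the sketch below imitates them with the standard library only. It is a simplified illustration, not the contents of `vllm/envs.py` or of the pydantic branch: the `Envs` class, the default strings passed to `os.getenv`, and the error message are assumptions. It shows the `.lower() in ("true", "1")` and `bool(int(...))` boolean styles, the `"X" in os.environ` tri-state guard, and why reading a removed variable raises `AttributeError`.

```python
import os
from typing import Any, Callable

# Hypothetical, simplified table of lazily evaluated env-var readers.
environment_variables: dict[str, Callable[[], Any]] = {
    # Boolean parsed by string comparison.
    "VLLM_ROCM_USE_AITER_LINEAR_HIPBMM": lambda: os.getenv(
        "VLLM_ROCM_USE_AITER_LINEAR_HIPBMM", "False"
    ).lower() in ("true", "1"),
    # Boolean parsed through int().
    "VLLM_DISTRIBUTED_USE_SPLIT_GROUP": lambda: bool(
        int(os.getenv("VLLM_DISTRIBUTED_USE_SPLIT_GROUP", "0"))
    ),
    # Tri-state: unset -> None, otherwise an int (an explicit 0 is preserved).
    "VLLM_PREFIX_CACHE_RETENTION_INTERVAL": lambda: (
        int(os.environ["VLLM_PREFIX_CACHE_RETENTION_INTERVAL"])
        if "VLLM_PREFIX_CACHE_RETENTION_INTERVAL" in os.environ
        else None
    ),
}


class Envs:
    """Resolve attributes through the table above, evaluating on each access."""

    def __getattr__(self, name: str) -> Any:
        if name in environment_variables:
            return environment_variables[name]()
        # A variable that was removed from the table is simply unknown.
        raise AttributeError(f"no environment variable setting named {name!r}")


envs = Envs()
```

With this shape, deleting an entry from the table is enough to make any leftover attribute access fail loudly, which is the behavior the verification note describes for `VLLM_RPC_TIMEOUT`.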
This was referenced Jun 6, 2026
knight0528 pushed a commit to knight0528/vllm that referenced this pull request on Jun 8, 2026
…t references it (vllm-project#44128) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
waqahmed-amd-fi pushed a commit to waqahmed-amd-fi/vllm that referenced this pull request on Jun 10, 2026
…t references it (vllm-project#44128) Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com> Signed-off-by: Waqar Ahmed <waqar.ahmed@amd.com>
Purpose
`VLLM_RPC_TIMEOUT` is documented in `vllm/envs.py` ("Time in ms for the zmq client to wait for a response from the backend server for simple data operations", default `10000`), but it has no consumers anywhere in the tree; it is a V0 leftover.

In V1, the engine-core client waits on utility RPCs without any timeout:

- `SyncMPClient.call_utility` → `future.result()` (no timeout arg)
- `AsyncMPClient._call_utility_async` → `await future` (no timeout)

(both in `vllm/v1/engine/core_client.py`). So there is no client-side RPC timeout for this env var to control.

The problem this surfaces for users: the profiling guide actively tells people to set it to a big number before starting the server (`export VLLM_RPC_TIMEOUT=1800000`).

Following that instruction has no effect, since nothing reads the variable. (In V1 the profiler-stop flush is awaited without a timeout anyway, so the original failure mode the tip was guarding against no longer exists.)
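The waiting behavior described above can be illustrated with a small standalone sketch. It does not use vLLM code: `slow_backend`, `call_utility_blocking`, and `timed_out` are hypothetical names, and a background thread stands in for the engine core. The point is only that `Future.result()` with no `timeout` argument blocks until the result arrives, so an exported timeout variable that nothing reads cannot change the outcome.

```python
import os
import threading
import time
from concurrent.futures import Future
from concurrent.futures import TimeoutError as FutureTimeoutError

# Exported but never read below, mirroring the dead variable (value in ms).
os.environ["VLLM_RPC_TIMEOUT"] = "50"


def slow_backend(fut: Future, delay_s: float) -> None:
    """Stand-in for the backend finishing a utility call, e.g. a trace flush."""
    time.sleep(delay_s)
    fut.set_result("flushed")


def call_utility_blocking(delay_s: float) -> str:
    """Wait with no timeout argument: blocks for as long as the backend needs."""
    fut: Future = Future()
    threading.Thread(target=slow_backend, args=(fut, delay_s), daemon=True).start()
    return fut.result()


def timed_out(delay_s: float, timeout_s: float) -> bool:
    """Contrast case: a client-side timeout only exists if it is passed explicitly."""
    fut: Future = Future()
    threading.Thread(target=slow_backend, args=(fut, delay_s), daemon=True).start()
    try:
        fut.result(timeout=timeout_s)
    except FutureTimeoutError:
        return True
    return False
```

Calling `call_utility_blocking(0.2)` returns normally even though the wait exceeds the 50 ms value in the environment, whereas `timed_out(0.5, 0.05)` reports a timeout because one was explicitly supplied. The async path (`await future`) behaves analogously unless it is wrapped in something like `asyncio.wait_for`.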
Changes
- `docs/contributing/profiling.md`: replace the obsolete `VLLM_RPC_TIMEOUT` instruction with an accurate note: the engine client waits for the trace flush to complete without timing out.
- `vllm/envs.py`: remove the orphaned `VLLM_RPC_TIMEOUT` entry (type stub + `environment_variables` lambda).

Net diff is `+1 / -6`.

Not a duplicate
Per `AGENTS.md` duplicate-work checks: no existing issue or PR addresses the dead env var or the obsolete profiling instruction.
Test plan
Verified there are zero consumers before removing (static and dynamic access), on `main@6bdabbad`.

pre-commit on the changed files (Windows host, `.venv` Python 3.12.13): `ruff check`, `ruff format`, `typos`, `markdownlint-cli2`, `mypy`, `Check SPDX headers`, `Validate configuration has default values and that each field has a docstring`, and the remaining hooks pass on the changed files. (`update-dockerfile-graph` errors with `Executable /bin/bash not found`, a Windows-host limitation unrelated to these files; it runs on Linux CI.)

Risk
Removing the env var is safe: nothing reads it, so no runtime path changes. A user who still exports `VLLM_RPC_TIMEOUT` in their shell is simply left with an unused variable (same as today, since it was already ignored). If maintainers would instead prefer to restore a client-side RPC timeout in V1 and wire this variable back in, happy to take that direction instead.

AI-assisted (Claude Code); reviewed end-to-end by the submitter.
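As a rough illustration of the zero-consumer check mentioned in the test plan, the sketch below scans a source tree for any textual reference to a name. It is not the command the submitter ran; `find_references`, the suffix list, and the demo files are assumptions. A plain substring search catches both attribute access and string-based lookups such as `getattr(envs, "VLLM_RPC_TIMEOUT")`, but it would miss names assembled at runtime.

```python
import tempfile
from pathlib import Path

TEXT_SUFFIXES = (".py", ".md", ".json", ".sh", ".yaml")


def find_references(root: Path, name: str) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs for every line under `root` mentioning `name`."""
    hits: list[tuple[Path, int]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or non-UTF-8 files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append((path, lineno))
    return hits


# Demo on a throwaway tree: one definition, no consumers.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "envs.py").write_text("VLLM_RPC_TIMEOUT: int = 10000\n", encoding="utf-8")
    (root / "client.py").write_text("result = future.result()\n", encoding="utf-8")
    demo_hits = [(p.name, n) for p, n in find_references(root, "VLLM_RPC_TIMEOUT")]

print(demo_hits)
```

If the only hits are the definition itself and documentation that mentions it, the variable has no readers and can be removed without changing any runtime path.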