[Cleanup] Remove dead runtime.defaults config parameters #2343
Merged
hsliuustc0106 merged 1 commit into vllm-project:main on Apr 17, 2026
lishunyang12
approved these changes
Apr 2, 2026
Collaborator
lishunyang12
left a comment
LGTM, clean removal. One missed file below.
fd83561 to
86001d4
Collaborator
@hsliuustc0106 PTAL
Collaborator
The runtime.defaults block in stage config YAMLs contained two parameters, max_inflight and window_size, neither of which is read by any Python code.
max_inflight was introduced in the MRS design (Nov 2025, PR vllm-project#51) to limit concurrent in-flight requests per stage. When the architecture was rewritten to the current orchestrator (Dec 2025, PR vllm-project#391), all Python references were removed but the YAML configs were left untouched. Concurrency is already managed by max_num_seqs (scheduler-level) for non-AR stages and KV cache capacity (memory-level) for AR stages.
window_size (under both runtime.defaults and runtime.edges) is also not read by any Python code: the edge parsing in connector initialization only uses the 'from' and 'to' fields.
Remove both parameters, the now-empty defaults block, and the dead window_size field from runtime.edges entries. The shm_threshold_bytes and other connector-level config under runtime.connectors.extra remain; those are actively used by SharedMemoryConnector.
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
86001d4 to
f242669
Contributor
Author
lishunyang12
approved these changes
Apr 17, 2026
lvliang-intel
pushed a commit
to lvliang-intel/vllm-omni
that referenced
this pull request
Apr 20, 2026
…t#2343) Signed-off-by: Nick Cao <ncao@redhat.com> Co-authored-by: Claude <noreply@anthropic.com>
Purpose
Remove dead runtime.defaults config parameters (max_inflight and window_size) from all stage config YAMLs, benchmark configs, test configs, platform configs, and documentation.
Neither parameter is read by any Python code:
max_inflight was introduced in PR #51 (Nov 2025, MRS design) to limit concurrent in-flight requests per stage. When the architecture was rewritten to the current orchestrator in PR #391 (Dec 2025), all Python references were removed but the YAML configs were left untouched. Concurrency is already managed by max_num_seqs (scheduler-level) for non-AR stages and KV cache capacity (memory-level) for AR stages.
window_size (under both runtime.defaults and runtime.edges[]) is also not read by any Python code: the edge parsing in connector initialization (initialization.py:225-231) only uses the 'from' and 'to' fields.
Some benchmark configs set max_inflight to 4/8/16 (e.g. PR #1852, PR #1913) expecting it to control concurrency, but it has no effect.
The shm_threshold_bytes and other connector-level config under runtime.connectors.extra remain untouched; those are actively used by SharedMemoryConnector.
Test Plan
Config-only change (YAML + docs). No Python code modified.
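One way to confirm that no config still carries the removed keys is to walk each parsed config and report leftover occurrences. The sketch below is illustrative only: find_dead_keys and the sample dict are hypothetical, the nesting under connectors and all values are assumptions, and it operates on an already-parsed mapping rather than on the repository's real files.

```python
from typing import Any, Iterator

# Keys this PR removes; per the description, neither is read by Python code.
DEAD_KEYS = {"max_inflight", "window_size"}


def find_dead_keys(node: Any, path: str = "") -> Iterator[str]:
    """Yield dotted paths of dead keys found anywhere in a parsed config."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if key in DEAD_KEYS:
                yield child
            yield from find_dead_keys(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_dead_keys(item, f"{path}[{index}]")


# Hypothetical pre-cleanup config; structure and values are assumptions.
sample_config = {
    "runtime": {
        "defaults": {"max_inflight": 8, "window_size": -1},
        "edges": [{"from": 0, "to": 1, "window_size": -1}],
        "connectors": {"shm": {"extra": {"shm_threshold_bytes": 65536}}},
    }
}

leftovers = sorted(find_dead_keys(sample_config))
```

For the sample above, leftovers lists the two entries under runtime.defaults and the one under runtime.edges[0], while shm_threshold_bytes is not flagged. An empty list for every config would indicate the cleanup is complete.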
Test Result
No Python behavior change — parameters were never read by any code.
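To illustrate why dropping the field cannot change behavior, here is a minimal sketch of an edge parser that, like the connector initialization described above, reads only the from and to fields. parse_edges is a hypothetical stand-in, not the code in initialization.py, and the edge values are assumptions; the point is only that an extra key such as window_size never influences the result.

```python
from typing import Any


def parse_edges(edges: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Return (from, to) pairs; every other key on an edge is ignored."""
    return [(str(edge["from"]), str(edge["to"])) for edge in edges]


# Hypothetical edge lists before and after the cleanup.
edges_before = [{"from": 0, "to": 1, "window_size": -1}]
edges_after = [{"from": 0, "to": 1}]

# True when the parsed edges are identical with and without window_size.
same_result = parse_edges(edges_before) == parse_edges(edges_after)
```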