
[Cleanup] Remove dead runtime.defaults config parameters #2343

Merged
hsliuustc0106 merged 1 commit into vllm-project:main from NickCao:cleanup/remove-dead-max-inflight
Apr 17, 2026



@NickCao (Contributor) commented Mar 30, 2026

Purpose

Remove dead runtime.defaults config parameters (max_inflight and window_size) from all stage config YAMLs, benchmark configs, test configs, platform configs, and documentation.
Neither parameter is read by any Python code:

  • max_inflight was introduced in PR #51 (Nov 2025, MRS design) to limit concurrent in-flight requests per stage. When the architecture was rewritten to the current orchestrator in PR #391 (Dec 2025), all Python references were removed but the YAML configs were left untouched. Concurrency is already managed by max_num_seqs (scheduler-level) for non-AR stages and KV cache capacity (memory-level) for AR stages.
  • window_size (under both runtime.defaults and runtime.edges[]) is also not read by any Python code — the edge parsing in connector initialization (initialization.py:225-231) only uses the from and to fields.

Some benchmark configs set max_inflight to 4/8/16 (e.g. PR #1852, PR #1913) expecting it to control concurrency, but it has no effect.

The shm_threshold_bytes and other connector-level config under runtime.connectors.extra remain untouched; those are actively used by SharedMemoryConnector.

Test Plan

Config-only change (YAML + docs). No Python code modified.

# Verify no Python code references these parameters
grep -rn 'max_inflight' vllm_omni/ --include='*.py'
# expect: 0 results
grep -rn 'window_size' vllm_omni/config/ vllm_omni/engine/ vllm_omni/distributed/ --include='*.py'
# expect: 0 results (runtime config; unrelated attention/audio window_size remain)
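
Where grep is not available, the same check could be approximated with the Python standard library. This is a minimal sketch, not part of the PR; `find_references` is a hypothetical helper name, and it does plain substring matching on `*.py` files, as the grep commands above do.

```python
from pathlib import Path


def find_references(root: str, needle: str) -> list[tuple[str, int, str]]:
    """List (path, line number, stripped line) for each *.py line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Mirrors the first grep command; run from the repository root.
    for hit in find_references("vllm_omni", "max_inflight"):
        print(*hit, sep=":")
```

To mirror the second command, call it once per directory (vllm_omni/config, vllm_omni/engine, vllm_omni/distributed) with `"window_size"` as the needle.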

Test Result

No Python behavior change — parameters were never read by any code.



@NickCao requested a review from hsliuustc0106 as a code owner March 30, 2026 18:51

@lishunyang12 (Collaborator) left a comment

LGTM, clean removal. One missed file below.

Comment thread vllm_omni/model_executor/stage_configs/bagel.yaml
@lishunyang12 (Collaborator) commented

@hsliuustc0106 PTAL

@hsliuustc0106 (Collaborator) commented

@NickCao can you help resolve conflicts first? I'll get it merged before #2383

The runtime.defaults block in stage config YAMLs contained two
parameters — max_inflight and window_size — neither of which is
read by any Python code.

max_inflight was introduced in the MRS design (Nov 2025, PR vllm-project#51) to
limit concurrent in-flight requests per stage.  When the architecture
was rewritten to the current orchestrator (Dec 2025, PR vllm-project#391), all
Python references were removed but the YAML configs were left
untouched.  Concurrency is already managed by max_num_seqs
(scheduler-level) for non-AR stages and KV cache capacity
(memory-level) for AR stages.

window_size (under both runtime.defaults and runtime.edges) is also
not read by any Python code — the edge parsing in connector
initialization only uses the 'from' and 'to' fields.

Remove both parameters, the now-empty defaults block, and the dead
window_size field from runtime.edges entries.  The shm_threshold_bytes
and other connector-level config under runtime.connectors.extra remain
— those are actively used by SharedMemoryConnector.

Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
@NickCao force-pushed the cleanup/remove-dead-max-inflight branch from 86001d4 to f242669 April 17, 2026 14:01

@NickCao (Contributor, Author) commented Apr 17, 2026

> @NickCao can you help resolve conflicts first? I'll get it merged before #2383

Rebased and fixed many newly added stage configs.

@lishunyang12 enabled auto-merge (squash) April 17, 2026 14:17
@hsliuustc0106 added the "ready" label (label to trigger buildkite CI) Apr 17, 2026

@lishunyang12 (Collaborator) left a comment

I will rebase #2383 after this PR is merged.

@hsliuustc0106 disabled auto-merge April 17, 2026 15:36
@hsliuustc0106 merged commit f2edb81 into vllm-project:main Apr 17, 2026
6 of 8 checks passed
lvliang-intel pushed a commit to lvliang-intel/vllm-omni that referenced this pull request Apr 20, 2026
…t#2343)

Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>