
Conversation

@zhangxinyuehfad
Contributor

What this PR does / why we need it?

Fix the missing keyword argument `max_num_tokens_across_dp` in `AscendAttentionMetadataBuilder.build()`, which causes a `TypeError` when serving with data parallelism enabled.
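For context, the traceback below shows the model runner passing `max_num_tokens_across_dp` to the attention metadata builder, while the builder's `build()` does not accept it. A minimal sketch of the kind of change involved (the class body and the other parameter names are simplified assumptions, not the actual vllm-ascend code):

```python
# Hypothetical, simplified sketch of the builder signature fix.
# Only max_num_tokens_across_dp is taken from the traceback; the other
# parameters and the body are illustrative placeholders.
class AscendAttentionMetadataBuilder:
    def build(
        self,
        num_reqs,
        num_actual_tokens,
        max_query_len,
        max_num_tokens_across_dp=None,  # newly accepted keyword argument,
                                        # passed by the model runner under data parallelism
    ):
        # ... construct and return the Ascend attention metadata as before ...
        ...
```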

Does this PR introduce any user-facing change?

How was this patch tested?

vllm serve Qwen/Qwen2.5-7B-Instruct --max_model_len 4096 --tensor_parallel_size 2 --data_parallel_size 2

Error log before the fix:

INFO:     Started server process [12747]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO 06-24 09:08:01 [logger.py:43] Received request cmpl-98a67bfb86a34b8db6d83ad3489ea45b-0: prompt: '西安有什么好玩的地方', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=20, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None), prompt_token_ids: [102178, 104139, 108257, 103958], prompt_embeds shape: None, lora_request: None, prompt_adapter_request: None.
INFO 06-24 09:08:01 [async_llm.py:271] Added request cmpl-98a67bfb86a34b8db6d83ad3489ea45b-0.
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527] WorkerProc hit an exception.
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527] Traceback (most recent call last):
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]   File "/__w/vllm-benchmarks/vllm-benchmarks/vllm-empty/vllm/v1/executor/multiproc_executor.py", line 522, in worker_busy_loop
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]     output = func(*args, **kwargs)
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]   File "/__w/vllm-benchmarks/vllm-benchmarks/vllm-ascend/vllm_ascend/worker/worker_v1.py", line 178, in execute_model
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]     output = self.model_runner.execute_model(scheduler_output)
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]   File "/usr/local/python3.10.17/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]     return func(*args, **kwargs)
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]   File "/__w/vllm-benchmarks/vllm-benchmarks/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1472, in execute_model
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]     aux_hidden_states) = (self._process_reqs(scheduler_output,
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]   File "/__w/vllm-benchmarks/vllm-benchmarks/vllm-ascend/vllm_ascend/worker/model_runner_v1.py", line 1085, in _process_reqs
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527]     attn_metadata = self.attn_metadata_builder.build(  # type: ignore
(VllmWorker rank=0 pid=13597) ERROR 06-24 09:08:01 [multiproc_executor.py:527] TypeError: AscendAttentionMetadataBuilder.build() got an unexpected keyword argument 'max_num_tokens_across_dp'
(EngineCore_0 pid=13068) ERROR 06-24 09:08:01 [dump_input.py:69] Dumping input data

Log after the fix:

INFO:     Started server process [15266]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO 06-24 09:16:56 [logger.py:43] Received request cmpl-f629ffbc739c4b4ba599d873d3583056-0: prompt: '西安有什么好玩的地方', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.05, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=20, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None, extra_args=None), prompt_token_ids: [102178, 104139, 108257, 103958], prompt_embeds shape: None, lora_request: None, prompt_adapter_request: None.
INFO 06-24 09:16:56 [async_llm.py:271] Added request cmpl-f629ffbc739c4b4ba599d873d3583056-0.
INFO:     127.0.0.1:37616 - "POST /v1/completions HTTP/1.1" 200 OK
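For reference, the request seen in the logs can be replayed against the running server with a plain HTTP call. This is a minimal sketch using `requests`; the host/port and the exact field names are assumptions reconstructed from the logged `SamplingParams`:

```python
import requests

# Hypothetical reproduction of the completion request from the server log.
# Host/port are assumptions; sampling parameters mirror the logged values.
resp = requests.post(
    "http://127.0.0.1:8000/v1/completions",
    json={
        "model": "Qwen/Qwen2.5-7B-Instruct",
        "prompt": "西安有什么好玩的地方",
        "max_tokens": 20,
        "temperature": 0.0,
        "repetition_penalty": 1.05,
    },
)
print(resp.status_code, resp.json())
```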

@codecov

codecov bot commented Jun 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 27.21%. Comparing base (c30ddb8) to head (eb4faac).
⚠️ Report is 541 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1398      +/-   ##
==========================================
- Coverage   27.39%   27.21%   -0.19%     
==========================================
  Files          56       56              
  Lines        6191     6214      +23     
==========================================
- Hits         1696     1691       -5     
- Misses       4495     4523      +28     
Flag        Coverage Δ
unittests   27.21% <ø> (-0.19%) ⬇️

Flags with carried forward coverage won't be shown.


@MengqingCao
Collaborator

Duplicate of #1273.

@zhangxinyuehfad
Contributor Author

Duplicate of #1273.

I will close it.

