Revert "skip HPU graphs for long prefills" (#850) by adobrzyn · Pull Request #888 · vllm-project/vllm-gaudi

adobrzyn · 2026-01-27T08:46:05Z

Reverts #780

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>

Copilot

Pull request overview

This PR reverts a previous change that skipped HPU graphs for long prefills (#780). The revert simplifies the graph capture decision logic and modifies test configurations.

Changes:

- Reverted the logic that decides when to bypass HPU graphs, replacing a complex condition involving sequence length and batched tokens with a simpler check based on `max_cudagraph_capture_size`.
- Updated test configurations by reducing max-model-len in the performance tests and adding it to the GSM8K tests.
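To make the difference between the two styles of check concrete, here is a minimal sketch. It is not the code from `vllm_gaudi/v1/worker/hpu_model_runner.py`: the `PrefillBatch` class, both function names, the `max_seq_len_for_graphs` and `max_batched_tokens_for_graphs` parameters, and the choice to compare total batch tokens against the threshold are all assumptions made for illustration. Only the name `max_cudagraph_capture_size` comes from the review summary above.

```python
from dataclasses import dataclass


@dataclass
class PrefillBatch:
    """Hypothetical summary of a prefill batch (illustrative names only)."""
    batch_size: int
    seq_len: int

    @property
    def num_tokens(self) -> int:
        # Total tokens processed in this batch.
        return self.batch_size * self.seq_len


def bypass_graphs_compound(batch: PrefillBatch,
                           max_seq_len_for_graphs: int,
                           max_batched_tokens_for_graphs: int) -> bool:
    """Compound style: bypass graphs if either limit is exceeded.

    Both thresholds are hypothetical stand-ins for the 'sequence length
    and batched tokens' condition mentioned in the review summary.
    """
    return (batch.seq_len > max_seq_len_for_graphs
            or batch.num_tokens > max_batched_tokens_for_graphs)


def bypass_graphs_simple(batch: PrefillBatch,
                         max_cudagraph_capture_size: int) -> bool:
    """Simple style: a single threshold decides whether to bypass graphs.

    Comparing the total token count against the threshold is an
    assumption; the real check may compare a different quantity.
    """
    return batch.num_tokens > max_cudagraph_capture_size


if __name__ == "__main__":
    long_prefill = PrefillBatch(batch_size=1, seq_len=8192)
    print(bypass_graphs_simple(long_prefill, max_cudagraph_capture_size=4096))
    print(bypass_graphs_compound(long_prefill, 4096, 16384))
```

The single-threshold form has one knob to reason about, which is the kind of simplification the summary describes. The compound form lets sequence length and total tokens be limited independently.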

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vllm_gaudi/v1/worker/hpu_model_runner.py | Simplified graph capture logic and removed `max_graph_capture_tokens` variable |
| tests/full_tests/ci_perf_tests.sh | Reduced max-model-len from 32768 to 16384 |
| tests/full_tests/ci_gsm8k_tests.sh | Added max-model-len parameter (131072) to Qwen3 MOE test |


github-actions · 2026-01-27T14:04:26Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
d7de043d55d1dd629554467e23874097e1c48993

…roject#888) Reverts vllm-project#780 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai> Co-authored-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: slokesha <slokeshappa@habana.ai>

This reverts commit c66a038.

Revert "skip HPU graphs for long prefills" (#850)

102e4a7


adobrzyn requested review from mgawarkiewicz-intel, piotrbocian and wpyszka as code owners January 27, 2026 08:46

Copilot AI review requested due to automatic review settings January 27, 2026 08:46

Copilot AI reviewed Jan 27, 2026


Comment thread vllm_gaudi/v1/worker/hpu_model_runner.py

Comment thread vllm_gaudi/v1/worker/hpu_model_runner.py

github-actions Bot mentioned this pull request Jan 27, 2026

🚦 Team Review Dashboard #701

Open

mgawarkiewicz-intel approved these changes Jan 28, 2026


mgawarkiewicz-intel merged commit c66a038 into releases/v0.14.1 Jan 28, 2026
53 checks passed

czhu15 added a commit that referenced this pull request Feb 10, 2026

Revert "Revert "skip HPU graphs for long prefills" (#850) (#888)"

c31c582

This reverts commit c66a038.

yangulei added a commit that referenced this pull request Feb 24, 2026

Revert "Revert "skip HPU graphs for long prefills" (#850) (#888)"

4294027

This reverts commit c66a038.

mgawarkiewicz-intel merged 1 commit into releases/v0.14.1 from adobrzyn/revert_780_for_release0141


3 participants
