Revert "skip HPU graphs for long prefills" #850
This reverts commit b208bbd.
Pull request overview
This PR reverts a previous change that skipped HPU graphs for long prefill operations, restoring the simpler graph skipping logic based solely on max_cudagraph_capture_size.
Changes:
- Simplified the condition for skipping HPU graphs from a complex formula considering sequence length and block size to a simpler check against max_cudagraph_capture_size
- Moved the max_cudagraph_capture_size assignment earlier in initialization and removed the max_graph_capture_tokens variable
- Reformatted the max_num_batched_tokens assignment for better readability
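The restored check can be pictured roughly as below. This is only a sketch under stated assumptions, not the code from the PR: the function name `should_skip_hpu_graphs`, the `num_tokens` argument, and the direction of the comparison (skip when the batch exceeds the limit) are all hypothetical. Only the name max_cudagraph_capture_size comes from the overview above.

```python
from typing import Optional


def should_skip_hpu_graphs(num_tokens: int,
                           max_cudagraph_capture_size: Optional[int]) -> bool:
    """Hypothetical helper: decide whether to run a batch without HPU graphs.

    Assumption: a batch is run eagerly (graphs skipped) when its token count
    exceeds max_cudagraph_capture_size. No sequence-length or block-size
    terms are involved, which is the simplification the overview describes.
    """
    if max_cudagraph_capture_size is None:
        # Assumption: with no limit configured, graphs are never skipped.
        return False
    return num_tokens > max_cudagraph_capture_size


# Usage: a long prefill above the limit is skipped, a small batch is not.
long_prefill_skipped = should_skip_hpu_graphs(8192, 2048)
small_batch_skipped = should_skip_hpu_graphs(512, 2048)
```

Keeping the decision to a single threshold makes the behavior easy to reason about, at the cost of ignoring how the tokens are distributed across sequences.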
🚧 CI Blocked: The main CI workflow was not started for the following reason:
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
✅ CI Passed: All checks passed successfully against the following vllm commit:
Reverts vllm-project#780 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai> Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Reverts #780 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai> Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Reverts vllm-project#780 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai> Co-authored-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: Wang, Zheng W <zheng.w.wang@intel.com>
…roject#888) Reverts vllm-project#780 --------- Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai> Co-authored-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: slokesha <slokeshappa@habana.ai>
@adobrzyn What is the motivation of this PR?
Reverts #780