…attention #32238 Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
… you mean: 'input_size' Signed-off-by: root <root@adobrzyn-31x3-g3-mpijob-worker-0.adobrzyn-31x3-g3-mpijob-worker.framework.svc.cluster.local>
🚧 CI Blocked
The main CI workflow was not started for the following reason:
Pull request overview
This PR updates import paths to align with the latest version of vLLM, addressing breaking changes from upstream vLLM updates. The changes primarily reorganize imports for the FusedMoE router components and remove an unused parameter from a function call.
Changes:
- Updated import paths for `GroupedTopk` and `FusedMoERouter` to reflect new module structure in vLLM
- Removed `input_scale` parameter from `apply_block_fp8_linear_hpu` function call
- Added FP4 BMM (Block Matrix Multiply) support check in the attention backend
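The import-path updates listed above follow a common pattern for tracking an upstream library whose modules move between releases. A minimal sketch of a version-tolerant import helper is shown here; it is not taken from this PR. The helper name `import_first_available` and the module path `hypothetical_new_package.containers` are assumptions made for illustration, and standard-library modules stand in for the real vLLM paths, which are not spelled out in this conversation.

```python
import importlib
from typing import Any, Sequence


def import_first_available(name: str, module_paths: Sequence[str]) -> Any:
    """Return attribute `name` from the first module in `module_paths` that provides it.

    Hypothetical helper: tries each candidate module path in order, so the
    newest location can be listed first and older locations kept as fallbacks.
    """
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError as exc:
            # Module path does not exist in the installed version; try the next one.
            errors.append(f"{path}: {exc}")
            continue
        if hasattr(module, name):
            return getattr(module, name)
        errors.append(f"{path}: no attribute {name!r}")
    raise ImportError(f"could not import {name!r}; tried: " + "; ".join(errors))


# Stand-in demonstration: the first path is a made-up module that does not
# exist, so the helper falls back to the standard-library location.
OrderedDict = import_first_available(
    "OrderedDict",
    ["hypothetical_new_package.containers", "collections"],
)
```

A fallback like this trades a little indirection for compatibility with more than one upstream version; pinning to a single upstream version and importing directly, as this PR appears to do, is the simpler alternative.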
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| vllm_gaudi/ops/hpu_fused_moe.py | Updated imports for GroupedTopk and FusedMoERouter to new module paths |
| vllm_gaudi/ops/hpu_fp8.py | Updated FusedMoERouter import path |
| vllm_gaudi/ops/hpu_compressed_tensors.py | Updated FusedMoERouter import path |
| vllm_gaudi/extension/ops.py | Removed input_scale parameter from function call |
| vllm_gaudi/attention/backends/hpu_attn.py | Added FP4 BMM enablement check with dtype validation |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Agata Dobrzyniewicz <160237065+adobrzyn@users.noreply.github.com>
Signed-off-by: root <root@adobrzyn-9z1k-g3-mpijob-worker-0.adobrzyn-9z1k-g3-mpijob-worker.framework.svc.cluster.local>
Signed-off-by: Agata Dobrzyniewicz <160237065+adobrzyn@users.noreply.github.com>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Everything is delivered in #876.
#833 and vllm-project/vllm#27814