[FIX_FOR_VLLM_CUSTOM=dc6de33c3d5e9026cef7b27791dfe0f98e64bbde] Hourly fixes – batch no. 2#998

Merged
iboiko-habana merged 4 commits into vllm-project:main from pawel-olejniczak:dev/polejnix/fix_batch_2
Feb 23, 2026

Contributor

pawel-olejniczak commented Feb 20, 2026

This PR contains a subset of the fixes from #903.
Fixed issues:
AttributeError: '_OpNamespace' '_moe_C' object has no attribute 'topk_softmax'
AttributeError: 'HPUVocabParallelEmbeddingWithLoRA' object has no attribute 'quant_method'

Signed-off-by: Paweł Olejniczak <polejniczakx@habana.ai>
Signed-off-by: Paweł Olejniczak <polejniczakx@habana.ai>
Contributor

Copilot AI left a comment

Pull request overview

This pull request includes hourly automated fixes for the vLLM HPU (Habana Processing Unit) integration. The changes address two issues: adding a missing quant_method property delegation in the HPU LoRA implementation, and implementing a custom create_fused_moe_router factory function for HPU-specific MoE routing.

Changes:

  • Added quant_method property to HPUVocabParallelEmbeddingWithLoRA to delegate to the base layer
  • Implemented create_fused_moe_router factory function with support for multiple router types and patched it into vllm modules
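The property delegation described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `BaseEmbedding` and `EmbeddingWithLoRA` are hypothetical stand-ins for the wrapped layer and for a wrapper such as HPUVocabParallelEmbeddingWithLoRA, and the string value used for `quant_method` is a placeholder.

```python
class BaseEmbedding:
    """Hypothetical stand-in for the wrapped base layer."""

    def __init__(self, quant_method):
        self.quant_method = quant_method


class EmbeddingWithLoRA:
    """Hypothetical stand-in for a LoRA wrapper around a base layer."""

    def __init__(self, base_layer):
        self.base_layer = base_layer

    @property
    def quant_method(self):
        # Delegate to the wrapped layer so that callers reading
        # `layer.quant_method` on the wrapper do not hit AttributeError.
        return self.base_layer.quant_method


base = BaseEmbedding(quant_method="placeholder-method")
wrapped = EmbeddingWithLoRA(base)
print(wrapped.quant_method)
```

A read-only property keeps the wrapper and the base layer in sync without copying state: if the base layer's attribute is reassigned later, the wrapper reflects the new value on the next access.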

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| vllm_gaudi/ops/hpu_lora.py | Adds quant_method property delegation to fix AttributeError when accessing quantization methods on HPU LoRA layers |
| vllm_gaudi/ops/hpu_fused_moe.py | Adds router factory function with imports and patches to support various MoE routing strategies on HPU |
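The general shape of a router factory that is patched into another module might look like the sketch below. Everything in it is assumed for illustration: the `Router` dataclass, the registry, the router-type names, `create_router_for_hpu`, and the `upstream` module are hypothetical and do not reflect the actual signature of create_fused_moe_router or the vllm modules it is patched into.

```python
import types
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Router:
    """Hypothetical router object; a real router carries far more state."""
    kind: str
    top_k: int


# Hypothetical registry mapping a router-type name to a constructor.
_ROUTER_BUILDERS: Dict[str, Callable[[int], Router]] = {
    "topk": lambda top_k: Router("topk", top_k),
    "grouped_topk": lambda top_k: Router("grouped_topk", top_k),
}


def create_router_for_hpu(router_type: str, top_k: int) -> Router:
    """Factory: choose a router implementation by name (illustrative only)."""
    try:
        builder = _ROUTER_BUILDERS[router_type]
    except KeyError:
        raise ValueError(f"unsupported router type: {router_type!r}") from None
    return builder(top_k)


# Stand-in for an upstream module whose factory is being overridden.
upstream = types.ModuleType("upstream_router_module")
upstream.create_router = lambda router_type, top_k: Router("default", top_k)

# Patch: rebind the module attribute so that later lookups made through
# the module resolve to the platform-specific factory.
upstream.create_router = create_router_for_hpu

router = upstream.create_router("grouped_topk", top_k=2)
print(router)
```

Rebinding a module attribute only affects callers that look the name up through that module at call time. A module that earlier ran `from upstream import create_router` keeps its own reference to the old function, so a patch of this kind generally has to be applied to every module that imported the name.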

Comment thread vllm_gaudi/ops/hpu_fused_moe.py Outdated
Comment thread vllm_gaudi/ops/hpu_fused_moe.py Outdated
Comment thread vllm_gaudi/ops/hpu_fused_moe.py Outdated
Signed-off-by: Paweł Olejniczak <polejniczakx@habana.ai>
@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
dc6de33c3d5e9026cef7b27791dfe0f98e64bbde

@iboiko-habana iboiko-habana merged commit 15d2664 into vllm-project:main Feb 23, 2026
64 checks passed
3 participants