[DSV4][XPU] Add MHC fused_post_pre support #44144

Open: majian4work wants to merge 1 commit into vllm-project:main from majian4work:dsv4-pr5-mhc-fused-post-pre

@majian4work
Contributor

Summary

Add XPU support for MHCFusedPostPreOp so that DeepSeek-V4 on Intel XPU can use the fused MHC post+pre path in the decoder loop, matching the AMD/CUDA pattern.

Changes

  • vllm/model_executor/layers/mhc.py: Implement forward_native for MHCFusedPostPreOp (decomposes into mhc_post_torch + mhc_pre_torch); add forward_xpu delegating to forward_native.
  • vllm/models/deepseek_v4/xpu/model.py: Update the decoder loop to use the fused MHC path (first layer → standalone hc_pre, middle layers → mhc_fused_post_pre, explicit hc_post after the loop). Add weight-loading guards for testing with truncated models.
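
The loop restructuring described in the Changes list can be illustrated with a toy sketch. This is not the PR's code: `hc_pre`, `hc_post`, `fused_post_pre`, `run_unfused`, and `run_fused` below are hypothetical scalar stand-ins invented for illustration (the real tensor ops and their signatures are not shown in this description). The sketch only assumes what the description states, namely that the native fallback runs post and then pre, and that the loop uses a standalone pre for the first layer and an explicit post after the last one.

```python
from typing import Callable, List, Sequence, Tuple

Streams = List[float]
Layer = Callable[[float], float]


def hc_pre(streams: Streams, layer_idx: int) -> float:
    # Toy "pre" (hypothetical): mix the residual streams into one layer input.
    weights = [layer_idx + k + 1 for k in range(len(streams))]
    return sum(w * s for w, s in zip(weights, streams)) / sum(weights)


def hc_post(streams: Streams, layer_out: float, layer_idx: int) -> Streams:
    # Toy "post" (hypothetical): write the layer output back into every stream.
    return [s + layer_out / (layer_idx + k + 1) for k, s in enumerate(streams)]


def fused_post_pre(
    streams: Streams, layer_out: float, layer_idx: int
) -> Tuple[Streams, float]:
    # Native-style fallback: post of layer `layer_idx`, then pre of the next layer.
    streams = hc_post(streams, layer_out, layer_idx)
    return streams, hc_pre(streams, layer_idx + 1)


def run_unfused(streams: Streams, layers: Sequence[Layer]) -> Streams:
    # Reference loop: every layer does its own pre and post.
    streams = list(streams)
    for i, layer in enumerate(layers):
        x = hc_pre(streams, i)
        streams = hc_post(streams, layer(x), i)
    return streams


def run_fused(streams: Streams, layers: Sequence[Layer]) -> Streams:
    # Fused loop: standalone pre for the first layer, fused post+pre for
    # every later layer, and one explicit post after the loop.
    streams = list(streams)
    if not layers:
        return streams
    y = 0.0
    for i, layer in enumerate(layers):
        if i == 0:
            x = hc_pre(streams, 0)
        else:
            streams, x = fused_post_pre(streams, y, i - 1)
        y = layer(x)
    return hc_post(streams, y, len(layers) - 1)
```

Because this fallback performs the same operations in the same order as the unfused loop, `run_fused` and `run_unfused` return identical values in the toy setting. A genuinely fused kernel on another backend would presumably be compared against the decomposed path within a numerical tolerance instead.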

Dependencies

⚠️ This PR depends on #42953 being merged first.

PR #42953 introduces the XPU attention decode path (dsv4-pr4-attention-decode) which this PR builds upon.

@mergify mergify Bot added the intel-gpu (Related to Intel GPU) and v1 labels Jun 1, 2026
@mergify
Contributor

mergify Bot commented Jun 4, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @majian4work.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

- Add forward_xpu to MHCFusedPostPreOp (decomposes into mhc_post_torch + mhc_pre_torch)
- Update XPU model forward to use fused MHC path (matching AMD pattern):
  first layer uses standalone hc_pre, middle layers use mhc_fused_post_pre
- Add explicit hc_post after decoder loop

Signed-off-by: Ma Jian <jian1.ma@intel.com>
@majian4work majian4work force-pushed the dsv4-pr5-mhc-fused-post-pre branch from ed73f8c to 4d2a1c7 Compare June 8, 2026 05:51
@majian4work majian4work marked this pull request as ready for review June 8, 2026 05:52
@majian4work majian4work requested a review from zyongye as a code owner June 8, 2026 05:52

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot removed the needs-rebase label Jun 8, 2026
@majian4work
Contributor Author

@jikunshang @xinyu-intel @wuxun-zhang Please help review this PR, thanks.

Labels

intel-gpu (Related to Intel GPU), v1
