Skip to content

[Fix] Add missing stubs from cpu fp8 attention changes#41387

Merged
bigPYJ1151 merged 9 commits intovllm-project:mainfrom
tianmu-li:fix/fp8-attn-non-x86-stubs
May 6, 2026
Merged

[Fix] Add missing stubs from cpu fp8 attention changes#41387
bigPYJ1151 merged 9 commits intovllm-project:mainfrom
tianmu-li:fix/fp8-attn-non-x86-stubs

Conversation

@tianmu-li
Copy link
Copy Markdown
Contributor

@tianmu-li tianmu-li commented Apr 30, 2026

Purpose

#39445 is missing some stubs for fp8 attention, which cause compilation errors when using clang (see https://github.com/tianmu-li/vllm/actions/runs/25149522905/job/73716695058#logs). This PR adds them.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
@tianmu-li tianmu-li requested a review from bigPYJ1151 as a code owner April 30, 2026 16:01
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify mergify Bot added the cpu Related to CPU backends label Apr 30, 2026
@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Apr 30, 2026

Hi @tianmu-li, the pre-commit checks have failed. Please run:

uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds FP8 stub constructors and type tags to various CPU architecture headers (ARM, Scalar, VSX, and VXE) to facilitate cross-platform compilation of shared templates. Feedback highlights that the stubs in the Scalar, VSX, and VXE files incorrectly attempt to use a Base alias that is not defined, which will cause compilation errors; these should be updated to initialize the reg member directly.

Comment thread csrc/cpu/cpu_types_scalar.hpp Outdated
Comment thread csrc/cpu/cpu_types_vsx.hpp Outdated
Comment thread csrc/cpu/cpu_types_vxe.hpp Outdated
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
@tianmu-li
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces FP8 stubs and tag structures across several CPU architecture-specific headers, including ARM, Scalar, VSX (PowerPC), and VXE (s390x). These additions provide the necessary constructor overloads to allow the load_b_pair_vec template to compile on all platforms, even though FP8 KV cache functionality is currently restricted to x86 architectures. I have no feedback to provide as there were no review comments to evaluate.

@tianmu-li
Copy link
Copy Markdown
Contributor Author

@bigPYJ1151 Appreciate it if you could take a look. This is to address compilation issues using clang (found on m1 smoke test) and add some missing stubs.

@mcsantiago
Copy link
Copy Markdown
Contributor

Tested on macOS arm64 (Apple Silicon, Apple Clang 21) and the build now succeeds:

  • Branch tested: fix/fp8-attn-non-x86-stubs @ a6ba62f
  • Build: CC=/usr/bin/clang CXX=/usr/bin/clang++ uv pip install -e . --no-build-isolation clean (~60s)
  • import vllm._C works
  • Original errors at csrc/cpu/cpu_attn_vec.hpp:23,27 are gone

Filed issue #41437 reproducing the same error before noticing this PR was already in flight — closing the loop here so the cross-reference exists. Thanks for the quick fix!

@bigPYJ1151 bigPYJ1151 enabled auto-merge (squash) May 1, 2026 04:48
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label May 1, 2026
@bigPYJ1151 bigPYJ1151 added the verified Run pre-commit for new contributors without triggering other tests label May 1, 2026
@whk-lab
Copy link
Copy Markdown

whk-lab commented May 1, 2026

Confirmed this fixes the macOS arm64 source build failure I hit on main.

Environment:

  • macOS arm64 / Apple Silicon
  • Python 3.12
  • torch 2.11.0, torch.version.cuda=None

Before this patch, _C failed to build with:
no matching constructor for initialization of 'vec_op::FP32Vec16'

After applying the cpu_types_arm.hpp stub from this PR:
cmake --build . -j=14 --target=_C succeeds and links _C.abi3.so.

auto-merge was automatically disabled May 5, 2026 05:04

Head branch was pushed to by a user without write access

@tianmu-li
Copy link
Copy Markdown
Contributor Author

@bigPYJ1151 Appreciate it if you could help merge this. I had to push a commit to resolve a merge conflict, which disabled auto-merge.

@bigPYJ1151 bigPYJ1151 merged commit e47c98e into vllm-project:main May 6, 2026
19 checks passed
chaojun-zhang pushed a commit to chaojun-zhang/vllm that referenced this pull request May 6, 2026
…41387)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
amd-mghanimi pushed a commit to amd-mghanimi/vllm that referenced this pull request May 6, 2026
…41387)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
Signed-off-by: Mehdi Ghanimifard <mehdi.ghanimifard@amd.com>
ikaadil pushed a commit to ikaadil/vllm that referenced this pull request May 7, 2026
…41387)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
Signed-off-by: Ifta Khairul Alam Adil <ikaadil007@gmail.com>
libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026
…41387)

Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
Signed-off-by: Libin Tang <libin.tang@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cpu Related to CPU backends ready ONLY add when PR is ready to merge/full CI is needed verified Run pre-commit for new contributors without triggering other tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants