[XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation by yma11 · Pull Request #30538 · vllm-project/vllm

yma11 · 2025-12-12T06:05:04Z

Purpose

Decrease the Triton kernel compilation scratch space for speculative decoding. This works around the following error:

L0 build module failed. Log:
warning: [RetryManager] Start recompilation of the kernel
in kernel: 'sample_recovered_tokens_kernel'

error: total scratch space exceeds HW supported limit for kernel sample_recovered_tokens_kernel: 1164736 bytes (max permitted PTSS 262144 bytes)
error: backend compiler failed build.

Error during Intel loadBinary: ZE_RESULT_ERROR_MODULE_BUILD_FAILURE
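To put the numbers in the log above in proportion, the kernel asks for roughly 4.4 times the permitted scratch space. The following is a minimal sketch that extracts those figures from the error line. The regular expression and the `parse_scratch_error` helper are hypothetical and written only against the message quoted here; other compiler versions may word it differently.

```python
import re

# Error line copied from the build log quoted in this PR.
LOG_LINE = (
    "error: total scratch space exceeds HW supported limit for kernel "
    "sample_recovered_tokens_kernel: 1164736 bytes "
    "(max permitted PTSS 262144 bytes)"
)

# Hypothetical pattern, matched only against the message above.
SCRATCH_RE = re.compile(
    r"for kernel (?P<kernel>\w+): (?P<used>\d+) bytes "
    r"\(max permitted PTSS (?P<limit>\d+) bytes\)"
)


def parse_scratch_error(line):
    """Return kernel name, requested bytes, limit and their ratio, or None."""
    match = SCRATCH_RE.search(line)
    if match is None:
        return None
    used = int(match.group("used"))
    limit = int(match.group("limit"))
    return {
        "kernel": match.group("kernel"),
        "used": used,
        "limit": limit,
        "ratio": used / limit,
    }


info = parse_scratch_error(LOG_LINE)
print(info)
```

A check like this could be useful when comparing build logs before and after changing the SIMD width, since it shows whether the requested scratch space actually fell under the limit.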

Test Plan

Test Result


gemini-code-assist

Code Review

This pull request introduces a workaround for a Triton kernel compilation error on XPU devices during speculative decoding by setting the IGC_ForceOCLSIMDWidth environment variable to 16 in the XPU Dockerfile. While this resolves the compilation failure, setting this globally may have unintended performance consequences. My review includes a suggestion for a more targeted approach to apply this workaround only when necessary.

gemini-code-assist · 2025-12-12T06:06:13Z

docker/Dockerfile.xpu

@@ -76,6 +76,9 @@ RUN python3 -m pip install -e tests/vllm_test_utils
 ENV NIXL_VERSION=0.7.0
 RUN python3 /workspace/vllm/tools/install_nixl_from_source_ubuntu.py

+# decrease triton kernel compilation scratch space for speculative decoding
+ENV IGC_ForceOCLSIMDWidth=16


Setting IGC_ForceOCLSIMDWidth as a global environment variable in the Docker image is a broad change that will affect all Triton kernels compiled at runtime, not just those for speculative decoding. This could lead to performance degradation for workloads that do not use speculative decoding, or for other kernels that could benefit from a wider SIMD width.

A more targeted approach would be to set this environment variable dynamically within the vLLM Python code, only when speculative decoding is enabled on an XPU device. A suitable location for this logic could be within vllm.platforms.xpu.XPUPlatform.check_and_update_config, checking if vllm_config.speculative_config is present.

This would scope the workaround to only when it's needed, avoiding potential performance impacts on other use cases. While I cannot suggest code for an un-modified file, please consider this alternative for a more robust solution.

rogerxfeng8 · 2025-12-22

Valid comment.

yma11 · 2025-12-22

Yes, updated.
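The targeted approach suggested in the review can be sketched as follows. This is not the code that was merged: `apply_spec_decode_workaround` is a hypothetical helper, and the sketch assumes the variable only needs to be in the process environment before the affected Triton kernels are compiled. Only the variable name and the value 16 come from this PR.

```python
import os

# Name and value taken from the Dockerfile change in this PR.
SIMD_WIDTH_ENV = "IGC_ForceOCLSIMDWidth"
SIMD_WIDTH_FOR_SPEC_DECODE = "16"


def apply_spec_decode_workaround(speculative_config, environ=None):
    """Set the SIMD-width override only when speculative decoding is enabled.

    Hypothetical helper, not part of vLLM. Returns True if the variable
    was set by this call, False otherwise.
    """
    if environ is None:
        environ = os.environ
    if speculative_config is None:
        # Speculative decoding is off: leave the compiler defaults alone.
        return False
    if SIMD_WIDTH_ENV in environ:
        # Respect a value the user exported explicitly.
        return False
    environ[SIMD_WIDTH_ENV] = SIMD_WIDTH_FOR_SPEC_DECODE
    return True


# Usage with a plain dict standing in for the process environment.
env = {}
apply_spec_decode_workaround(speculative_config=None, environ=env)
print(env)
apply_spec_decode_workaround(speculative_config=object(), environ=env)
print(env)
```

In vLLM the call would presumably sit where the review points, inside `XPUPlatform.check_and_update_config`, with `vllm_config.speculative_config` passed as the first argument. Leaving an existing value untouched is a design choice of this sketch so that users can still override the width themselves.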

mergify · 2025-12-16T14:31:11Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @yma11.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-12-22T05:47:57Z

Hi @yma11, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?

mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint


mergify · 2025-12-23T02:13:27Z

Documentation preview: https://vllm--30538.org.readthedocs.build/en/30538/


jikunshang

LGTM, thanks for fixing


yma11 requested a review from jikunshang as a code owner December 12, 2025 06:05

yma11 marked this pull request as draft December 12, 2025 06:05

mergify bot added the ci/build label Dec 12, 2025

gemini-code-assist bot reviewed Dec 12, 2025


mergify bot added the needs-rebase label Dec 16, 2025

yma11 force-pushed the spec-decode-wa branch from 53aecb8 to 8733724 Compare December 20, 2025 10:35

yma11 marked this pull request as ready for review December 20, 2025 10:35

mergify bot removed the needs-rebase label Dec 20, 2025

yma11 force-pushed the spec-decode-wa branch from 8733724 to 853e839 Compare December 22, 2025 00:32

yma11 mentioned this pull request Dec 22, 2025

[Feature][XPU]: speculative decoding support on XPU. #26963

Open

1 task

yma11 added 2 commits December 23, 2025 02:12

decrease triton kernel compilation scratch space for speculative decoding

7cafe4d

Signed-off-by: Yan Ma <yan.ma@intel.com>

address comments

a9f2a16

Signed-off-by: Yan Ma <yan.ma@intel.com>

yma11 force-pushed the spec-decode-wa branch from affd04e to 1fd4cda Compare December 23, 2025 02:12

mergify bot added the documentation label (Improvements or additions to documentation) Dec 23, 2025

yma11 force-pushed the spec-decode-wa branch from 1fd4cda to 522a3f3 Compare December 23, 2025 02:21

update

ae350cd

Signed-off-by: Yan Ma <yan.ma@intel.com>

yma11 force-pushed the spec-decode-wa branch from 522a3f3 to ae350cd Compare December 23, 2025 03:05

jikunshang approved these changes Dec 23, 2025


jikunshang enabled auto-merge (squash) December 23, 2025 03:09

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Dec 23, 2025

rogerxfeng8 approved these changes Dec 23, 2025


jikunshang merged commit f1c2c20 into vllm-project:main Dec 23, 2025
45 checks passed

ItzDEXX pushed a commit to ItzDEXX/vllm that referenced this pull request Feb 19, 2026

[XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation (vllm-project#30538)

1d71741

Signed-off-by: Yan Ma <yan.ma@intel.com>

This was referenced Feb 25, 2026

[BugFix][XPU] Fix speculative decoding on Intel XPU due to bug with IGC_ForceOCLSIMDWidth=16 #35298

Merged

IGC_ForceOCLSIMDWidth=16 Breaks ALL Masked Triton Stores on BMG intel/intel-xpu-backend-for-triton#6206

Closed

jikunshang merged 3 commits into vllm-project:main from yma11:spec-decode-wa
