fix: replace batched_count_greater_than to avoid dynamic shape TypeError on HPU #1412

Merged
kamil-kaczor merged 2 commits into vllm-project:main from kamil-kaczor:fix/batched-count-dynamic-shapes
May 18, 2026

@kamil-kaczor
Collaborator

Summary

Upstream vLLM decorates batched_count_greater_than with @torch.compile(dynamic=True), which causes Habana's recipe_compiler to raise TypeError: Cannot convert symbols to int when processing symbolic shapes. Additionally, mark_unbacked in the caller (gather_logprobs) prevents dynamic=False from being a viable alternative.

Fix

Replace it with a plain (uncompiled) version of the same function. The patch is deferred until load_general_plugins runs, via a hook on vllm.plugins.load_general_plugins, because importing vllm.v1.sample.sampler during early plugin registration triggers a heavy import chain that interferes with platform initialisation.
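
The deferral described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `plugins` and `sampler` are stand-in modules built with `types.ModuleType` (the real vllm modules are not imported), `install_deferred_patch` and `plain_batched_count_greater_than` are hypothetical names, and applying the patch after the original loader returns is an assumption.

```python
import functools
import types

calls = []  # records the order of events for the demonstration

# Hypothetical stand-in for vllm.plugins (the real module is not imported).
plugins = types.ModuleType("fake_plugins")


def load_general_plugins():
    calls.append("load_general_plugins")


plugins.load_general_plugins = load_general_plugins

# Hypothetical stand-in for the module whose attribute gets patched.
sampler = types.ModuleType("fake_sampler")
sampler.batched_count_greater_than = lambda x, values: "compiled"


def plain_batched_count_greater_than(x, values):
    # Stand-in for the uncompiled replacement; the real body is not shown here.
    return "plain"


def install_deferred_patch(plugins_mod, target_mod):
    """Wrap load_general_plugins so the patch is applied when it runs."""
    original = plugins_mod.load_general_plugins
    if getattr(original, "_patched", False):
        return  # already wrapped, do not stack hooks

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        result = original(*args, **kwargs)
        # Assumption: patch after the original loader has returned.
        target_mod.batched_count_greater_than = plain_batched_count_greater_than
        calls.append("patched")
        return result

    wrapper._patched = True
    plugins_mod.load_general_plugins = wrapper


install_deferred_patch(plugins, sampler)
before = sampler.batched_count_greater_than(None, None)  # hook not fired yet
plugins.load_general_plugins()                            # hook fires here
after = sampler.batched_count_greater_than(None, None)
```

Installing the wrapper is cheap and touches nothing heavy, so nothing is imported or replaced until the loader is actually called.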

Why deferred patching?

  • Importing vllm.v1.sample.sampler during apply() (called from register()) triggers a heavy import chain that resets platform detection, which fails with the error "Device string must not be empty".
  • The patch hooks into load_general_plugins, which runs in every process (parent + EngineCore subprocess) after the platform is ready.
  • sampler.py uses from ... import batched_count_greater_than, which binds the name as a module-level global in sampler.py. That global is resolved via LOAD_GLOBAL at call time, so replacing the attribute on the sampler module takes effect for later calls.
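
The last point relies on ordinary Python name binding, which a small self-contained sketch can illustrate. The modules `demo_ops` and `demo_sampler` and the functions `count_ge`, `gather`, and `plain` are hypothetical stand-ins, not vllm code.

```python
import sys
import types

# Hypothetical defining module (stand-in for where the function is declared).
ops = types.ModuleType("demo_ops")
exec("def count_ge(x, v):\n    return 'compiled'\n", ops.__dict__)
sys.modules["demo_ops"] = ops

# Hypothetical importing module that uses `from ... import`, like sampler.py.
sampler = types.ModuleType("demo_sampler")
exec(
    "from demo_ops import count_ge\n"
    "def gather(x, v):\n"
    "    return count_ge(x, v)\n",
    sampler.__dict__,
)


def plain(x, v):
    return "plain"


# Patching only the defining module leaves the importer's own binding untouched.
ops.count_ge = plain
result_after_ops_patch = sampler.gather(None, None)

# Patching the importer's module attribute is seen on the next call, because
# `count_ge` inside gather() is looked up in demo_sampler's globals at call time.
sampler.count_ge = plain
result_after_sampler_patch = sampler.gather(None, None)
```

Here `result_after_ops_patch` is still `'compiled'` and `result_after_sampler_patch` is `'plain'`, which is why the attribute has to be replaced on the module that performed the `from ... import`.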

Testing

  • test_skip_tokenizer_initialization PASSES
  • test_engine_args (3 tests) PASS
  • Inference with logprobs=5 produces correct output

@github-actions

github-actions Bot commented May 5, 2026

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the HPU plugin’s runtime monkey-patches to avoid a Habana HPU recipe_compiler failure caused by upstream @torch.compile(dynamic=True) on vllm.v1.sample.ops.logprobs.batched_count_greater_than, by replacing it with an uncompiled equivalent and deferring the patch until after platform initialization.

Changes:

  • Add an HPU-safe (non-compiled) replacement for batched_count_greater_than.
  • Defer patch application by wrapping vllm.plugins.load_general_plugins so the heavy vllm.v1.sample.* import chain runs after platform init.
  • Update the patch module documentation and slightly refactor existing monkey-patch assignments for readability.

@kamil-kaczor kamil-kaczor force-pushed the fix/batched-count-dynamic-shapes branch from e5d419e to c994e99 Compare May 5, 2026 08:43
@github-actions

github-actions Bot commented May 5, 2026

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
54f548e9e58087f0155e4e164e416ad7efdfde6d

@kamil-kaczor kamil-kaczor force-pushed the fix/batched-count-dynamic-shapes branch from f0fb14d to 772d9f3 Compare May 13, 2026 08:11
…ror on HPU

Upstream vLLM decorates batched_count_greater_than with
@torch.compile(dynamic=True), which causes Habana's recipe_compiler
to raise 'TypeError: Cannot convert symbols to int' when processing
symbolic shapes.  Additionally, mark_unbacked in the caller
(gather_logprobs) prevents dynamic=False from being a viable
alternative.

Replace with a plain (uncompiled) version of the same function.
The patching is deferred to load_general_plugins time via a hook
on vllm.plugins.load_general_plugins, because importing
vllm.v1.sample.sampler during early plugin registration triggers
a heavy import chain that interferes with platform initialisation.

Fixes test_skip_tokenizer_initialization and other v1 engine tests
that exercise prompt logprobs on HPU.

Signed-off-by: Kamil Kaczor <kamil.kaczor@intel.com>
@kamil-kaczor kamil-kaczor force-pushed the fix/batched-count-dynamic-shapes branch from 772d9f3 to 58b71e2 Compare May 13, 2026 13:19
@github-actions

✅ CI Passed

All checks passed successfully against the following vllm commit:
54f548e9e58087f0155e4e164e416ad7efdfde6d

@kamil-kaczor kamil-kaczor merged commit 27c367b into vllm-project:main May 18, 2026
2 checks passed
mgawarkiewicz-intel pushed a commit that referenced this pull request May 25, 2026
…pe TypeError on HPU #1412 (#1458)

Signed-off-by: Iryna Boiko <iboiko@habana.ai>