fix: replace batched_count_greater_than to avoid dynamic shape TypeError on HPU #1412
Pull request overview
This PR updates the HPU plugin’s runtime monkey-patches to avoid a Habana HPU recipe_compiler failure caused by upstream @torch.compile(dynamic=True) on vllm.v1.sample.ops.logprobs.batched_count_greater_than, by replacing it with an uncompiled equivalent and deferring the patch until after platform initialization.
Changes:
- Add an HPU-safe (non-compiled) replacement for batched_count_greater_than.
- Defer patch application by wrapping vllm.plugins.load_general_plugins so the heavy vllm.v1.sample.* import chain runs after platform init.
- Update the patch module documentation and slightly refactor existing monkey-patch assignments for readability.
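The deferral described in the second change can be sketched in plain Python. This is only an illustration of the wrapping pattern, not the PR's code: `demo_plugins`, `apply_patch`, and `install_deferred_patch` are hypothetical names, and running the patch after (rather than before) the original hook is an assumption.

```python
import functools
import types

# Hypothetical stand-in for a plugins module that exposes a loader hook.
plugins = types.ModuleType("demo_plugins")
calls = []


def _load_general_plugins():
    calls.append("load")


plugins.load_general_plugins = _load_general_plugins


def apply_patch():
    # Placeholder for the real monkey-patch; here it only records the call.
    calls.append("patch")


def install_deferred_patch(module, hook_name, patch):
    """Wrap module.<hook_name> so that `patch` runs once, after the hook."""
    original = getattr(module, hook_name)
    if getattr(original, "_deferred_patch_installed", False):
        return  # already wrapped; avoid stacking wrappers

    applied = False

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        nonlocal applied
        result = original(*args, **kwargs)
        if not applied:
            patch()  # assumption: patch after the original hook has run
            applied = True
        return result

    wrapper._deferred_patch_installed = True
    setattr(module, hook_name, wrapper)


install_deferred_patch(plugins, "load_general_plugins", apply_patch)
install_deferred_patch(plugins, "load_general_plugins", apply_patch)  # no-op

plugins.load_general_plugins()
plugins.load_general_plugins()
```

A wrapper like this only affects callers that look the hook up on the module at call time; code that bound the original function earlier with a `from ... import` keeps calling the unwrapped version.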
✅ CI Passed: All checks passed successfully against the following vllm commit:
…ror on HPU

Upstream vLLM decorates batched_count_greater_than with @torch.compile(dynamic=True), which causes Habana's recipe_compiler to raise 'TypeError: Cannot convert symbols to int' when processing symbolic shapes. Additionally, mark_unbacked in the caller (gather_logprobs) prevents dynamic=False from being a viable alternative.

Replace with a plain (uncompiled) version of the same function. The patching is deferred to load_general_plugins time via a hook on vllm.plugins.load_general_plugins, because importing vllm.v1.sample.sampler during early plugin registration triggers a heavy import chain that interferes with platform initialisation.

Fixes test_skip_tokenizer_initialization and other v1 engine tests that exercise prompt logprobs on HPU.

Signed-off-by: Kamil Kaczor <kamil.kaczor@intel.com>
Summary
Upstream vLLM decorates batched_count_greater_than with @torch.compile(dynamic=True), which causes Habana's recipe_compiler to raise TypeError: Cannot convert symbols to int when processing symbolic shapes. Additionally, mark_unbacked in the caller (gather_logprobs) prevents dynamic=False from being a viable alternative.
Fix
Replace it with a plain (uncompiled) version of the same function. The patching is deferred to load_general_plugins time via a hook on vllm.plugins.load_general_plugins, because importing vllm.v1.sample.sampler during early plugin registration triggers a heavy import chain that interferes with platform initialisation.
Why deferred patching?
- Importing vllm.v1.sample.sampler during apply() (called from register()) triggers a heavy import chain that resets platform detection, causing Device string must not be empty.
- The patch is therefore applied from load_general_plugins, which runs in every process (parent + EngineCore subprocess) after the platform is ready.
- sampler.py uses from ... import batched_count_greater_than, which creates a module-level global resolved via LOAD_GLOBAL at call time, so patching the attribute on that module works.
Testing
- test_skip_tokenizer_initialization PASSES
- test_engine_args (3 tests) PASS
- logprobs=5 produces correct output
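The LOAD_GLOBAL argument above can be checked with a small standard-library experiment. The modules `demo_ops` and `demo_sampler` and the function `count_fn` below are hypothetical stand-ins built in memory, not vLLM code; the sketch only illustrates why reassigning the name on the importing module changes what its functions call.

```python
import dis
import sys
import types

# Hypothetical stand-in for the module that defines the function.
ops = types.ModuleType("demo_ops")
exec(
    "def count_fn(row, value):\n"
    "    return 'original'\n",
    ops.__dict__,
)
sys.modules["demo_ops"] = ops

# Hypothetical stand-in for the module that imports the name and calls it.
sampler = types.ModuleType("demo_sampler")
exec(
    "from demo_ops import count_fn\n"
    "def gather(row, value):\n"
    "    return count_fn(row, value)\n",
    sampler.__dict__,
)
sys.modules["demo_sampler"] = sampler


def replacement(row, value):
    return "patched"


# gather() looks count_fn up in demo_sampler's globals on every call.
uses_load_global = any(
    ins.opname == "LOAD_GLOBAL" and ins.argval == "count_fn"
    for ins in dis.get_instructions(sampler.gather)
)

# Patching only the defining module leaves the importer's binding untouched.
ops.count_fn = replacement
before = sampler.gather([1, 2, 3], 2)

# Patching the importing module's global takes effect on the next call.
sampler.count_fn = replacement
after = sampler.gather([1, 2, 3], 2)
```

Here `before` is still the original result and `after` is the patched one, which is why the patch has to target the module that did the import rather than only the module that defines the function.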