fix bugs when token_classify & classify run concurrently #36614
vllm-bot merged 1 commit into vllm-project:main
Code Review
This pull request aims to resolve a critical concurrency bug that caused RuntimeError during simultaneous token_classify and classify operations due to tensor dimension mismatches. While the updated code correctly utilizes first_token_indices_gpu and last_token_indices_gpu for slicing hidden_states to address the crash, it introduces a critical security vulnerability: cross-user data leakage. The current slicing logic uses relative indices, which can lead to tasks incorrectly pooling tokens from the beginning of the batch, potentially exposing one user's data to another. This requires correction by using absolute indices or proper input tensor slicing.
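The difference between relative and absolute indices that the review describes can be sketched with plain Python lists. This is a toy illustration, not vLLM code: `pool_relative`, `pool_absolute`, `starts`, and the string "hidden states" are hypothetical names chosen for the example, and the real pooler operates on GPU tensors.

```python
from itertools import accumulate

# Hypothetical flattened batch: one entry per token, tagged with its owner.
hidden_states = ["A0", "A1", "A2", "B0", "B1", "C0", "C1", "C2", "C3"]
num_tokens = [3, 2, 4]  # tokens scheduled for requests A, B, C

# Absolute start offset of each request inside the flattened batch.
starts = [0, *accumulate(num_tokens)][:-1]  # [0, 3, 5]


def pool_relative(request_ids):
    """Buggy variant: offsets restart at 0 for the selected subset."""
    out, cursor = {}, 0
    for rid in request_ids:
        n = num_tokens[rid]
        out[rid] = hidden_states[cursor:cursor + n]
        cursor += n
    return out


def pool_absolute(request_ids):
    """Slice each request at its own offset in the full batch."""
    return {
        rid: hidden_states[starts[rid]:starts[rid] + num_tokens[rid]]
        for rid in request_ids
    }


# Assume only requests B and C (ids 1 and 2) belong to one pooling task.
leaky = pool_relative([1, 2])
safe = pool_absolute([1, 2])

print(leaky[1])  # tokens owned by request A
print(safe[1])   # tokens owned by request B
```

With relative offsets, request B is handed tokens that belong to request A, which is the cross-request leakage the review warns about. Absolute offsets keep every slice inside its owner's span.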
Signed-off-by: augusto.yjh <augusto.yjh@antgroup.com>
…t#36614) Signed-off-by: augusto.yjh <augusto.yjh@antgroup.com>
…t#36614) Signed-off-by: augusto.yjh <augusto.yjh@antgroup.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
…t#36614) Signed-off-by: augusto.yjh <augusto.yjh@antgroup.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
Purpose
Fix bugs for *For*Classification and *ClassificationModel models that run token_classify and classify concurrently.
vLLM version: 0.17.0
steps to reproduce
The error log shows that hidden_states and pooling_cursor.num_scheduled_tokens_cpu are mismatched. DispatchPooler.forward passes the whole hidden_states of a batch to AllPool, while num_scheduled_tokens_cpu in pooling_metadata is set in
Line 31 in ddbb0d2
vllm/vllm/model_executor/layers/pooler/special.py
Line 103 in ddbb0d2
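A minimal sketch of why such a mismatch fails, assuming the per-task token counts cover only some requests while the hidden states cover the whole batch. `split_by_lengths`, `task_of_request`, and the numbers are hypothetical and only mimic a strict split by lengths; they are not taken from the vLLM sources.

```python
def split_by_lengths(tokens, lengths):
    """Split a flat token list into per-request chunks; sizes must match."""
    if sum(lengths) != len(tokens):
        raise ValueError(
            f"split sizes sum to {sum(lengths)}, but got {len(tokens)} tokens"
        )
    chunks, start = [], 0
    for n in lengths:
        chunks.append(tokens[start:start + n])
        start += n
    return chunks


# Hypothetical mixed batch: 9 tokens across three requests.
hidden_states = list(range(9))
all_lengths = [3, 2, 4]
task_of_request = ["classify", "token_classify", "classify"]

# Token counts for the token_classify requests only.
subset_lengths = [
    n for n, task in zip(all_lengths, task_of_request) if task == "token_classify"
]

try:
    split_by_lengths(hidden_states, subset_lengths)
    mismatch = False
except ValueError:
    mismatch = True  # whole-batch tokens vs. subset lengths do not line up

# Splitting with lengths that cover the whole batch succeeds.
chunks = split_by_lengths(hidden_states, all_lengths)
```

The point of the sketch is only that the token tensor and the length metadata must describe the same set of requests before any per-request split or slice.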
Test Plan
Test Result