Revert "[BugFix] Support setting tp=1 for the Eagle draft model to ta…#5903
Revert "[BugFix] Support setting tp=1 for the Eagle draft model to ta…#5903wangxiyuan merged 1 commit intovllm-project:mainfrom
Conversation
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
There was a problem hiding this comment.
Code Review
This pull request reverts a previous commit that added support for setting tp=1 for the Eagle draft model. While most of the changes correctly remove the related logic and test cases, there appears to be a critical issue in vllm_ascend/worker/model_runner_v1.py. The revert removes necessary handling for UniformTypeKVCacheSpecs, which could lead to runtime errors. It's likely that this part of the original commit was a separate bug fix and should not be reverted.
| if isinstance(kv_cache_group.kv_cache_spec, | ||
| EncoderOnlyAttentionSpec): | ||
| continue | ||
| elif isinstance(kv_cache_spec, AttentionSpec): | ||
| elif isinstance(kv_cache_group.kv_cache_spec, AttentionSpec): |
There was a problem hiding this comment.
This part of the revert seems to have incorrectly removed the handling for UniformTypeKVCacheSpecs. Without this logic, if a kv_cache_group.kv_cache_spec is an instance of UniformTypeKVCacheSpecs, it will not be processed in this loop, leading to missing entries in kernel_block_sizes. This will likely cause errors later on.
Additionally, the list comprehension for block_sizes on line 2687 will raise an AttributeError because UniformTypeKVCacheSpecs does not have a block_size attribute.
It seems this part of the original commit was a necessary fix and should not be reverted. Please consider restoring the logic for handling UniformTypeKVCacheSpecs.
kv_cache_spec = kv_cache_group.kv_cache_spec
if isinstance(kv_cache_spec, UniformTypeKVCacheSpecs):
# All layers in the UniformTypeKVCacheSpecs have the same type,
# Pick an arbitrary one to dispatch.
kv_cache_spec = next(
iter(kv_cache_spec.kv_cache_specs.values()))
if isinstance(kv_cache_spec, EncoderOnlyAttentionSpec):
continue
elif isinstance(kv_cache_spec, AttentionSpec):…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (vllm-project#5903) This reverts commit d886b81 - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@bde38c1 Signed-off-by: zhaomingyu <zhaomingyu13@h-partners.com>
…ke effect (#5519)"
This reverts commit d886b81.
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?