
[0.13.0][cherry-pick][bugfix]Synchronize memcache adaptation on A2#5842

Merged
wangxiyuan merged 1 commit into vllm-project:releases/v0.13.0 from DreamerLeader:v0.13.0 on Jan 14, 2026

Conversation

Contributor

@DreamerLeader DreamerLeader commented Jan 13, 2026

What this PR does / why we need it?

When running memcache in the A2 environment, logic for registering memory needs to be added. Additionally, there is a link-establishment conflict between memcache and HCCS during initialization on A2, so the link should be established in advance.
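The "establish the link in advance" fix boils down to issuing a throwaway collective before memcache initialization, so the communication backend sets up its links before memcache tries to. Below is a minimal single-process sketch of that pattern, not the PR's code: it uses the gloo backend on CPU for illustration, whereas the PR issues the dummy all_gather on the NPU device group of an already-initialized vLLM world group.

```python
import os
import torch
import torch.distributed as dist

# Single-process setup purely for illustration; in the PR, the process
# group already exists and the tensors live on "npu".
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29517")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

# Dummy all_gather: the gathered result is discarded; the collective only
# serves to force the backend to establish its communication links early.
tmp_tensor = torch.zeros(1)
output_tensor_list = [
    torch.empty_like(tmp_tensor) for _ in range(dist.get_world_size())
]
dist.all_gather(output_tensor_list, tmp_tensor)

dist.destroy_process_group()
print("links warmed up")
```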

pick-from: #5601

Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>
@DreamerLeader DreamerLeader changed the title from "Synchronize memcache adaptation on A2 to the 0.13.0 branch" to "[bugfix]Synchronize memcache adaptation on A2 to the 0.13.0 branch" on Jan 13, 2026
Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request synchronizes memcache adaptations for A2 hardware. The changes introduce device-specific logic for A2 in memcache_backend.py, improve the robustness of token allocation calculation in pool_scheduler.py, and fix a path parsing issue in pool_worker.py. My review includes two main points: first, I've identified significant code duplication in memcache_backend.py and suggested a refactoring to improve maintainability. Second, I've proposed a more concise and idiomatic way to calculate need_to_allocate in pool_scheduler.py. Both are high-severity suggestions aimed at improving code quality and correctness.

Comment on lines +30 to +51
            soc_version = get_ascend_device_type()
            if soc_version in {AscendDeviceType.A2}:
                import torch
                from vllm.distributed import get_world_group
                tmp_tensor = torch.zeros(1, device="npu")
                output_tensor_list = [
                    torch.empty_like(tmp_tensor)
                    for _ in range(torch.distributed.get_world_size())
                ]
                torch.distributed.all_gather(
                    output_tensor_list,
                    tmp_tensor,
                    group=get_world_group().device_group)
                self.rank = parallel_config.rank
                self.store = DistributedObjectStore()
                res = self.store.init(self.rank)
                assert res == 0
            else:
                self.rank = parallel_config.rank
                self.store = DistributedObjectStore()
                res = self.store.init(self.rank)
                assert res == 0

Severity: high

There is significant code duplication between the if and else blocks. The initialization of self.rank and self.store is identical in both branches. This can be refactored by moving the common code outside the conditional block to improve maintainability and reduce redundancy.

            soc_version = get_ascend_device_type()
            if soc_version in {AscendDeviceType.A2}:
                import torch
                from vllm.distributed import get_world_group
                tmp_tensor = torch.zeros(1, device="npu")
                output_tensor_list = [
                    torch.empty_like(tmp_tensor)
                    for _ in range(torch.distributed.get_world_size())
                ]
                torch.distributed.all_gather(
                    output_tensor_list,
                    tmp_tensor,
                    group=get_world_group().device_group)
            self.rank = parallel_config.rank
            self.store = DistributedObjectStore()
            res = self.store.init(self.rank)
            assert res == 0

Comment on lines +85 to +88
        if num_external_hit_tokens < num_computed_tokens:
            need_to_allocate = 0
        else:
            need_to_allocate = num_external_hit_tokens - num_computed_tokens

Severity: high

The logic to ensure need_to_allocate is not negative can be expressed more concisely using max(0, ...). This improves readability and is a common Python idiom for this pattern.

        need_to_allocate = max(0, num_external_hit_tokens - num_computed_tokens)
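As a quick standalone check (illustrative sketch, not part of the PR; the helper names are hypothetical), the `max(0, ...)` form agrees with the original if/else for every case, including when external hits fall short of already-computed tokens:

```python
def need_to_allocate_branchy(num_external_hit_tokens: int,
                             num_computed_tokens: int) -> int:
    # Original if/else formulation from pool_scheduler.py
    if num_external_hit_tokens < num_computed_tokens:
        return 0
    return num_external_hit_tokens - num_computed_tokens


def need_to_allocate_max(num_external_hit_tokens: int,
                         num_computed_tokens: int) -> int:
    # Suggested idiomatic formulation: clamp the difference at zero
    return max(0, num_external_hit_tokens - num_computed_tokens)


# The two agree on every case, including a negative difference
for hit, computed in [(0, 0), (5, 3), (3, 5), (100, 100)]:
    assert need_to_allocate_branchy(hit, computed) == \
           need_to_allocate_max(hit, computed)
print(need_to_allocate_max(3, 5))  # → 0
```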

@wangxiyuan wangxiyuan added the ready (read for review) and ready-for-test (start test by label for PR) labels on Jan 13, 2026
@wangxiyuan wangxiyuan changed the title from "[bugfix]Synchronize memcache adaptation on A2 to the 0.13.0 branch" to "[0.13.0][cherry-pick][bugfix]Synchronize memcache adaptation on A2" on Jan 14, 2026
@wangxiyuan wangxiyuan merged commit 1d4aaab into vllm-project:releases/v0.13.0 Jan 14, 2026
17 checks passed
@DreamerLeader DreamerLeader deleted the v0.13.0 branch March 14, 2026 07:37

Labels

ready (read for review), ready-for-test (start test by label for PR)

2 participants