[bugfix] A2 Environment Pooling for Memcache Compatibility by DreamerLeader · Pull Request #5601 · vllm-project/vllm-ascend

DreamerLeader · 2026-01-05T03:52:10Z

What this PR does / why we need it?

When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance.

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.13.0
vLLM main: vllm-project/vllm@7157596

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

gemini-code-assist

Code Review

This pull request introduces a bugfix for Memcache compatibility on A2 environments by adding device-specific logic. My review identifies two key areas for improvement: refactoring duplicated code to enhance maintainability and strengthening error handling to prevent masking potential bugs. These changes will make the code cleaner and more robust.

gemini-code-assist · 2026-01-05T03:53:23Z

    def register_buffer(self, ptrs: list[int], sizes: list[int]):
-        pass
+        try:
+            for ptr, size in zip(ptrs, sizes):
+                self.store.register_buffer(ptr, size)
+        except Exception as exc:
+            logger.info(
+                "A3 environment does not require cache registration.: %s", exc)


The try...except Exception block is too broad and can mask unexpected errors. The log message suggests this is intended for a specific case on the "A3 environment", but it will catch any exception on any environment and log it as an info message. This could hide critical bugs during buffer registration on non-A3 environments, leading to a broken state without a clear failure signal.

To make the error handling more robust, you should check the device type and only suppress the exception if it's running on an A3 device. For all other devices, any exception should be logged as an error and re-raised.

Additionally, the parameter sizes in this method's signature overrides lengths from the base Backend class. It's good practice to maintain consistency with the base class signature by using lengths.

def register_buffer(self, ptrs: list[int], lengths: list[int]): soc_version = get_ascend_device_type() try: for ptr, length in zip(ptrs, lengths): self.store.register_buffer(ptr, length) except Exception as exc: if soc_version in {AscendDeviceType.A3}: logger.info( "A3 environment does not require cache registration; " "suppressing exception: %s", exc) else: logger.error("An unexpected error occurred during buffer registration: %s", exc) raise

gemini-code-assist · 2026-01-05T03:53:23Z

+            soc_version = get_ascend_device_type()
+            if soc_version in {AscendDeviceType.A2}:
+                import torch
+                from vllm.distributed import get_world_group
+                tmp_tensor = torch.zeros(1, device="npu")
+                output_tensor_list = [
+                    torch.empty_like(tmp_tensor)
+                    for _ in range(torch.distributed.get_world_size())
+                ]
+                torch.distributed.all_gather(
+                    output_tensor_list,
+                    tmp_tensor,
+                    group=get_world_group().device_group)
+                self.rank = parallel_config.rank
+                self.store = DistributedObjectStore()
+                res = self.store.init(self.rank)
+                assert res == 0
+            else:
+                self.rank = parallel_config.rank
+                self.store = DistributedObjectStore()
+                res = self.store.init(self.rank)
+                assert res == 0


There is significant code duplication between the if and else blocks. The lines for initializing self.rank and self.store are identical in both branches. This can be refactored to reduce redundancy and improve maintainability, which is crucial for preventing bugs where a fix is applied to one copy but not the other.

Additionally, torch is already imported at the top of the file, so the local import torch inside the if block is redundant and should be removed.

soc_version = get_ascend_device_type() if soc_version in {AscendDeviceType.A2}: from vllm.distributed import get_world_group tmp_tensor = torch.zeros(1, device="npu") output_tensor_list = [ torch.empty_like(tmp_tensor) for _ in range(torch.distributed.get_world_size()) ] torch.distributed.all_gather( output_tensor_list, tmp_tensor, group=get_world_group().device_group) self.rank = parallel_config.rank self.store = DistributedObjectStore() res = self.store.init(self.rank) assert res == 0

github-actions · 2026-01-05T07:17:38Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

A PR should do only one thing, smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

…to eplb_refactor * 'main' of https://github.com/vllm-project/vllm-ascend: [CI] Unblock 4-cards test (vllm-project#5831) [Refactor] Provide a framework to accommodate operators for different hardware devices (vllm-project#5735) [Refactor] Modify the binding logic to allocate CPU cores for each NPU card (vllm-project#5555) [BugFix] Support setting tp=1 for the Eagle draft model to take effect (vllm-project#5519) support triton of mrope (vllm-project#5664) [bugfix] A2 Environment Pooling for Memcache Compatibility (vllm-project#5601) [Doc] Update community contributors and versioning naming to follow vLLM (vllm-project#5820) [Refactor] Add comments for Metadata classes in attention module (vllm-project#5789) [Bugfix] bugfix for the order of dummy run pad and sync (vllm-project#5777) [CI] Move nightly-a2 test to hk (vllm-project#5807) [CI] Show disk usage for CI shared volume (vllm-project#5821) Bump actions/checkout from 4 to 6 (vllm-project#5795) Bump actions/github-script from 7 to 8 (vllm-project#5796) [bugfix](cp) align max_context_chunk to cp_virtual_block_size (vllm-project#5767) [bugfix]limit graph replay sync (vllm-project#5761) [CI]Add Kimi k2 nightly test (vllm-project#5682) [Doc] add tls check to pd disaggregation readme (vllm-project#5638) [CI] adpat v0.13.0 change (vllm-project#5793)

…ect#5601) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 --------- Signed-off-by: fangjianwei <f30058701@china.huawei.com> Co-authored-by: fangjianwei <f30058701@china.huawei.com>

…5842) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. pick-from: #5601 Signed-off-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local> Co-authored-by: 房建伟 <fangjianwei@fangjianweideMacBook-Air.local>

…ect#5601) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 --------- Signed-off-by: fangjianwei <f30058701@china.huawei.com> Co-authored-by: fangjianwei <f30058701@china.huawei.com>

…ect#5601) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 --------- Signed-off-by: fangjianwei <f30058701@china.huawei.com> Co-authored-by: fangjianwei <f30058701@china.huawei.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

…ect#5601) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 --------- Signed-off-by: fangjianwei <f30058701@china.huawei.com> Co-authored-by: fangjianwei <f30058701@china.huawei.com>

…ect#5601) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 --------- Signed-off-by: fangjianwei <f30058701@china.huawei.com> Co-authored-by: fangjianwei <f30058701@china.huawei.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>

…ect#5601) ### What this PR does / why we need it? When running memcache in the A2 environment, the logic for registering memory needs to be added. Additionally, there is a link establishment conflict between memcache and HCCS during initialization in A2, so the link should be established in advance. ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@7157596 --------- Signed-off-by: fangjianwei <f30058701@china.huawei.com> Co-authored-by: fangjianwei <f30058701@china.huawei.com>

fangjianwei added 2 commits January 5, 2026 11:49

A2 Environment Pooling for Memcache Compatibility

95ac7a3

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

A2 Environment Pooling for Memcache Compatibility

c1a6754

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

gemini-code-assist bot reviewed Jan 5, 2026

View reviewed changes

fangjianwei added 3 commits January 5, 2026 16:06

A2 Environment Pooling for Memcache Compatibility

20e9652

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

A2 Environment Pooling for Memcache Compatibility

b092e13

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

A2 Environment Pooling for Memcache Compatibility

0761013

Signed-off-by: fangjianwei <f30058701@china.huawei.com>

LCAIZJ added the ready-for-test start test by label for PR label Jan 8, 2026

wangxiyuan added the ready read for review label Jan 8, 2026

DreamerLeader added 4 commits January 9, 2026 09:48

Merge branch 'vllm-project:main' into memcache_A2

29b55e4

Merge branch 'vllm-project:main' into memcache_A2

cf274dd

Merge branch 'vllm-project:main' into memcache_A2

05bcb86

Merge branch 'vllm-project:main' into memcache_A2

3efeb1e

wangxiyuan merged commit db7cf9b into vllm-project:main Jan 13, 2026
20 of 23 checks passed

DreamerLeader mentioned this pull request Jan 13, 2026

[0.13.0][cherry-pick][bugfix]Synchronize memcache adaptation on A2 #5842

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[bugfix] A2 Environment Pooling for Memcache Compatibility#5601

[bugfix] A2 Environment Pooling for Memcache Compatibility#5601
wangxiyuan merged 9 commits intovllm-project:mainfrom
DreamerLeader:memcache_A2

DreamerLeader commented Jan 5, 2026 •

edited by github-actions bot

Loading

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Jan 5, 2026

Uh oh!

gemini-code-assist bot Jan 5, 2026

Uh oh!

github-actions bot commented Jan 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

DreamerLeader commented Jan 5, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Jan 5, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 5, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Jan 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

DreamerLeader commented Jan 5, 2026 •

edited by github-actions bot

Loading