Opensource Ascend NPU Plugin for LMCache #1

Merged
matthewygf merged 2 commits into LMCache:main from matthewygf:main
Aug 18, 2025
@matthewygf matthewygf commented Aug 15, 2025

This PR adds initial support for the Ascend NPU 910b series to LMCache v0.3.3:

  • support for multi_layer and single_layer transfer in the vLLM connector

Changes Made

  1. Added custom CMakeBuild build-system support based on Ascend CANN Toolkit 8.2.
  2. Added kernel support for multi_layer and single_layer transfer.
  3. Added a host-register API to the pybind11 interface for Ascend, because of a limitation in the current Python API.
  4. Extended the MixedMemory and PinnedMemory allocators to use the host-register API.
  5. Monkey-patched init_cache_engine in vllm_adapter to check the config and use the extended classes.
  6. Changed the test_*.py files to use the NPU APIs they need in order to run. All unit tests pass on our platform except the skipped ones, which require kernels that are not yet supported.
  7. Included a Dockerfile for our A2 series.
  8. Introduced a dynamic LMCacheAscendConnector that extends LMCacheConnector for the serve entry point and patches the necessary c_ops and functions.
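As a rough illustration of items 4 and 5, the sketch below shows the general pattern of extending an allocator class and monkey-patching an init function so that it checks the config and uses the extended class. Everything in it is a hypothetical stand-in written for this example: the `vllm_adapter` namespace, the dict-based config and engine, the `host_register` callback, and the class bodies are assumptions, not the actual LMCache or plugin code.

```python
import types


class PinnedMemoryAllocator:
    """Hypothetical stand-in for an upstream pinned-memory allocator."""

    def __init__(self, size: int):
        self.size = size
        self.registered = False


class AscendPinnedMemoryAllocator(PinnedMemoryAllocator):
    """Hypothetical extended allocator that also registers its host buffer."""

    def __init__(self, size: int, host_register):
        super().__init__(size)
        # host_register stands in for a call exposed through pybind11.
        host_register(self)
        self.registered = True


def fake_host_register(allocator: PinnedMemoryAllocator) -> None:
    """Placeholder: a real implementation would register host memory."""


def _original_init_cache_engine(config: dict) -> dict:
    """Stand-in for the unpatched init function."""
    return {"allocator": PinnedMemoryAllocator(config["size"])}


# Stand-in for the module that gets patched (assumption, not the real one).
vllm_adapter = types.SimpleNamespace(
    init_cache_engine=_original_init_cache_engine
)


def apply_ascend_patch(module) -> None:
    """Replace module.init_cache_engine with an NPU-aware wrapper."""
    original = module.init_cache_engine

    def patched(config: dict) -> dict:
        # Non-NPU configs fall through to the original behaviour.
        if config.get("device") != "npu":
            return original(config)
        # Config check performed before any allocation happens.
        if config.get("size", 0) <= 0:
            raise ValueError("config['size'] must be a positive integer")
        engine = original(config)
        # Swap in the extended allocator class.
        engine["allocator"] = AscendPinnedMemoryAllocator(
            config["size"], host_register=fake_host_register
        )
        return engine

    module.init_cache_engine = patched
```

After `apply_ascend_patch(vllm_adapter)`, a call such as `vllm_adapter.init_cache_engine({"device": "npu", "size": 1024})` returns an engine that holds the extended allocator, while any other device value still takes the original path.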

related issue
related pr

Prerequisites

NPU Driver version >= 24.1

CANN Toolkit version >= 8.2

Compatible versions of vllm-ascend and torch-npu installed

We currently support the following vllm-ascend versions:

  • v0.9.2
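The version requirements above could be checked with something like the following sketch. It takes version strings as plain arguments, because how to query the driver or CANN version on a given machine is outside what this PR describes. The function names and the lenient parsing rule (non-numeric parts such as a release-candidate suffix count as 0) are assumptions made for this example.

```python
MIN_DRIVER = "24.1"
MIN_CANN = "8.2"
SUPPORTED_VLLM_ASCEND = ("0.9.2",)


def parse_version(text: str) -> tuple:
    """Turn '24.1.0' or 'v0.9.2' into a tuple of ints.

    Only the leading digits of each dot-separated part are used;
    a part with no leading digits (e.g. 'RC1') counts as 0.
    """
    cleaned = text.strip().lstrip("vV")
    parts = []
    for piece in cleaned.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def _pad(version: tuple, width: int) -> tuple:
    """Right-pad a version tuple with zeros so lengths match."""
    return version + (0,) * (width - len(version))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if installed >= minimum, compared numerically."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    return _pad(a, width) >= _pad(b, width)


def check_prerequisites(driver: str, cann: str, vllm_ascend: str) -> list:
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if not meets_minimum(driver, MIN_DRIVER):
        problems.append(f"NPU driver {driver} is older than {MIN_DRIVER}")
    if not meets_minimum(cann, MIN_CANN):
        problems.append(f"CANN Toolkit {cann} is older than {MIN_CANN}")
    supported = {parse_version(v) for v in SUPPORTED_VLLM_ASCEND}
    if parse_version(vllm_ascend) not in supported:
        problems.append(f"vllm-ascend {vllm_ascend} is not a supported version")
    return problems
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where `"24.10"` would sort before `"24.2"` as text.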

Build Instructions

pip install --no-build-isolation -v -e .

Test Coverage

We ran the unit tests successfully for the main patched classes.

Related PR

The following PR in vllm-ascend is related to LMCache support

Roadmap

We currently support only eager-mode execution. We plan to add the following features in the near future:

  • CacheGen
  • CacheBlend
  • Graph Mode with torch_npu torchair
  • P/D with NPU Transport

gfmyeung and others added 2 commits August 15, 2025 10:51
Signed-off-by: matthewygf <yyygggfff@hotmail.com>

Co-authored-by: Marco Barletta <barlettamarco8@gmail.com>

Co-authored-by: chloroethylene <jjysama@gmail.com>
Signed-off-by: matthewygf <yyygggfff@hotmail.com>
@matthewygf matthewygf assigned YaoJiayi and matthewygf and unassigned YaoJiayi Aug 15, 2025
@YaoJiayi YaoJiayi left a comment

lgtm

@matthewygf matthewygf merged commit 6e6c50b into LMCache:main Aug 18, 2025
chloroethylene pushed a commit that referenced this pull request Jan 5, 2026
* Add build workflow check

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Update Dockerfile paths in build workflow

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Enable submodules in build workflow

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Add Pre-Commit & GitHub actions (#1)

* Add Pre-commit lint & build test workflow action
---------

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Update build-and-test.yml

- because the host-ip variable cannot work here yet, so temporarily pass for the bridgeip

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Add pull-requests permission to workflow

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Remove test report publishing step

Removed the test report publishing step from the workflow.

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Add workflow to publish test results

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* enhance reports xml

* Fix formatting in build-and-test workflow

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Fix formatting in report-test-results.yml

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

* Fix error message formatting in managed_mem.cpp

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

---------

Signed-off-by: Matthew Yeung <yyygggfff@hotmail.com>

2 participants