[PluggableLayer][1/N] Define PluggableLayer #32331
ProExpertProg merged 7 commits into vllm-project:main
Conversation
Code Review
This pull request introduces a new CustomLayer abstraction to handle out-of-tree layer replacements, refactoring this logic out of CustomOp. Existing layers like FusedMoE and MultiHeadLatentAttentionWrapper are updated to use CustomLayer. My review focuses on a critical syntax error and improving the clarity of the new CustomLayer implementation.
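For context, a minimal sketch of what a registry-based pluggable layer can look like. The class name, method names, and attributes below are assumptions used to illustrate the idea, not the actual API introduced by this PR.

```python
# Illustrative sketch only: names and structure are assumptions used to explain
# the idea of a pluggable-layer registry, not the exact API added by this PR.
import torch.nn as nn


class PluggableLayer(nn.Module):
    """Base class for layers that out-of-tree (OOT) plugins may replace."""

    # Maps a layer name to the replacement class registered by an OOT plugin.
    oot_registry: dict[str, type["PluggableLayer"]] = {}

    @classmethod
    def register_oot(cls, layer_cls: type["PluggableLayer"], name: str) -> None:
        """Register layer_cls as the out-of-tree replacement for name."""
        cls.oot_registry[name] = layer_cls

    @classmethod
    def dispatch_cls(cls) -> type["PluggableLayer"]:
        """Return the OOT replacement for this layer if one exists, else cls."""
        return cls.oot_registry.get(cls.__name__, cls)
```

Under this sketch, in-tree call sites would construct layers via something like SomeLayer.dispatch_cls()(...) so an OOT plugin can substitute its own subclass without patching vLLM code; the actual dispatch mechanism in the PR may differ.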
ProExpertProg left a comment
Are we really not going to break OOT overriding of custom ops by removing register_oot from CustomOp?
Let's port the register_oot mechanism over then.
Currently this PR indeed breaks the original register_oot of CustomOp.
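For reference, the out-of-tree override path being discussed looks roughly like the sketch below. The decorator form, import paths, and method names are assumptions about the existing CustomOp.register_oot interface, shown only to make clear what OOT plugins rely on today.

```python
# Rough illustration of the OOT override path under discussion; the decorator
# form and import paths are assumptions, not guaranteed to match current vLLM.
from vllm.model_executor.custom_op import CustomOp
from vllm.model_executor.layers.layernorm import RMSNorm


@CustomOp.register_oot(name="RMSNorm")
class MyPlatformRMSNorm(RMSNorm):
    def forward_oot(self, x, residual=None):
        # A platform-specific kernel would be called here; fall back to the
        # native implementation for the purposes of this sketch.
        return self.forward_native(x, residual)
```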
I will migrate register_oot.
Documentation preview: https://vllm--32331.org.readthedocs.build/en/32331/
Hi @whx-sjtu, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
ProExpertProg left a comment
Looks good to me! Let's see that tests pass, and then could you make a tracking issue with the list of all custom ops that should be ported over (the ones that only have 1 in-tree implementation)? Those with multiple in-tree implementations will be ported over later as we transition them to vLLM IR.
vllm/model_executor/custom_op.py (outdated)
# - MyOp.enabled()
# - op_registry["my_op"].enabled()
op_registry: dict[str, type["CustomOp"]] = {}
op_registry_oot: dict[str, type["CustomOp"]] = {}
nit: type should be CustomOp or PluggableLayer
++, it'll make the workflow more clear
Sure
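For clarity, the change the nit asks for could look roughly like this. Whether the OOT registry holds CustomOp or PluggableLayer subclasses depends on the final design, so the annotations below are only one possible reading, not the code that landed.

```python
# One possible reading of the nit: annotate each registry with the kind of
# class it actually stores, so readers can tell in-tree ops apart from
# out-of-tree replacements. CustomOp and PluggableLayer are defined elsewhere
# in custom_op.py; the final code in the PR may differ.
op_registry: dict[str, type["CustomOp"]] = {}            # in-tree CustomOp subclasses
op_registry_oot: dict[str, type["PluggableLayer"]] = {}  # OOT PluggableLayer replacements
```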
The tracker list is here: #32676. Note that for some custom ops like ...
EDIT: Reverted in #32725, reapplied in #32744
Purpose
First implementation of RFC #23786: Define PluggableLayer and apply to MLA as an example.
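As a rough illustration of the intended plugin-side usage: an OOT platform plugin registers its own MLA wrapper so it is instantiated in place of the in-tree one. Class names, import paths, and the register_oot entry point below are assumptions based on the RFC and the registry sketch earlier in this thread, not the exact API landed in this PR.

```python
# Hypothetical plugin-side usage; import paths and the registration entry
# point are assumptions for illustration only.
from vllm.model_executor.custom_op import PluggableLayer
from vllm.model_executor.layers.mla import MultiHeadLatentAttentionWrapper


class MyPlatformMLAWrapper(MultiHeadLatentAttentionWrapper):
    def forward(self, *args, **kwargs):
        # A platform-specific MLA kernel would be dispatched here.
        return super().forward(*args, **kwargs)


# An OOT platform plugin would run this at import time so the replacement is
# picked up before models are constructed.
PluggableLayer.register_oot(MyPlatformMLAWrapper,
                            name="MultiHeadLatentAttentionWrapper")
```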
Test Plan
New unit tests to be added.
Test Result
All CI should pass.
CC List
@ProExpertProg @wangxiyuan @jgong5 @Yikun
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.