[main][refactor] Refactoring forward_context and model_runner_v1 by zzzzwwjj · Pull Request #1979 · vllm-project/vllm-ascend

zzzzwwjj · 2025-07-24T03:25:11Z

What this PR does / why we need it?

This PR refactors forward_context and model_runner_v1. It adds context that model inference needs into forward_context, and it reworks the dummy_run logic to make it more reasonable.
Details:

- Add `ascend_forward_context`.
- Update the mc2_v2 op and support the `active_mask` param.
- Update the scripts in the examples dir.
- Refactor the `dummy_run` logic.
- Add soc_version for A2 and A3.
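
As a rough illustration of the "forward context" idea described above, the sketch below shows per-forward-pass state being installed by a context manager and read back inside a layer. It is not the code from this PR: the field names (`num_tokens`, `with_prefill`, `soc_version`), the `SocVersion` enum, and `attention_layer` are hypothetical, and `set_forward_context` / `get_forward_context` here are simplified stand-ins that only borrow their names from `vllm.forward_context`.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from enum import Enum
from typing import Iterator, Optional


class SocVersion(Enum):
    """Hypothetical stand-in for the A2/A3 distinction mentioned in the PR."""
    A2 = "A2"
    A3 = "A3"


@dataclass
class ForwardContext:
    """Per-forward-pass state; every field name here is illustrative."""
    num_tokens: int
    with_prefill: bool
    soc_version: SocVersion


# Module-level slot holding the context of the forward pass in progress.
_current: Optional[ForwardContext] = None


@contextmanager
def set_forward_context(ctx: ForwardContext) -> Iterator[None]:
    """Install ctx for one forward pass, then restore the previous value."""
    global _current
    previous, _current = _current, ctx
    try:
        yield
    finally:
        _current = previous


def get_forward_context() -> ForwardContext:
    """Return the active context; fail loudly outside a forward pass."""
    if _current is None:
        raise RuntimeError("forward context is not set")
    return _current


def attention_layer() -> str:
    # A layer reads what it needs from the context instead of extra arguments.
    ctx = get_forward_context()
    mode = "prefill" if ctx.with_prefill else "decode"
    return f"{mode}:{ctx.num_tokens}:{ctx.soc_version.value}"


if __name__ == "__main__":
    with set_forward_context(ForwardContext(1, False, SocVersion.A3)):
        print(attention_layer())
```

The appeal of this pattern is that new per-step information can be added to the context object without changing the signature of every layer on the call path.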

Does this PR introduce any user-facing change?

No user-facing changes.

How was this patch tested?

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@57c22e5

ApsarasX · 2025-07-24T06:26:06Z

This PR is too large. Could you split it into several smaller PRs?

zzzzwwjj · 2025-07-24T09:17:03Z

This PR is too large. Could you split it into several smaller PRs?

A little difficult😂

github-actions · 2025-07-24T11:38:39Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: zzzzwwjj <1183291235@qq.com>

codecov · 2025-07-25T03:57:15Z

Codecov Report

❌ Patch coverage is 55.34884% with 96 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.11%. Comparing base (df0ec55) to head (fb450e2).
⚠️ Report is 620 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vllm_ascend/ascend_forward_context.py | 45.61% | 31 Missing ⚠️ |
| vllm_ascend/quantization/w8a8_dynamic.py | 14.70% | 29 Missing ⚠️ |
| vllm_ascend/distributed/parallel_state.py | 45.83% | 13 Missing ⚠️ |
| vllm_ascend/ops/fused_moe.py | 80.76% | 10 Missing ⚠️ |
| vllm_ascend/utils.py | 46.66% | 8 Missing ⚠️ |
| vllm_ascend/models/deepseek_dbo.py | 0.00% | 3 Missing ⚠️ |
| tests/ut/ops/test_fused_ops.py | 87.50% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1979      +/-   ##
==========================================
- Coverage   71.73%   71.11%   -0.62%     
==========================================
  Files          96       98       +2     
  Lines       10719    10857     +138     
==========================================
+ Hits         7689     7721      +32     
- Misses       3030     3136     +106

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `71.11% <55.34%> (-0.62%)` | ⬇️ |


MengqingCao · 2025-07-25T06:05:44Z

examples/data_parallel.py

We already have a dp example now: https://github.com/vllm-project/vllm-ascend/blob/main/examples/offline_data_parallel.py

vllm_ascend/worker/worker_v1.py

wangxiyuan · 2025-07-25T08:49:51Z

vllm_ascend/ascend_forward_context.py

Unit tests for ascend_forward_context and parallel_state are missing.

wangxiyuan · 2025-07-25T08:51:25Z

vllm_ascend/attention/attention_v1_torchair.py

num_input_tokens is useless.

wangxiyuan · 2025-07-25T08:55:07Z

vllm_ascend/utils.py

What is AscendSocVersion.MAX? How will it be used?

github-actions · 2025-07-26T00:22:33Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

wangxiyuan · 2025-07-27T00:54:26Z

vllm_ascend/ascend_forward_context.py

+from vllm.config import VllmConfig
+from vllm.distributed import get_dp_group, get_ep_group, get_tp_group
+from vllm.forward_context import get_forward_context, set_forward_context
+from vllm.platforms import current_platform


Avoid using current_platform in vllm-ascend. It can lead to circular imports in some cases. Use NPUPlatform directly.
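
The general hazard behind this comment can be reproduced with two throwaway modules. The sketch below is an assumption-laden illustration, not vLLM's actual platform resolution: `demo_host` and `demo_plugin` are hypothetical module names written to a temporary directory, where the host imports the plugin while it is still initialising. A plugin that imports `current_platform` back from the host at module level fails, while a plugin that refers to its own platform class directly imports cleanly.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Stand-in for a host package that resolves its platform by importing a plugin.
HOST = """
    import demo_plugin                      # plugin is loaded during host import
    current_platform = demo_plugin.NPUPlatform()
"""

# Plugin that reaches back into the host at module level: a circular import.
PLUGIN_VIA_HOST = """
    from demo_host import current_platform  # host is only partially initialised

    class NPUPlatform:
        device_name = "npu"
"""

# Plugin that uses its own platform class directly: no cycle.
PLUGIN_DIRECT = """
    class NPUPlatform:
        device_name = "npu"

    def device_name() -> str:
        return NPUPlatform.device_name
"""


def try_import(plugin_source: str) -> str:
    """Import demo_host with the given plugin source and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_host.py").write_text(textwrap.dedent(HOST))
        Path(tmp, "demo_plugin.py").write_text(textwrap.dedent(plugin_source))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_host")
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Undo every side effect so the two scenarios stay independent.
            sys.path.remove(tmp)
            for name in ("demo_host", "demo_plugin"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print("via host:", try_import(PLUGIN_VIA_HOST))
    print("direct:  ", try_import(PLUGIN_DIRECT))
```

Whether a real cycle occurs depends on import order, which is why the reviewer says "in some cases"; referring to the concrete class removes the dependency on that order.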

wangxiyuan · 2025-07-27T00:57:38Z

vllm_ascend/quantization/quantizer.py


 from vllm.logger import logger

-from .func_wrapper import (wrapper_load_model, wrapper_rmsnorm_forward_oot,


The wrapper_load_model function can be removed as well.

wangxiyuan · 2025-07-27T01:01:36Z

vllm_ascend/worker/worker_v1.py

-        self.model_runner._dummy_run(max_num_tokens,
-                                     is_compile=False,
-                                     with_prefill=with_prefill)
+        self.model_runner._dummy_run(1)


@ApsarasX @jianzs please double check this change. Thanks

Yikun · 2025-07-28T01:41:00Z

Please fill in the commit message and add how this was tested.

zzzzwwjj · 2025-07-28T03:40:13Z

Added an issue to track the missing unit tests: #2056

zzzzwwjj · 2025-07-28T04:23:19Z

Please fill in the commit message and add how this was tested.

done.


@zzzzwwjj

I'd like to nominate @zzzzwwjj, @realliujiaxu, and @LCAIZJ to join the vLLM Ascend committer team.

@zzzzwwjj
---
- Review Quality: He has completed 80+ reviews since April 2025, including high-quality reviews such as #3232 (comment), #4822 (comment), and #4768 (comment).
- Sustained Contributions: 15+ valuable bug fixes and refactors (https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Azzzzwwjj+is%3Aclosed+review%3Aapproved) and continuous optimization of the code architecture (https://github.com/vllm-project/vllm-ascend/pulls?q=author%3Azzzzwwjj+is%3Amerged).
- Quality Contribution: #1229 #1979 #4359 #4878
- Community Involvement: He led #1147, the first effort to refactor AscendFusedMoE. He presented on large-scale distributed inference and reinforcement learning at the vLLM-Ascend meetup on August 2nd.

@realliujiaxu
---
- Review Quality: He has completed about [40+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Arealliujiaxu+-author%3Arealliujiaxu+) since September, including #4868 (comment) and #2275 (comment).
- Sustained Contributions: He has completed [17 commits](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged), continuously optimizing the performance of MoE models.
- Quality Contribution: He contributed the Flash Comm1 feature to the community, supporting both eager and aclgraph execution modes and compatible with multiple MoE models including DeepSeek and GLM4.5.
  - #3334
  - #3420
  - #3015
  - As co-author: #3495, #4868
- Community Involvement:
  1. Completed two major refactors, enabling vllm-ascend to evolve more rapidly and robustly: [Linear module](#2867) and [rejection sampler](#4975).
  2. [Fixed 8 bugs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Arealliujiaxu+is%3Amerged+bugfix+) in graph mode, spec decoding, and async scheduling.

@LCAIZJ
---
- Review Quality: He's been the go-to reviewer for virtually all PD disaggregation and KV Pool related PRs, having completed [30+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3ALCAIZJ+is%3Aopen+-author%3ALCAIZJ+) since May 2025. Notable examples include [discussion_r2553887360](#4345 (comment)), [issuecomment-3540994801](#4161 (comment)), and [discussion_r2492593988](#3981 (comment)), all demonstrating thorough and insightful feedback.
- Sustained and Quality Contributions: His contributions reflect a strong grasp of both the vLLM and vLLM Ascend codebases, particularly in the prefill-decode disaggregation and KV pool areas ([7 PRs merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+)).
  - Prefill-Decode Disaggregation: Delivered KV transfer functionality using Mooncake TransferEngine and enabled layerwise KV transfer. #1568 #2602
  - KV Pool: Developed the foundational KV Pool infrastructure and migrated it to the latest ADXL stack. #2913 #3350
- Quality Contribution: #1568 #2602 #2913 #3350
- Community Involvement: He actively responds to [community issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3ALCAIZJ%20is%3Aopen%20-author%3ALCAIZJ), continuously monitors functionality and accuracy issues related to PD disaggregation and KV Pool, and proactively delivers [bug fixes](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3ALCAIZJ+is%3Amerged+bugfix).

- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>


github-actions bot added module:ops module:core module:quantization labels Jul 24, 2025

zzzzwwjj force-pushed the main branch from 9a0f883 to 89abf95 Compare July 24, 2025 06:23

zzzzwwjj force-pushed the main branch 4 times, most recently from fdc2db9 to 8b4c2a2 Compare July 24, 2025 09:04

github-actions bot added the module:tests label Jul 24, 2025

zzzzwwjj force-pushed the main branch from 8b4c2a2 to 039de30 Compare July 24, 2025 09:18

github-actions bot added the merge-conflicts label Jul 24, 2025

[main][refactor] Refactoring forward_context and model_runner_v1

e82e946

Signed-off-by: zzzzwwjj <1183291235@qq.com>

zzzzwwjj force-pushed the main branch from 039de30 to e82e946 Compare July 24, 2025 14:51

github-actions bot removed the merge-conflicts label Jul 25, 2025

zzzzwwjj force-pushed the main branch 3 times, most recently from c62fd91 to 3b7424e Compare July 25, 2025 03:37

MengqingCao reviewed Jul 25, 2025


zzzzwwjj force-pushed the main branch from 3b7424e to ca42e65 Compare July 25, 2025 06:56

wangxiyuan reviewed Jul 25, 2025


zzzzwwjj force-pushed the main branch from ca42e65 to 28519e9 Compare July 25, 2025 13:20

wangxiyuan mentioned this pull request Jul 25, 2025

[2/N][Refactor] Refactor V1 attention for better extensibility #1995

Merged

github-actions bot added the merge-conflicts label Jul 26, 2025

Merge branch 'main' into main

a62eb0e

zzzzwwjj force-pushed the main branch from 28519e9 to a62eb0e Compare July 26, 2025 08:03

wangxiyuan reviewed Jul 27, 2025


github-actions bot removed the merge-conflicts label Jul 27, 2025

Merge branch 'main' into ascend_forward_context_refactor

fb450e2

zzzzwwjj force-pushed the ascend_forward_context_refactor branch from 18c31f8 to fb450e2 Compare July 27, 2025 14:18

ganyi1996ppo approved these changes Jul 28, 2025


wangxiyuan merged commit ba3dfbd into vllm-project:main Jul 28, 2025
25 checks passed

wangxiyuan mentioned this pull request Dec 18, 2025

Nominate new maintainers @zzzzwwjj @realliujiaxu @LCAIZJ #5152

Merged



6 participants
