[Test][e2e][LoRA] Add more e2e tests to cover scenarios of LoRA #4075
paulyu12 merged 2 commits into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Code Review
This pull request adds end-to-end tests for LoRA scenarios. The changes are mostly good, but I've found a high-severity issue in tests/e2e/singlecard/test_qwen3_multi_loras.py related to test isolation. The test uses mutable global state, which can lead to flaky and hard-to-maintain tests. I've provided a refactoring suggestion to address this by encapsulating the state within the test function. This will make the test self-contained and robust.
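The isolation fix suggested in the review can be sketched as follows. This is an illustrative example only: the function and variable names (`run_multi_lora_case`, `expected_lora_output`) are hypothetical and do not come from `tests/e2e/singlecard/test_qwen3_multi_loras.py`, and a stub stands in for the real model call.

```python
def run_multi_lora_case(generate):
    """Sketch of a self-contained test body.

    The expected outputs live inside the function rather than in a
    module-level mutable global, so each run starts from a clean state
    and the test stays order-independent.
    """
    expected_lora_output = [
        "reference output for adapter 1",
        "reference output for adapter 2",
    ]
    # Call the (stubbed) generation function once per adapter.
    actual = [generate(i) for i in range(len(expected_lora_output))]
    return actual == expected_lora_output


# Stub generator standing in for the real LLM/LoRA inference call.
fake_outputs = {
    0: "reference output for adapter 1",
    1: "reference output for adapter 2",
}
result = run_multi_lora_case(lambda i: fake_outputs[i])
```

Because nothing outside the function is mutated, running the test repeatedly or alongside other tests cannot leak state between cases.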
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Force-pushed from a0b213b to b2a1cb8
This pull request has conflicts, please resolve those before we can evaluate the pull request.
@paulyu12 any update?
This CI issue was introduced by #4168. I tried following your instruction ("Don't make LoRA scenario go into this…"), and I am still working on it.
@wxsIcey please take a look
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Signed-off-by: paulyu12 <507435917@qq.com>
Force-pushed from 142c570 to 825e2d2
What this PR does / why we need it?
This PR depends on PR #4046 and will only work once that PR is merged.
This PR aims to solve issue #3240.
The newly added Llama-2-7b-hf and Qwen3-0.6B test cases cover the scenarios in which LoRA weights are added to the q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj, embed_tokens, and lm_head modules.
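For reference, the module coverage described above can be expressed as a target-module list in the style of a LoRA adapter configuration. This is a sketch only: the variable names are illustrative and the actual adapter configs used by the tests may differ.

```python
# Illustrative list of the modules the new test cases exercise; the
# grouping comments reflect the usual roles of these projections in a
# decoder-only transformer.
lora_target_modules = [
    "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
    "gate_proj", "up_proj", "down_proj",     # MLP projections
    "embed_tokens", "lm_head",               # embedding and output head
]

# A coverage check would verify that every required module is targeted.
required = {
    "q_proj", "v_proj", "k_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
    "embed_tokens", "lm_head",
}
missing = required - set(lora_target_modules)
```

An empty `missing` set confirms that the adapter touches all nine module types named in the PR description.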
Does this PR introduce any user-facing change?
No.
How was this patch tested?
pytest -sv tests/e2e/singlecard/test_llama2_lora.py
pytest -sv tests/e2e/singlecard/test_qwen3_multi_loras.py