[Docs][Model] Support Qwen3-VL-Embedding & Qwen3-VL-Reranker#6034
wangxiyuan merged 5 commits into vllm-project:main
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Code Review
This pull request adds documentation for the Qwen3-VL-Embedding and Qwen3-VL-Reranker models. The changes are good and provide useful examples for users. My review includes a few suggestions to improve the correctness of code examples and fix broken links to provide a better user experience. Specifically, I've pointed out a prompt formatting issue, a placeholder path that needs clarification, a character that breaks a URL, and incorrect links to the benchmark documentation.
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@wangxiyuan @Yikun This PR is ready. Could you please take a look? More and more users are asking for these models.
…to FIA_rebase
* 'main' of https://github.com/vllm-project/vllm-ascend: (24 commits)
  add dispath_ffn_combine_bf16 (vllm-project#5866)
  [BugFix] Fix input parameter bug of dispatch_gmm_combine_decode[RFC: issue 5476] (vllm-project#5932)
  [1/N][Feat] Xlite Qwen3 MoE Support (vllm-project#5951)
  [Bugfix] Fix setting of `speculative_config.enforce_eager` for dsv32 (vllm-project#5945)
  [bugfix][mm] change get_num_encoder_tokens to get_num_encoder_embeds in recompute_schedule.py (vllm-project#5132)
  [Bugfix] fix pcp qwen full graph FIA bug (vllm-project#6037)
  [Bugfix]Fixed precision issues caused by pooled request pooling (vllm-project#6049)
  【main】【bugfix】Resolved memory deallocation failure in the pooling layer under re-computation workloads. (vllm-project#6045)
  [main][Bugfix] Fixed an problem related to embeddings sharing (vllm-project#5967)
  [Feature]refactor the npugraph_ex config, support online-infer with static kernel (vllm-project#5775)
  [CI][Lint] Show lint diff on failure (vllm-project#5956)
  [CI] Add wait logic for each individual case (vllm-project#6036)
  [CI] Add DeepSeek-V3.2-W8A8 nightly ci test (vllm-project#4633)
  model runner v2 support triton of penalty (vllm-project#5854)
  [Docs][Model] Support Qwen3-VL-Embedding & Qwen3-VL-Reranker (vllm-project#6034)
  [Tests] move qwen3 performance test from nightly to e2e (vllm-project#5980)
  [Bugfix] fix bug of pcp+mtp+async scheduler (vllm-project#5994)
  [Main2Main] Upgrade vllm commit to releases/v0.14.0 (vllm-project#5988)
  [Ops] Add layernorm for qwen3Next (vllm-project#5765)
  [Doc] Add layer_sharding additional config for DeepSeek-V3.2-W8A8 (vllm-project#5921)
  ...
…oject#6034) ### What this PR does / why we need it? Add docs for Qwen3-VL-Embedding & Qwen3-VL-Reranker. - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@2c24bc6 --------- Signed-off-by: gcanlin <canlinguosdu@gmail.com>
…oject#6034) ### What this PR does / why we need it? Add docs for Qwen3-VL-Embedding & Qwen3-VL-Reranker. - vLLM version: v0.13.0 - vLLM main: vllm-project/vllm@2c24bc6 --------- Signed-off-by: gcanlin <canlinguosdu@gmail.com> Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
What this PR does / why we need it?
Add docs for Qwen3-VL-Embedding & Qwen3-VL-Reranker.
Does this PR introduce any user-facing change?
How was this patch tested?