Revert "[Model] Deprecate the score task (this will not affect users)." (#37537) #37726
zhewenl wants to merge 1 commit into vllm-project:main
This reverts commit ed359c4.
Documentation preview: https://vllm--37726.org.readthedocs.build/en/37726/
Code Review
This pull request reverts the deprecation of the score task, which was causing a CI failure. The changes correctly restore the score task and its related logic across documentation, tests, and core components. The refactoring done as part of this revert, such as renaming variables for clarity in pooling heads, is a good improvement. I've found one issue where the sagemaker router is missing support for the token_embed task for scoring, which I've commented on.
    (RerankRequest, (rerank, do_rerank)),
]

if "score" in supported_tasks or "embed" in supported_tasks:
The condition to enable the ScoreRequest endpoint is missing the token_embed task. The score API supports late-interaction models which use the token_embed task. This should be included to ensure full functionality on the Sagemaker endpoint, consistent with other entrypoints.
Suggested change:
- if "score" in supported_tasks or "embed" in supported_tasks:
+ if "score" in supported_tasks or "embed" in supported_tasks or "token_embed" in supported_tasks:
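As a rough illustration of the suggested condition, the three membership checks can be collapsed into a single set test. This is only a sketch: `SCORE_CAPABLE_TASKS` and `score_endpoint_enabled` are hypothetical names that do not appear in the PR, and `supported_tasks` is assumed to be an iterable of task-name strings.

```python
# Hypothetical helper for illustration; not part of the vLLM codebase.
# Tasks assumed (per the review comment) to be able to back the score endpoint.
SCORE_CAPABLE_TASKS = frozenset({"score", "embed", "token_embed"})


def score_endpoint_enabled(supported_tasks):
    """Return True if at least one supported task can serve score requests."""
    # isdisjoint() is True when the two collections share no element,
    # so its negation means "at least one task matches".
    return not SCORE_CAPABLE_TASKS.isdisjoint(supported_tasks)


# A late-interaction model that only reports token_embed is still enabled.
print(score_endpoint_enabled(["token_embed"]))
```

Keeping the task names in one collection would make it harder for a single entrypoint to drift out of sync with the others, though whether that fits the router code is a judgment call for the maintainers.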
We need this PR to unblock the v2 runner. I will investigate this CI failure and fix it ASAP; please do not revert it.
I suspect it was caused by #37613, because the test below passed successfully. Let's rerun this test. All of these tests passed, so the failure might be caused by another PR.
Closing this PR, as this issue should be fixed by #37775.
Revert of #37537
This reverts #37537 (merge commit ed359c4).
Reason: This PR is linked to 1 new CI failure in build #57332:
nvidia/llama-nemotron-rerank-1b-v2 rerank MTEB score dropped marginally (diff=0.0023 vs atol=0.002), causing test_rerank_models_mteb[model_info0] to fail. The PR changed pooler heads, activations, and scoring-related code, which directly affects the reranking pipeline.
Auto-generated by CI failure analyzer.
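The reported failure comes down to an absolute-tolerance comparison. A minimal sketch of that check follows, assuming the test passes only when the absolute score difference is at most `atol`. `within_atol` is a hypothetical helper, not the function the MTEB test actually calls.

```python
def within_atol(diff: float, atol: float) -> bool:
    """Return True if the absolute difference stays within the tolerance."""
    return abs(diff) <= atol


# Values reported in the CI failure analysis: diff=0.0023, atol=0.002.
print(within_atol(0.0023, 0.002))  # False: the drop exceeds the tolerance
```

Under this assumption the score missed the threshold by only 0.0003, which is consistent with the description of the drop as marginal.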