[vllm, fully_async] fix: clamp max_tokens to response_length instead of max_model_len - prompt_len in async vLLM rollout #5505

Closed
Silas-11 wants to merge 2 commits into verl-project:main from Silas-11:release
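
The PR title describes clamping the generation budget to the configured response_length rather than the full remaining context window. The sketch below is a hypothetical illustration of that clamping logic, not the actual verl patch; the function name and parameters (compute_max_tokens, prompt_len, max_model_len, response_length) are assumptions for illustration.

```python
# Hypothetical sketch of the clamping behavior described in the PR title;
# not the actual verl/vLLM code.

def compute_max_tokens(prompt_len: int, max_model_len: int, response_length: int) -> int:
    """Compute the max_tokens to request from the rollout engine.

    Assumed previous behavior: max_tokens = max_model_len - prompt_len,
    which can exceed the rollout's configured response_length.
    Fixed behavior: cap generation at response_length while still
    respecting the remaining context window.
    """
    remaining_context = max_model_len - prompt_len
    return max(0, min(response_length, remaining_context))


# Example: a 30k-token prompt in a 32k-context model with response_length=1024
# yields 1024, not 2048.
print(compute_max_tokens(prompt_len=30_000, max_model_len=32_768, response_length=1024))
```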