Commit c823c29

ganyi1996ppo authored and Angazenn committed
Implementation of simple load balance routing proxy server (vllm-project#1953) (vllm-project#2124)
### What this PR does / why we need it?

This PR is a cherry-pick of vllm-project#1953 from v0.9.1. It introduces a new load-balancing proxy server example for disaggregated PD. Compared with the original round-robin toy_proxy, it supports a simple token- and kv_cache-aware load-balancing routing strategy for the disaggregated PD system.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested on a real workload and with unit tests.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@ad57f23

---------

Signed-off-by: ganyi <[email protected]>
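The contrast the commit message draws between round-robin and load-aware routing can be illustrated with a minimal sketch. This is not the code from this commit: the `Server` class, its `active_tokens` and `active_kv_cache` counters, and the `route` function are hypothetical names chosen for illustration, and the scoring rule (pick the instance with the smallest counter) is an assumption about what "token&kv_cache aware" might look like in its simplest form.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical bookkeeping for one prefill or decode instance."""
    name: str
    active_tokens: int = 0    # prompt tokens currently in flight (assumed metric)
    active_kv_cache: int = 0  # KV-cache load currently held (assumed metric)


def route(prefillers, decoders, prompt_tokens):
    """Pick the least-loaded prefiller and decoder for one request.

    Unlike round robin, the choice depends on current load, so a server
    that is busy with a long prompt is skipped until it catches up.
    """
    # Prefill cost is assumed to scale with the number of prompt tokens.
    prefiller = min(prefillers, key=lambda s: s.active_tokens)
    prefiller.active_tokens += prompt_tokens

    # Decode cost is assumed to scale with the KV cache the request occupies.
    decoder = min(decoders, key=lambda s: s.active_kv_cache)
    decoder.active_kv_cache += prompt_tokens
    return prefiller.name, decoder.name


def release(prefiller, decoder, prompt_tokens):
    """Return the load to the pool once a request has finished."""
    prefiller.active_tokens -= prompt_tokens
    decoder.active_kv_cache -= prompt_tokens


prefillers = [Server("p0"), Server("p1")]
decoders = [Server("d0"), Server("d1")]

first = route(prefillers, decoders, 100)   # long prompt
second = route(prefillers, decoders, 10)   # short prompt
third = route(prefillers, decoders, 10)    # short prompt
```

In this sketch the third request goes to `p1` and `d1` again, because they still carry less load than `p0` and `d0`; a round-robin proxy would have sent it back to the instances handling the long prompt.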
1 parent ec32b99 commit c823c29

2 files changed (+518, -275 lines)
