Skip to content

[rollout] fix: correct heap-based load balancing in AsyncLLMServerManager#4505

Merged
wuxibin89 merged 1 commit intoverl-project:mainfrom
hellcatCS:fix_server_shuffling
Dec 15, 2025
Merged

[rollout] fix: correct heap-based load balancing in AsyncLLMServerManager#4505
wuxibin89 merged 1 commit intoverl-project:mainfrom
hellcatCS:fix_server_shuffling

Conversation

@hellcatCS
Copy link
Contributor

What does this PR do?

This PR fixes the load balancing issue in AsyncLLMServerManager where the heap-based server selection was using hash values instead of indices, causing unpredictable server selection order after shuffling.

Problem:

The original implementation used hash(server) as the secondary sort key in the heap
When all servers had the same request count (0), the heap would select the server with the minimum hash value, not the first server in the shuffled list
This resulted in poor load distribution and defeated the purpose of random shuffling

Solution:

Replace hash(server) with explicit indices in the heap structure
Heap now sorts by (request_count, index, server) instead of (request_count, hash, server)
Ensures deterministic selection: when request counts are equal, the server with the lowest index (first in the shuffled list) is always chosen

Example:

Before (❌ Broken):

server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles) # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, hash(s), s] for s in server_handles]

Heap might select Server_A first (min hash), not Server_C!

After (✅ Fixed):

server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles) # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, idx, s] for idx, s in enumerate(server_handles)]

Heap correctly selects Server_C first (idx=0)

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly fixes a load balancing issue in AsyncLLMServerManager by replacing the unpredictable hash(server) with a deterministic index as a tie-breaker in the server selection heap. The change ensures that servers are selected in a predictable, shuffled order when their request counts are equal. My review includes a suggestion to refactor the heap update logic in _choose_server to be more idiomatic and readable, improving maintainability.

Comment on lines +84 to 87
_, _, server = self.weighted_serveres[0]
self.weighted_serveres[0][0] += 1
heapq.heapreplace(self.weighted_serveres, self.weighted_serveres[0])
self.request_id_to_server[request_id] = server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation for updating the heap is a bit obscure. It modifies the top element of the heap in-place, which violates the heap invariant, and then uses heapq.heapreplace to restore it. While this works, it's not very readable and relies on implementation details of heapq.heapreplace.

A more idiomatic and clearer approach would be to use heapq.heappop followed by heapq.heappush. This makes the intent of the code much more explicit and improves maintainability.

Suggested change
_, _, server = self.weighted_serveres[0]
self.weighted_serveres[0][0] += 1
heapq.heapreplace(self.weighted_serveres, self.weighted_serveres[0])
self.request_id_to_server[request_id] = server
count, idx, server = heapq.heappop(self.weighted_serveres)
heapq.heappush(self.weighted_serveres, [count + 1, idx, server])
self.request_id_to_server[request_id] = server

@hellcatCS
Copy link
Contributor Author

hellcatCS commented Dec 13, 2025

@vermouth1992 @wuxibin89 @PeterSH6 @chenhaiq @zhaochenyang20 @SwordFaith please review the request

@wuxibin89 wuxibin89 merged commit 36c2d4a into verl-project:main Dec 15, 2025
74 of 77 checks passed
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026
…ager (verl-project#4505)

### What does this PR do?

This PR fixes the load balancing issue in AsyncLLMServerManager where
the heap-based server selection was using hash values instead of
indices, causing unpredictable server selection order after shuffling.

Problem:

The original implementation used hash(server) as the secondary sort key
in the heap
When all servers had the same request count (0), the heap would select
the server with the minimum hash value, not the first server in the
shuffled list
This resulted in poor load distribution and defeated the purpose of
random shuffling

Solution:

Replace hash(server) with explicit indices in the heap structure
Heap now sorts by (request_count, index, server) instead of
(request_count, hash, server)
Ensures deterministic selection: when request counts are equal, the
server with the lowest index (first in the shuffled list) is always
chosen

Example: 

# Before (❌ Broken):
server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles)  # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, hash(s), s] for s in server_handles]
# Heap might select Server_A first (min hash), not Server_C!

# After (✅ Fixed):
server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles)  # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, idx, s] for idx, s in enumerate(server_handles)]
# Heap correctly selects Server_C first (idx=0)

Co-authored-by: Aleksandr Semikin <aesemikin@alice-a100.sas.yp-c.yandex.net>
sophiayyya pushed a commit to sophiayyya/verl that referenced this pull request Jan 25, 2026
…ager (verl-project#4505)

### What does this PR do?

This PR fixes the load balancing issue in AsyncLLMServerManager where
the heap-based server selection was using hash values instead of
indices, causing unpredictable server selection order after shuffling.

Problem:

The original implementation used hash(server) as the secondary sort key
in the heap
When all servers had the same request count (0), the heap would select
the server with the minimum hash value, not the first server in the
shuffled list
This resulted in poor load distribution and defeated the purpose of
random shuffling

Solution:

Replace hash(server) with explicit indices in the heap structure
Heap now sorts by (request_count, index, server) instead of
(request_count, hash, server)
Ensures deterministic selection: when request counts are equal, the
server with the lowest index (first in the shuffled list) is always
chosen

Example: 

# Before (❌ Broken):
server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles)  # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, hash(s), s] for s in server_handles]
# Heap might select Server_A first (min hash), not Server_C!

# After (✅ Fixed):
server_handles = [Server_A, Server_B, Server_C]
random.shuffle(server_handles)  # → [Server_C, Server_A, Server_B]
weighted_servers = [[0, idx, s] for idx, s in enumerate(server_handles)]
# Heap correctly selects Server_C first (idx=0)

Co-authored-by: Aleksandr Semikin <aesemikin@alice-a100.sas.yp-c.yandex.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants