Conversation

@whybeyoung
Collaborator

@whybeyoung whybeyoung commented May 27, 2025

The original while True polling loop, which called transfer_queue.get(timeout=0.01) on every iteration, caused CPU overhead under high concurrency. To reduce CPU usage, this PR replaces it with a deque combined with a Condition, so the consumer thread waits for a notification instead of waking on a timeout.

-            while True:
-                try:
-                    kv_chunk: TransferKVChunk = self.transfer_queue.get(timeout=0.01)
-                    reqs_to_be_processed = (
-                        self.transfer_infos[kv_chunk.room].values()
-                        if kv_chunk.room in self.transfer_infos

See the commit ..
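
For readers unfamiliar with the pattern, here is a minimal sketch of a deque guarded by a Condition. It is not the code from this PR: the names BlockingDeque, worker, and run_demo are hypothetical and exist only for illustration, and the None sentinel is an assumption used to stop the demo thread.

```python
import threading
from collections import deque


class BlockingDeque:
    """Hypothetical helper (not from the PR): a deque guarded by a Condition."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            # Wake one consumer that is blocked in get().
            self._cond.notify()

    def get(self):
        with self._cond:
            # Re-check in a loop: a wakeup does not guarantee the deque
            # is still non-empty (spurious wakeups, competing consumers).
            while not self._items:
                self._cond.wait()
            return self._items.popleft()


def worker(q, results):
    """Consume items until the None sentinel arrives (sentinel is an assumption)."""
    while True:
        chunk = q.get()  # blocks without a polling timeout
        if chunk is None:
            break
        results.append(chunk)


def run_demo():
    q = BlockingDeque()
    results = []
    t = threading.Thread(target=worker, args=(q, results))
    t.start()
    for i in range(3):
        q.put(i)
    q.put(None)  # ask the worker to exit
    t.join()
    return results
```

In this sketch the consumer sleeps inside wait() until put() calls notify(), so an idle worker does not wake every 10 ms the way the timeout-based loop above does.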

CC @ByronHsu @ShangmingCai @zhyncs @fzyzcjy

@zhyncs
Member

zhyncs commented May 27, 2025

@whybeyoung Could you help fix the conflicts? Quick question: can you also share some performance results?

@zhyncs zhyncs merged commit 6b23132 into sgl-project:main May 28, 2025
30 of 40 checks passed
ChangyiYang pushed a commit to ChangyiYang/sglang-changyi that referenced this pull request May 29, 2025
Signed-off-by: Shangming Cai <[email protected]>
Co-authored-by: Shangming Cai <[email protected]>
Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025
xwu-intel pushed a commit to xwu-intel/sglang that referenced this pull request Jun 17, 2025