Fix scheduler yield on arm #30228
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Code Review
This pull request addresses a critical issue on ARM systems where os.sched_yield fails to relinquish the Global Interpreter Lock (GIL), leading to CPU-bound performance problems. The change correctly modifies the USE_SCHED_YIELD logic to fall back to time.sleep(0) on ARM architectures, ensuring proper GIL release and improving system responsiveness. The addition of the CpuArchEnum and Platform imports is appropriate for this detection. The code is clear and directly resolves the described problem.
Does this fix #29369?

@tlrmchlsmth can you check this on gb200?

I or someone on my team will look into this, but I'm not sure what we should look out for. What should we expect to see if the

@heheda12345 @robertgshaw2-redhat @tlrmchlsmth Sorry for the late reply. When running vLLM with Reproduce command: I think this can be reproduced on GH200 as well.

@tlrmchlsmth would you mind taking a look at this one? Thanks.



Purpose
On Arm systems, os.sched_yield does not take effect: the GIL (Global Interpreter Lock) is not released, which results in CPU-bound issues. The process should call time.sleep(0) instead to release the GIL.
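The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: it assumes the standard library's platform.machine() as a stand-in for vLLM's Platform.get_cpu_architecture() and CpuArchEnum, and the helper names is_arm and pick_use_sched_yield are hypothetical.

```python
import os
import platform
import time


def is_arm(machine: str) -> bool:
    # Hypothetical helper: treats machine strings such as "aarch64",
    # "arm64" or "armv7l" as Arm. The PR itself relies on vLLM's
    # Platform.get_cpu_architecture() rather than this check.
    m = machine.lower()
    return m.startswith("arm") or m.startswith("aarch64")


def pick_use_sched_yield(machine: str) -> bool:
    # Use os.sched_yield only where it exists and the CPU is not Arm.
    return hasattr(os, "sched_yield") and not is_arm(machine)


# Decided once at import time, mirroring a module-level flag.
USE_SCHED_YIELD = pick_use_sched_yield(platform.machine())


def sched_yield() -> None:
    # Yield to other threads; on Arm fall back to time.sleep(0).
    if USE_SCHED_YIELD:
        os.sched_yield()
    else:
        time.sleep(0)
```

Deciding the flag once at import time keeps the per-iteration cost of a tight polling loop to a single branch.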
Test Plan
Test Result
Note
Ensures polling yields the GIL on ARM.
vllm/distributed/utils.py: USE_SCHED_YIELD now also checks Platform.get_cpu_architecture() and disables os.sched_yield on ARM, falling back to time.sleep(0)
Import CpuArchEnum and Platform; update comments accordingly
Written by Cursor Bugbot for commit 0274e03.
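To illustrate why the yield matters for polling, here is a small sketch of a busy-wait loop that calls a yield function on every iteration so that another Python thread can acquire the GIL and make progress. The names wait_for and demo are hypothetical and do not appear in vLLM; time.sleep(0) is used as the yield call, as in the ARM fallback.

```python
import threading
import time


def wait_for(flag: threading.Event,
             yield_fn=lambda: time.sleep(0),
             timeout_s: float = 5.0) -> bool:
    # Poll until the flag is set, yielding on every iteration so that
    # other threads get a chance to run. Returns False on timeout.
    deadline = time.monotonic() + timeout_s
    while not flag.is_set():
        if time.monotonic() > deadline:
            return False
        yield_fn()
    return True


def demo() -> bool:
    # A worker thread sets the flag; the main thread polls for it.
    flag = threading.Event()
    worker = threading.Thread(target=flag.set)
    worker.start()
    ok = wait_for(flag)
    worker.join()
    return ok
```

If the yield call inside such a loop does not actually let other threads run, the polling thread keeps spinning on the CPU, which matches the CPU-bound symptom described in the Purpose section.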