[BugFix] Add sleep to fix tight loop and release GIL#29476
njhill merged 3 commits into vllm-project:main
Signed-off-by: alec-flowers <aflowers@nvidia.com>
Code Review
This pull request introduces a time.sleep(0.001) to address a potential tight loop in the engine's busy-wait cycle. This occurs when there are pending requests (e.g., waiting for remote KV cache) but no model execution can be scheduled. By yielding the Global Interpreter Lock (GIL), this change prevents the starvation of background threads, which is critical for operations like distributed handshakes. The fix is well-targeted, conditional, and a pragmatic solution to a potential deadlock or performance degradation issue.
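The pattern described in the review can be sketched roughly as follows. This is an illustrative stand-in, not the vLLM engine code: `run_busy_loop`, `has_pending`, `try_step`, and `demo` are hypothetical names, and only the `time.sleep(0.001)` call comes from the PR itself. The sketch assumes the loop should sleep only when a pass did no work while requests are still pending.

```python
import threading
import time


def run_busy_loop(has_pending, try_step, stop, idle_sleep_s=0.001):
    """Poll until `stop` is set, sleeping briefly after each idle pass.

    All names here are hypothetical; this is not the vLLM engine loop.
    """
    idle_passes = 0
    while not stop.is_set():
        did_work = try_step()
        if not did_work and has_pending():
            # time.sleep releases the GIL, so other Python threads
            # (e.g. a background handshake) get a chance to run.
            time.sleep(idle_sleep_s)
            idle_passes += 1
        # Assumption: the no-pending case is handled elsewhere in a real
        # loop (for example by blocking on an input queue).
    return idle_passes


def demo():
    stop = threading.Event()
    state = {"ready": False}

    def background_handshake():
        # Stand-in for background work that needs the GIL to progress.
        time.sleep(0.05)
        state["ready"] = True

    def try_step():
        # Work can only be scheduled once the background thread finishes.
        if state["ready"]:
            stop.set()
            return True
        return False

    worker = threading.Thread(target=background_handshake)
    worker.start()
    idle = run_busy_loop(lambda: True, try_step, stop)
    worker.join()
    return state["ready"], idle
```

Calling `demo()` returns once the background thread has flipped `ready`; the second return value counts how many passes slept instead of spinning.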
njhill left a comment
Thanks @alec-flowers. I think this might actually be the best fix given the current design.
Ideally we should avoid a hot loop altogether in this scenario but that would require more significant rework.
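For illustration only, one generic way to avoid a hot loop is to block on a condition variable until another thread signals that the awaited event has happened. The sketch below is an assumption about what such a rework could look like in plain Python, not a description of vLLM's design; `TransferNotifier`, `mark_done`, `wait_done`, and `demo_wait` are hypothetical names.

```python
import threading


class TransferNotifier:
    """Hypothetical helper: lets a waiter block instead of polling."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False

    def mark_done(self):
        # Called by the background thread when the awaited work completes.
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def wait_done(self, timeout=None):
        # Blocks without spinning; the GIL is released while waiting.
        # Returns True if the work completed, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._done, timeout=timeout)


def demo_wait():
    notifier = TransferNotifier()
    # Simulate the background completion arriving shortly after we wait.
    timer = threading.Timer(0.01, notifier.mark_done)
    timer.start()
    ok = notifier.wait_done(timeout=1.0)
    timer.join()
    return ok
```

The trade-off is that every event source has to call `mark_done`, which is the kind of wider rework the comment above refers to.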
I have a feeling this is an ARM vs x86 issue: #30228
The sched_yield thing only applies to the multiproc executor case, where we spin on the shm queue. This issue is kind of separate from that and would apply in the uniproc case too.
Co-authored-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com>
- Remove UniProcExecutor GIL contention workaround (fixed upstream in vllm-project/vllm#29476)
- Fix _uses_nixl_connector() and _uses_dynamo_connector() to detect connectors nested inside PdConnector's kv_connector_extra_config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: alec-flowers <aflowers@nvidia.com>
Potential fix for #29369.
While not very elegant, it does the job of releasing the GIL.
Purpose
Test Plan
Test Result