Fix UnboundLocalError when DetokenizerManager constructor fails #21471

Merged
hnyls2002 merged 1 commit into main from lsyin/fix-detokenizer-unbound-error
Mar 26, 2026

@hnyls2002
Collaborator

Motivation

When the DetokenizerManager constructor fails (e.g., due to HF API 429 rate limiting during AutoTokenizer.from_pretrained), the except block in run_detokenizer_process references manager before it is assigned, raising UnboundLocalError. That second exception aborts the handler before it sends SIGQUIT to the parent process, leaving the server in a half-dead state: it accepts HTTP connections but returns 503 on /health_generate until the test timeout (~10 minutes).

Example failure: https://github.com/sgl-project/sglang/actions/runs/23582037327/job/68670070787#step:7:5254

The server stays stuck returning 503 for ~10 minutes because SIGQUIT never fires:

[2026-03-26 08:44:08] INFO: 127.0.0.1:38764 - "GET /health_generate HTTP/1.1" 503 Service Unavailable
[2026-03-26 08:44:18] INFO: 127.0.0.1:46050 - "GET /health_generate HTTP/1.1" 503 Service Unavailable
... (repeats for ~10 minutes)
[2026-03-26 08:52:38] INFO: 127.0.0.1:39568 - "GET /health_generate HTTP/1.1" 503 Service Unavailable
TimeoutError: Server failed to start within the timeout period

Root cause in the logs:

DetokenizerManager hit an exception:
  ...
  httpx.HTTPStatusError: Client error '429 Too Many Requests'
  ...
  OSError: Unable to load vocabulary from file.

Process Process-2:
  UnboundLocalError: local variable 'manager' referenced before assignment

Modification

Initialize manager = None before the try block and guard the maybe_clear_socket_mapping() call with a None check, so send_signal(SIGQUIT) always executes.
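
The failure mode and the fix can be reproduced in isolation. The sketch below is a minimal stand-in, not the sglang source: FailingManager, run_buggy, run_fixed, and the notify_parent callback are hypothetical names, and the callback takes the place of the send_signal(SIGQUIT) call so the example does not actually signal any process.

```python
class FailingManager:
    """Hypothetical stand-in for a manager whose constructor fails."""

    def __init__(self):
        raise OSError("Unable to load vocabulary from file.")

    def maybe_clear_socket_mapping(self):
        pass


def run_buggy(notify_parent):
    try:
        manager = FailingManager()  # raises before `manager` is bound
    except Exception:
        manager.maybe_clear_socket_mapping()  # raises UnboundLocalError
        notify_parent()  # never reached


def run_fixed(notify_parent):
    manager = None  # bound before the try block, so the handler can test it
    try:
        manager = FailingManager()
    except Exception:
        if manager is not None:
            manager.maybe_clear_socket_mapping()
        notify_parent()  # reached even when the constructor failed
```

Calling run_buggy raises UnboundLocalError out of the except block without invoking the callback, while run_fixed skips the cleanup and still invokes it.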

Test plan

  • Existing CI (no behavioral change for the success path)

When the DetokenizerManager constructor fails (e.g., due to HF API
rate limiting during tokenizer loading), the except block references
`manager` which was never assigned, causing an UnboundLocalError.
This prevents SIGQUIT from being sent to the parent process, leaving
the server in a half-dead state that returns 503 for ~10 minutes
until the test timeout.
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the robustness of the server's error handling mechanism. It resolves a critical issue where failures during the DetokenizerManager's initialization could lead to an UnboundLocalError, preventing proper signal propagation and leaving the server in a half-dead state. The changes ensure that the server gracefully exits and cleans up resources even when core components fail to start, improving overall system stability.

Highlights

  • Error Handling Fix: Addressed an UnboundLocalError that occurred when the DetokenizerManager constructor failed, preventing the SIGQUIT signal from being sent to the parent process and leaving the server in an unresponsive state.
  • Robust Shutdown: Ensured that the SIGQUIT signal is always sent to the parent process upon DetokenizerManager initialization failure by initializing manager to None and adding a null check before attempting to clear socket mappings.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request improves the robustness of the run_detokenizer_process function by initializing the manager variable to None and adding a None check before calling manager.maybe_clear_socket_mapping() within the exception handler. This keeps the handler from failing on manager when the constructor did not complete. The review suggests one further improvement: wrap the maybe_clear_socket_mapping() call in its own try...except block so that the critical parent_process.send_signal(signal.SIGQUIT) call is always executed, even if the cleanup operation itself raises an exception, thus preventing a half-dead server state.

Comment on lines +448 to +449
    if manager is not None:
        manager.maybe_clear_socket_mapping()

high

To improve robustness, it's advisable to wrap the cleanup logic in a try...except block. This ensures that parent_process.send_signal(signal.SIGQUIT) is always executed, even if manager.maybe_clear_socket_mapping() were to raise an unexpected exception. The main goal is to prevent the server from entering a half-dead state, so guaranteeing the signal is sent is critical.

Suggested change
-    if manager is not None:
-        manager.maybe_clear_socket_mapping()
+    if manager is not None:
+        try:
+            manager.maybe_clear_socket_mapping()
+        except Exception as e_cleanup:
+            logger.error(f"Error during detokenizer cleanup: {e_cleanup}")
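
To see why the reviewer asks for the extra guard, consider a manager that constructs successfully but whose cleanup raises. The following is a hedged sketch of the suggestion in isolation, not the merged code: FlakyManager, its run method, run_guarded, and notify_parent are hypothetical names, with notify_parent standing in for parent_process.send_signal(signal.SIGQUIT).

```python
import logging

logger = logging.getLogger(__name__)


class FlakyManager:
    """Hypothetical manager: its main loop fails, and so does its cleanup."""

    def run(self):
        raise RuntimeError("event loop failed")

    def maybe_clear_socket_mapping(self):
        raise RuntimeError("cleanup failed")


def run_guarded(notify_parent):
    manager = None
    try:
        manager = FlakyManager()
        manager.run()
    except Exception:
        if manager is not None:
            try:
                manager.maybe_clear_socket_mapping()
            except Exception as e_cleanup:
                # Log and continue so the parent is still notified.
                logger.error(f"Error during detokenizer cleanup: {e_cleanup}")
        notify_parent()  # reached even though cleanup raised
```

Without the inner try...except, the RuntimeError from the cleanup call would propagate out of the handler and notify_parent would be skipped, which is the same half-dead outcome the pull request sets out to remove.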

@hnyls2002
Collaborator Author

/tag-and-rerun-ci

@hnyls2002 hnyls2002 merged commit e5b7650 into main Mar 26, 2026
91 of 128 checks passed
@hnyls2002 hnyls2002 deleted the lsyin/fix-detokenizer-unbound-error branch March 26, 2026 20:00
satyamk7054 pushed a commit to satyamk7054/sglang that referenced this pull request Apr 3, 2026
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026