[sglang-miles] Cherry-pick #24767: Make request dump robust to unpicklable server_args and large meta_info #24902

Merged
ByronHsu merged 1 commit into sglang-miles from byron/cp-24767-to-sglang-miles
May 10, 2026

@ByronHsu
Collaborator

Summary

Cherry-pick of #24767 (merge commit 1e6c6d1) onto sglang-miles.

  • Wraps pickle dump in try/except; on failure retries with server_args=None so request data is still persisted
  • Adds configurable dump_requests_exclude_meta_keys to strip bulky keys (routed_experts, hidden_states) from meta_info in dumps
  • Surfaces the option in the configure_logging CLI as --dump-requests-exclude-meta-keys

Conflict resolution: None — clean cherry-pick (auto-merged).

Test plan

  • Existing request dump tests should pass
  • Manual verification with --trust-remote-code models that previously caused pickle failures

Made with Cursor

… meta_info (#24767)

Co-authored-by: Byron Hsu <byron@periodiclabs.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>
(cherry picked from commit 1e6c6d1)
Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a feature to exclude specific keys from request metadata dumps to optimize storage and avoid serialization issues. It also adds a retry mechanism that attempts to dump data without server_args if the initial pickling fails. Feedback highlights the need for consistent metadata filtering across all dump types and suggests making the retry logic more resilient by catching potential errors during the fallback serialization step.

Comment on lines +2039 to +2047
if self.dump_requests_exclude_meta_keys and isinstance(
    out_dict.get("meta_info"), dict
):
    exclude = self.dump_requests_exclude_meta_keys
    if any(k in out_dict["meta_info"] for k in exclude):
        filtered_meta = {
            k: v for k, v in out_dict["meta_info"].items() if k not in exclude
        }
        out_dict = {**out_dict, "meta_info": filtered_meta}

medium

The meta_info filtering logic is correctly implemented here for periodic request dumps. However, this filtering is not applied to unfinished requests in dump_requests_before_crash or to requests recorded in record_request_for_crash_dump. If the goal is to exclude bulky keys from all dumps to save space and avoid pickling issues, consider applying this filtering consistently across all dumping paths. Since these functions are not all within the current diff hunks, ensure the logic is shared or repeated where necessary.
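
One way to share the filtering across dump paths, sketched under assumptions: a small standalone helper that each path could call before serializing. The name `strip_meta_keys` and its signature are hypothetical and not part of this PR; only the filtering expression mirrors the hunk above.

```python
from typing import Any, Collection, Dict


def strip_meta_keys(
    out_dict: Dict[str, Any], exclude: Collection[str]
) -> Dict[str, Any]:
    """Return out_dict with the excluded keys removed from its meta_info.

    Hypothetical helper (not in the PR). The input is never mutated; when
    there is nothing to remove, the same object is returned unchanged.
    """
    meta = out_dict.get("meta_info")
    if not exclude or not isinstance(meta, dict):
        return out_dict
    if not any(k in meta for k in exclude):
        return out_dict
    filtered_meta = {k: v for k, v in meta.items() if k not in exclude}
    # Shallow-copy the outer dict so the caller's object keeps its keys.
    return {**out_dict, "meta_info": filtered_meta}
```

Returning a new dict instead of deleting keys in place keeps the dump path from altering data that other code may still hold a reference to.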

Comment on lines +2099 to +2112
try:
    pickle.dump(to_dump_with_server_args, f)
except Exception as e:
    # When the server is launched with --trust-remote-code,
    # server_args sometimes fails to pickle. Retry without
    # server_args so the request data still gets persisted.
    logger.error(
        f"Failed to pickle dump with server_args: {e!r}; "
        "retrying without server_args"
    )
    f.seek(0)
    f.truncate()
    to_dump_with_server_args["server_args"] = None
    pickle.dump(to_dump_with_server_args, f)

medium

Consider making the retry logic more robust by wrapping the second pickle.dump in a try...except block. If the failure is caused by something other than server_args (e.g., unpicklable data in the requests themselves), the current implementation will raise an unhandled exception in the background thread. Also, note that this logic is duplicated in dump_requests_before_crash.

Suggested change
-try:
-    pickle.dump(to_dump_with_server_args, f)
-except Exception as e:
-    # When the server is launched with --trust-remote-code,
-    # server_args sometimes fails to pickle. Retry without
-    # server_args so the request data still gets persisted.
-    logger.error(
-        f"Failed to pickle dump with server_args: {e!r}; "
-        "retrying without server_args"
-    )
-    f.seek(0)
-    f.truncate()
-    to_dump_with_server_args["server_args"] = None
-    pickle.dump(to_dump_with_server_args, f)
+try:
+    pickle.dump(to_dump_with_server_args, f)
+except Exception as e:
+    # When the server is launched with --trust-remote-code,
+    # server_args sometimes fails to pickle. Retry without
+    # server_args so the request data still gets persisted.
+    logger.error(
+        f"Failed to pickle dump with server_args: {e!r}; "
+        "retrying without server_args"
+    )
+    f.seek(0)
+    f.truncate()
+    to_dump_with_server_args["server_args"] = None
+    try:
+        pickle.dump(to_dump_with_server_args, f)
+    except Exception as e2:
+        logger.error(f"Failed to pickle dump even without server_args: {e2!r}")
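
Because the same try/except appears in both the periodic dump and dump_requests_before_crash, the retry could live in one place. A minimal sketch, assuming a hypothetical helper `pickle_dump_with_fallback` (not part of this PR) that receives the payload dict and a freshly opened binary file:

```python
import logging
import pickle
from typing import Any, BinaryIO, Dict

logger = logging.getLogger(__name__)


def pickle_dump_with_fallback(payload: Dict[str, Any], f: BinaryIO) -> bool:
    """Pickle payload to f; on failure retry once with server_args=None.

    Hypothetical helper (not in the PR). Returns True if a complete pickle
    was written and False if both attempts failed. Pickling errors are
    logged, never raised.
    """
    try:
        pickle.dump(payload, f)
        return True
    except Exception as e:
        logger.error(
            f"Failed to pickle dump with server_args: {e!r}; "
            "retrying without server_args"
        )
    # Discard any partial output left by the failed attempt.
    f.seek(0)
    f.truncate()
    try:
        # Copy so the caller's dict keeps its original server_args.
        pickle.dump({**payload, "server_args": None}, f)
        return True
    except Exception as e2:
        logger.error(f"Failed to pickle dump even without server_args: {e2!r}")
        # Assumption: an empty file is preferable to a truncated pickle.
        f.seek(0)
        f.truncate()
        return False
```

Each call site would then reduce to a single `pickle_dump_with_fallback(to_dump_with_server_args, f)` call, and the boolean result lets the crash path continue with the rest of its cleanup.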

Comment on lines +2171 to +2184
try:
    pickle.dump(data_to_dump_with_server_args, f)
except Exception as e:
    # When the server is launched with --trust-remote-code,
    # server_args sometimes fails to pickle. Retry without
    # server_args so the request data still gets persisted.
    logger.error(
        f"Failed to pickle dump with server_args: {e!r}; "
        "retrying without server_args"
    )
    f.seek(0)
    f.truncate()
    data_to_dump_with_server_args["server_args"] = None
    pickle.dump(data_to_dump_with_server_args, f)

medium

Similar to the periodic dump, consider wrapping the second pickle.dump in a try...except block here. Since this is called during crash handling, an unhandled exception here could interfere with the shutdown process or prevent other cleanup tasks from completing.

Suggested change
-try:
-    pickle.dump(data_to_dump_with_server_args, f)
-except Exception as e:
-    # When the server is launched with --trust-remote-code,
-    # server_args sometimes fails to pickle. Retry without
-    # server_args so the request data still gets persisted.
-    logger.error(
-        f"Failed to pickle dump with server_args: {e!r}; "
-        "retrying without server_args"
-    )
-    f.seek(0)
-    f.truncate()
-    data_to_dump_with_server_args["server_args"] = None
-    pickle.dump(data_to_dump_with_server_args, f)
+try:
+    pickle.dump(data_to_dump_with_server_args, f)
+except Exception as e:
+    # When the server is launched with --trust-remote-code,
+    # server_args sometimes fails to pickle. Retry without
+    # server_args so the request data still gets persisted.
+    logger.error(
+        f"Failed to pickle dump with server_args: {e!r}; "
+        "retrying without server_args"
+    )
+    f.seek(0)
+    f.truncate()
+    data_to_dump_with_server_args["server_args"] = None
+    try:
+        pickle.dump(data_to_dump_with_server_args, f)
+    except Exception as e2:
+        logger.error(f"Failed to pickle dump even without server_args: {e2!r}")

@ByronHsu merged commit 29a9542 into sglang-miles on May 10, 2026
2 checks passed
@ByronHsu deleted the byron/cp-24767-to-sglang-miles branch on May 10, 2026 19:06