fix(sglang): use external_trace_header API for distributed tracing by ishandhanani · Pull Request #5346 · ai-dynamo/dynamo

ishandhanani · 2026-01-12T06:35:07Z

Summary

Replace internal SGLang trace propagation (trace_set_remote_propagate_context) with the public external_trace_header parameter
Remove dependency on internal sglang.srt.tracing module
Simplify trace context by using W3C traceparent header directly

Background

This PR extracts the still-needed portions from #5122 after #5283 merged the Rust TCP tracing support. The Python SGLang changes use SGLang's official external_trace_header API instead of the internal tracing module, which is cleaner and more maintainable.

Changes

handler_base.py: Replace _propagate_trace_context_to_sglang() with _get_trace_header()
decode_handler.py: Pass external_trace_header to async_generate() calls
prefill_handler.py: Pass external_trace_header to async_generate() calls

Replace internal SGLang trace propagation with the public external_trace_header parameter. This simplifies trace context propagation by: - Using SGLang's official external_trace_header API parameter - Removing dependency on internal sglang.srt.tracing module - Eliminating base64 encoding/decoding complexity - Aligning with W3C trace context standard (traceparent header) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

copy-pr-bot · 2026-01-12T06:35:10Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-01-12T06:39:00Z

Walkthrough

Refactors trace propagation across sglang request handlers by replacing a complex base64-encoded JSON approach with a simpler external_trace_header parameter mechanism. Removes _propagate_trace_context_to_sglang() method and introduces _get_trace_header() returning a dict with traceparent key, eliminating encoding overhead.

Changes

Cohort / File(s)	Summary
Core trace handler implementation `components/src/dynamo/sglang/request_handlers/handler_base.py`	Removes `_propagate_trace_context_to_sglang()` method; adds `_get_trace_header()` returning `Optional[Dict[str, str]]`. Removes imports for base64, json, and sglang_trace. Simplifies trace context from encoded JSON payload to plain traceparent dict.
Request handler integrations `components/src/dynamo/sglang/request_handlers/llm/decode_handler.py`, `components/src/dynamo/sglang/request_handlers/llm/prefill_handler.py`	Updates trace handling to call `_get_trace_header(context)` and pass result via new `external_trace_header` parameter to `async_generate()`. Removes direct trace propagation calls. Applied consistently across disaggregated and aggregated modes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Trace paths simplified, no more encoding schemes,
Headers flow directly through our generation dreams,
Base64 gone, complexity shed,
A cleaner hop forward instead! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: replacing internal SGLang trace propagation with the public external_trace_header API for distributed tracing.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check	✅ Passed	The pull request description includes all required sections: Overview (Summary), Details (Changes and Background), and Related Issues context.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (1)

122-124: Consider extracting trace header computation to reduce duplication.

The trace header logic is duplicated in both branches. Since trace_header doesn't depend on serving_mode, it could be computed once before the conditional.

♻️ Suggested refactor

         )

+        trace_header = self._get_trace_header(context) if self.enable_trace else None
+
         if self.serving_mode == DisaggregationMode.DECODE:
             # Check if bootstrap_info is pre-computed in the request (from frontend)
             bootstrap_info = request.get("bootstrap_info")
@@ -119,10 +121,6 @@
                 f"room={bootstrap_info['bootstrap_room']}"
             )

-            trace_header = (
-                self._get_trace_header(context) if self.enable_trace else None
-            )
-
             decode = await self.engine.async_generate(
                 **input_param,
                 sampling_params=sampling_params,
@@ -137,10 +135,6 @@
             else:
                 async for out in self._process_text_stream(decode, context):
                     yield out
         else:
-            trace_header = (
-                self._get_trace_header(context) if self.enable_trace else None
-            )
-
             agg = await self.engine.async_generate(

Also applies to: 144-146

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c94d097 and ab315ee.

📒 Files selected for processing (3)

components/src/dynamo/sglang/request_handlers/handler_base.py
components/src/dynamo/sglang/request_handlers/llm/decode_handler.py
components/src/dynamo/sglang/request_handlers/llm/prefill_handler.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-09-19T07:32:44.210Z

Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 0
File: :0-0
Timestamp: 2025-09-19T07:32:44.210Z
Learning: The skip_tokenizer_init=True path in SGLang backend bypasses tokenization but has array slicing overhead in _process_token_stream that creates O(n) memory copying on every stream chunk, potentially causing quadratic behavior for long sequences.

Applied to files:

components/src/dynamo/sglang/request_handlers/llm/decode_handler.py

🧬 Code graph analysis (2)

components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (1)

components/src/dynamo/sglang/request_handlers/handler_base.py (1)

_get_trace_header (143-156)

components/src/dynamo/sglang/request_handlers/llm/prefill_handler.py (1)

components/src/dynamo/sglang/request_handlers/handler_base.py (1)

_get_trace_header (143-156)

🔇 Additional comments (3)

components/src/dynamo/sglang/request_handlers/llm/decode_handler.py (1)

126-135: LGTM!

The external_trace_header parameter is correctly passed to async_generate() in both disaggregated and aggregated modes, properly propagating trace context.

Also applies to: 148-154

components/src/dynamo/sglang/request_handlers/llm/prefill_handler.py (1)

116-127: LGTM!

The trace header extraction and propagation via external_trace_header is cleanly integrated. The implementation is consistent with the decode handler and properly conditional on enable_trace.

components/src/dynamo/sglang/request_handlers/handler_base.py (1)

143-156: Clean implementation of W3C traceparent header generation.

The method correctly constructs the W3C Trace Context traceparent header format (00-trace_id-parent_id-trace_flags), which is the exact format SGLang's external_trace_header parameter expects. The hardcoded flags value 01 indicates sampled=true, appropriate for traces being actively propagated.

When spawn_prefill_task uses tokio::spawn, the spawned task loses the current span context. This causes distributed tracing context to be lost, preventing trace headers from being properly propagated. Changes: - Add tracing::Instrument import - Capture current span and use .instrument(span) to propagate trace context to the spawned prefill task - Add prefill_routing span to track prefill routing timing - Add kv_find_best_match span to track KV worker selection time Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

ishandhanani · 2026-01-12T15:33:47Z

Session Worklog

Context

User had an existing PR #5122 (ishan/tracingupdate branch) for distributed tracing improvements. A separate PR #5283 was merged that added TCP tracing support with headers in the wire protocol. Task was to identify what portions of #5122 were still needed.

Analysis

Compared branches and found:

Rust changes in fix: distributed tracing propagation for TCP transport and SGLang integration #5122 were NOT needed - PR fix: distributed tracing propagation for TCP transport #5283 already merged better versions:
- logging.rs: Main has optimized static TRACE_PROPAGATOR, the old branch created new propagators each time
- tcp_client.rs tests: Main has proper header reading code
Python SGLang changes WERE still needed:
- handler_base.py: Replace internal trace_set_remote_propagate_context with public external_trace_header API
- decode_handler.py / prefill_handler.py: Pass external_trace_header to async_generate()
Additional Rust fix discovered:
- Commit ade6a7e3e on original branch fixed span context loss in spawn_prefill_task
- When tokio::spawn is used, the spawned task loses current span context
- Fix: Capture span with tracing::Span::current() and use .instrument(span)

Implementation

Created new branch ishan/sglang-external-trace-header from main
Applied Python changes for external_trace_header API
Cherry-picked Rust tracing fix (resolved merge conflicts manually due to code divergence)
Added prefill_routing and kv_find_best_match spans for visibility
Verified compilation with cargo check -p dynamo-llm
Rebuilt dynamo and tested with disaggregated serving

Performance Verification

Confirmed .instrument() pattern is safe for high concurrency:

tracing::Span::current() - thread-local read, O(1)
.instrument(span) - zero-cost when tracing disabled
Same pattern used in http_endpoint.rs, shared_tcp_endpoint.rs, push_endpoint.rs
Spans use static names with no dynamic field allocation

Files Changed

File	Lines	Change
`handler_base.py`	-20/+8	Simplified trace header generation
`decode_handler.py`	+4/-3	Use `external_trace_header` param
`prefill_handler.py`	+2/-3	Use `external_trace_header` param
`prefill_router.rs`	+66/-47	Span propagation + new spans

Commits

ab315ee8a - fix(sglang): use external_trace_header API for distributed tracing
0198fbde8 - fix(tracing): propagate span context in prefill background task

ishandhanani · 2026-01-12T16:07:38Z

/ok to test 0198fbd

ishandhanani requested review from a team as code owners January 12, 2026 06:35

pull-request-size bot added the size/M label Jan 12, 2026

github-actions bot added fix backend::sglang Relates to the sglang backend labels Jan 12, 2026

coderabbitai bot reviewed Jan 12, 2026

View reviewed changes

ishandhanani requested a review from a team as a code owner January 12, 2026 06:59

pull-request-size bot added size/L and removed size/M labels Jan 12, 2026

github-actions bot added the router Relates to routing, KV-aware routing, etc. label Jan 12, 2026

PeaBrane approved these changes Jan 12, 2026

View reviewed changes

ishandhanani enabled auto-merge (squash) January 12, 2026 16:07

tmonty12 approved these changes Jan 12, 2026

View reviewed changes

ishandhanani merged commit cbbaa6b into main Jan 12, 2026
41 checks passed

ishandhanani deleted the ishan/sglang-external-trace-header branch January 12, 2026 17:01

ishandhanani mentioned this pull request Jan 15, 2026

fix: distributed tracing propagation for TCP transport and SGLang integration #5122

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sglang): use external_trace_header API for distributed tracing#5346

fix(sglang): use external_trace_header API for distributed tracing#5346
ishandhanani merged 2 commits intomainfrom
ishan/sglang-external-trace-header

ishandhanani commented Jan 12, 2026 •

edited

Loading

Uh oh!

copy-pr-bot bot commented Jan 12, 2026

Uh oh!

coderabbitai bot commented Jan 12, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

ishandhanani commented Jan 12, 2026

Uh oh!

ishandhanani commented Jan 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ishandhanani commented Jan 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Background

Changes

Uh oh!

copy-pr-bot bot commented Jan 12, 2026

Uh oh!

coderabbitai bot commented Jan 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

ishandhanani commented Jan 12, 2026

Session Worklog

Context

Analysis

Implementation

Performance Verification

Files Changed

Commits

Uh oh!

ishandhanani commented Jan 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ishandhanani commented Jan 12, 2026 •

edited

Loading

coderabbitai bot commented Jan 12, 2026 •

edited

Loading