Skip to content

Add otel tracing instrumentation#506

Merged
github-actions[bot] merged 10 commits into
llm-d:mainfrom
sallyom:tracing
Feb 25, 2026
Merged

Add otel tracing instrumentation#506
github-actions[bot] merged 10 commits into
llm-d:mainfrom
sallyom:tracing

Conversation

@sallyom
Copy link
Copy Markdown
Contributor

@sallyom sallyom commented Dec 5, 2025

This PR satisfies the llm-d distributed tracing proposal llm-d/llm-d#119 implementation plan.

@github-actions
Copy link
Copy Markdown
Contributor

This PR is marked as stale after 21d of inactivity. After an additional 14d of inactivity (7d to become rotten, then 7d more), it will be closed. To prevent this PR from being closed, add a comment or remove the lifecycle/stale label.

@sallyom sallyom force-pushed the tracing branch 4 times, most recently from 9c56950 to c55c5a6 Compare January 5, 2026 20:14
@sallyom sallyom force-pushed the tracing branch 3 times, most recently from 884c096 to 8c4a03d Compare January 7, 2026 04:20
@sallyom sallyom changed the title [WIP] Tracing Add otel tracing instrumentation Jan 7, 2026
@sallyom
Copy link
Copy Markdown
Contributor Author

sallyom commented Jan 7, 2026

The GAIE entry request span PR is kubernetes-sigs/gateway-api-inference-extension#2057

@sallyom sallyom force-pushed the tracing branch 7 times, most recently from 24dc5b9 to 15eb137 Compare January 7, 2026 05:44
Comment thread pkg/plugins/scorer/precise_prefix_cache.go Outdated
Comment thread Dockerfile.localdev.epp Outdated
Comment thread Dockerfile.localdev.epp Outdated
Comment thread cmd/epp/main.go Outdated
Comment thread cmd/pd-sidecar/main.go Outdated
Comment thread pkg/telemetry/tracing.go Outdated
func (s *Server) chatCompletionsHandler(w http.ResponseWriter, r *http.Request) {
requestStart := time.Now()
tracer := telemetry.Tracer()
ctx, span := tracer.Start(r.Context(), "llm_d.pd_proxy.request",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read the proposal part seem iit has decided to use llm_d.pd_proxy.

not for this PR, but should we change the folder to pkg/pd/proxy than pkg/sidecar/proxy for better naming?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Down the road we may have the sidecar proxy handle more than P/D disagg so I would suggest not renaming.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok leaving as/is for this PR

Comment thread Dockerfile.localdev.epp Outdated
Comment thread pkg/telemetry/tracing.go Outdated
@github-actions github-actions Bot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 12, 2026
sallyom and others added 7 commits February 23, 2026 16:27
    Add centralized telemetry package and custom spans
    following the llm-d distributed tracing proposal.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: sallyom <somalley@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: sallyom <somalley@redhat.com>
@sallyom
Copy link
Copy Markdown
Contributor Author

sallyom commented Feb 23, 2026

Usually I see a 'Verified' tag on the signed commits:

Commits show as verified now - I had to add a new signing key.

wrt the linting error, the test is complaining about the package name common but this PR isn't introducing that package name, it's pre-existing.

@kfswain
Copy link
Copy Markdown
Collaborator

kfswain commented Feb 23, 2026

/approve

github-actions[bot]
github-actions Bot previously approved these changes Feb 23, 2026
@github-project-automation github-project-automation Bot moved this to In review in llm-d-router Feb 23, 2026
@elevran
Copy link
Copy Markdown
Collaborator

elevran commented Feb 24, 2026

@sallyom thanks for following through on the feedback.

wrt the linting error, the test is complaining about the package name common but this PR isn't introducing that package name, it's pre-existing.

The PR adds the _test.go file. Unlike the existing file, it lacks a linter bypass comment (//revive:disable:var-naming) so it is flagged.

I'm looking at the signature requirement:

  • the github UI shows commits as signed
  • the github cli shows them as unsigned
$ git fetch upstream pull/506/head:pr-506
$ git checkout pr-506
$ git log --pretty="%H %s" --show-signature 2>/dev/null
No signature
693a0294eb5801e2715e815f7b15483dd68c33d1 remove extra comments from sidecar spans
No signature
9aedcc978cb9c85200eb73217985c6746b56cd53 test: add edge case tests for StripScheme
No signature
5ddef7a5831272d6c906bb92693f68d807676eff fix: address review nits for tracing PR
No signature
104f3c8cc7c689c750a165adabf2587d9c8753c2 fix: avoid os.Exit bypassing defer in main
No signature
da3c41640860e0d1b47f6947fd9bf3830bc87f14 tracing: remove extra success results & startup spans and cleanup
No signature
3a31c9d5999e263f814a477a52548d182fadcdb0 update Dockerfile.sidecar
No signature
4cf00716d55c55bc5286481a2ba2474bd66b28bc Add opentelemetry tracing
gpg: Signature made Thu Feb 19 19:28:16 2026 IST
gpg:                using RSA key B5690EEEBB952194
gpg: Can't check signature: No public key
...

@llm-d llm-d deleted a comment from github-actions Bot Feb 24, 2026
@elevran
Copy link
Copy Markdown
Collaborator

elevran commented Feb 24, 2026

worked around the signing verification be reruning the test after deleting "the found unsigned commits" comment (perhaps there was some left over state - who knows :-)).
Anyway - still need to fix the lint error and then we are good to go!

Signed-off-by: sallyom <somalley@redhat.com>
Comment thread pkg/sidecar/proxy/chat_completions.go Outdated
Gregory-Pereira and others added 2 commits February 24, 2026 13:11
Signed-off-by: greg pereira <grpereir@redhat.com>
protect against segfault on tests
@Gregory-Pereira Gregory-Pereira enabled auto-merge (squash) February 24, 2026 21:59
@github-actions github-actions Bot merged commit 1519a28 into llm-d:main Feb 25, 2026
8 checks passed
@github-project-automation github-project-automation Bot moved this from In review to Done in llm-d-router Feb 25, 2026
zdtsw pushed a commit to zdtsw/llm-d-router that referenced this pull request Mar 23, 2026
* Add opentelemetry tracing

    Add centralized telemetry package and custom spans
    following the llm-d distributed tracing proposal.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* update Dockerfile.sidecar

Signed-off-by: sallyom <somalley@redhat.com>

* tracing: remove extra success results & startup spans and cleanup

Signed-off-by: sallyom <somalley@redhat.com>

* fix: avoid os.Exit bypassing defer in main

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* fix: address review nits for tracing PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* test: add edge case tests for StripScheme

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* remove extra comments from sidecar spans

Signed-off-by: sallyom <somalley@redhat.com>

* fix lint error

Signed-off-by: sallyom <somalley@redhat.com>

* protect against segfault on tests

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: greg pereira <grpereir@redhat.com>
(cherry picked from commit 1519a28)
zdtsw pushed a commit to red-hat-data-services/llm-d-inference-scheduler that referenced this pull request Mar 23, 2026
* Add opentelemetry tracing

    Add centralized telemetry package and custom spans
    following the llm-d distributed tracing proposal.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* update Dockerfile.sidecar

Signed-off-by: sallyom <somalley@redhat.com>

* tracing: remove extra success results & startup spans and cleanup

Signed-off-by: sallyom <somalley@redhat.com>

* fix: avoid os.Exit bypassing defer in main

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* fix: address review nits for tracing PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* test: add edge case tests for StripScheme

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* remove extra comments from sidecar spans

Signed-off-by: sallyom <somalley@redhat.com>

* fix lint error

Signed-off-by: sallyom <somalley@redhat.com>

* protect against segfault on tests

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: greg pereira <grpereir@redhat.com>
(cherry picked from commit 1519a28)
(cherry picked from commit 215658e)
asharkhan3101 pushed a commit to asharkhan3101/llm-d-inference-scheduler that referenced this pull request Apr 13, 2026
* Add opentelemetry tracing

    Add centralized telemetry package and custom spans
    following the llm-d distributed tracing proposal.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* update Dockerfile.sidecar

Signed-off-by: sallyom <somalley@redhat.com>

* tracing: remove extra success results & startup spans and cleanup

Signed-off-by: sallyom <somalley@redhat.com>

* fix: avoid os.Exit bypassing defer in main

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* fix: address review nits for tracing PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* test: add edge case tests for StripScheme

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>

* remove extra comments from sidecar spans

Signed-off-by: sallyom <somalley@redhat.com>

* fix lint error

Signed-off-by: sallyom <somalley@redhat.com>

* protect against segfault on tests

Signed-off-by: greg pereira <grpereir@redhat.com>

---------

Signed-off-by: sallyom <somalley@redhat.com>
Signed-off-by: greg pereira <grpereir@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: greg pereira <grpereir@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm "Looks good to me", indicates that a PR is ready to be merged.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

6 participants