-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[None][feat] Add opentelemetry tracing #5897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
28 commits
Select commit
Hold shift + click to select a range
3b904c6
[feat] Add OpenTelemetry tracing
5f15841
[feat] Modular otel_trace
ee83e80
[feat] Add trace to disagg server and add kv_cache info
4d3a59a
[docs] Add openTelemetry integration guide
zhanghaotong 89c2773
[chores] fix todo
zhanghaotong c987e06
fix
zhanghaotong 94126ca
[fix] remove opentelemetry package from requirements.txt
zhanghaotong 5938445
[chores] pre commit
zhanghaotong a4817fb
Merge branch 'main' into otlp-trace
zhanghaotong c54f281
[feat] use more accurate time correction
zhanghaotong c23205d
Merge branch 'main' into otlp-trace
zhanghaotong 680ae97
Merge branch 'main' into otlp-trace
zhanghaotong 4890590
Merge branch 'main' into otlp-trace
zhanghaotong 95fc55f
Merge branch 'main' into otlp-trace
zhanghaotong fec6530
pre-commit
zhanghaotong db621c7
use strEnum and rename ObservabilityConfig to OtlpConfig
zhanghaotong ab0fbed
use strenum
zhanghaotong 9d48874
Merge branch 'main' into otlp-trace
zhanghaotong 66ca6e5
fix
zhanghaotong a4c9325
Merge branch 'main' into otlp-trace
zhanghaotong 4162058
Fix llmapi test
dd8a9e0
Merge branch 'main' into otlp-trace
zhanghaotong 5c9bb15
add dataclass to MinimalInstances
zhanghaotong c9bd23f
Merge branch 'main' into otlp-trace
zhanghaotong e570f0b
Merge branch 'main' into otlp-trace
zhanghaotong 0859050
Merge branch 'main' into otlp-trace
zhanghaotong cfa9293
Merge branch 'main' into otlp-trace
zhanghaotong 7bacc81
Merge branch 'main' into otlp-trace
zhanghaotong File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| # OpenTelemetry Integration Guide | ||
|
|
||
| This guide explains how to setup OpenTelemetry tracing in TensorRT-LLM to monitor and debug your LLM inference services. | ||
|
|
||
| ## Install OpenTelemetry | ||
|
|
||
| Install the required OpenTelemetry packages: | ||
|
|
||
| ```bash | ||
| pip install \ | ||
| 'opentelemetry-sdk' \ | ||
| 'opentelemetry-api' \ | ||
| 'opentelemetry-exporter-otlp' \ | ||
| 'opentelemetry-semantic-conventions-ai' | ||
| ``` | ||
|
|
||
| ## Start Jaeger | ||
|
|
||
| You can start Jaeger with Docker: | ||
|
|
||
| ```bash | ||
| docker run --rm --name jaeger \ | ||
| -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \ | ||
| -p 6831:6831/udp \ | ||
| -p 6832:6832/udp \ | ||
| -p 5778:5778 \ | ||
| -p 16686:16686 \ | ||
| -p 4317:4317 \ | ||
| -p 4318:4318 \ | ||
| -p 14250:14250 \ | ||
| -p 14268:14268 \ | ||
| -p 14269:14269 \ | ||
| -p 9411:9411 \ | ||
| jaegertracing/all-in-one:1.57.0 | ||
| ``` | ||
|
|
||
| Or run the jaeger-all-in-one(.exe) executable from [the binary distribution archives](https://www.jaegertracing.io/download/): | ||
|
|
||
| ```bash | ||
| jaeger-all-in-one --collector.zipkin.host-port=:9411 | ||
| ``` | ||
|
|
||
| ## Setup environment variables and run TensorRT-LLM | ||
|
|
||
| Set up the environment variables: | ||
|
|
||
| ```bash | ||
| export JAEGER_IP=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' jaeger) | ||
| export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc | ||
| export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://$JAEGER_IP:4317 | ||
| export OTEL_EXPORTER_OTLP_TRACES_INSECURE=true | ||
| export OTEL_SERVICE_NAME="trt-server" | ||
| ``` | ||
|
|
||
| Then run TensorRT-LLM with OpenTelemetry, and make sure to set `return_perf_metrics` to true in the model configuration: | ||
|
|
||
| ```bash | ||
| trtllm-serve models/Qwen3-8B/ --otlp_traces_endpoint="$OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" | ||
| ``` | ||
|
|
||
| ## Send requests and find traces in Jaeger | ||
|
|
||
| You can send a request to the server and view the traces in [Jaeger UI](http://localhost:16686/). | ||
| The traces should be visible under the service name "trt-server". | ||
|
|
||
| ## Configuration for Disaggregated Serving | ||
|
|
||
| For disaggregated serving scenarios, the configuration for ctx server and gen server remains the same as the standalone model. For the proxy, you can configure it as follows: | ||
|
|
||
| ```yaml | ||
| # disagg_config.yaml | ||
| hostname: 127.0.0.1 | ||
| port: 8000 | ||
| backend: pytorch | ||
| context_servers: | ||
| num_instances: 1 | ||
| urls: | ||
| - "127.0.0.1:8001" | ||
| generation_servers: | ||
| num_instances: 1 | ||
| urls: | ||
| - "127.0.0.1:8002" | ||
| otlp_config: | ||
| otlp_traces_endpoint: "grpc://0.0.0.0:4317" | ||
| ``` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.