
🐛 Bug Report: Trace-ID Injection Missing Between OpenAI and vLLM Spans #2291

Open
ronensc opened this issue Nov 13, 2024 · 3 comments
Labels
bug Something isn't working

Comments

ronensc (Contributor) commented Nov 13, 2024

Which component is this bug for?

OpenAI Instrumentation

📜 Description

This bug is related to #1983. Similar to what's described there, the OpenAIInstrumentor doesn't propagate the trace context of its exported span with the outgoing LLM request, so the server-side span cannot be linked to the client's trace.

A fix here would also benefit other frameworks built on top of the OpenAI client library, such as CrewAI.
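
Propagating the context here means sending the W3C Trace Context header with the outgoing HTTP request so the server can parent its span to the client's span. For reference, the header carries the trace ID, parent span ID, and flags (example value from the W3C spec):

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01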

👟 Reproduction steps

  1. Run Jaeger as the trace collector:
docker run --rm --name jaeger \
    -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
    -p 6831:6831/udp \
    -p 6832:6832/udp \
    -p 5778:5778 \
    -p 16686:16686 \
    -p 4317:4317 \
    -p 4318:4318 \
    -p 14250:14250 \
    -p 14268:14268 \
    -p 14269:14269 \
    -p 9411:9411 \
    jaegertracing/all-in-one:1.57
  2. Run vLLM:
export OTEL_SERVICE_NAME="vllm-server"
export OTEL_EXPORTER_OTLP_TRACES_INSECURE=true
vllm serve meta-llama/Llama-3.2-1B-Instruct --otlp-traces-endpoint "grpc://localhost:4317"
  3. Run the OpenAI client with OpenAIInstrumentor:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.trace import set_tracer_provider


trace_provider = TracerProvider(resource=Resource(attributes={"service.name": "OpenAI-client"}))
trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint="grpc://localhost:4317", insecure=True)))
trace_provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
set_tracer_provider(trace_provider)


from opentelemetry.instrumentation.openai import OpenAIInstrumentor
OpenAIInstrumentor().instrument()

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="meta-llama/Llama-3.2-1B-Instruct",
)
print(chat_completion)
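
As a point of comparison, manually injecting the active span's context into the request headers does stitch the spans together. A minimal sketch of that workaround (not the instrumentor's fix), assuming vLLM reads the W3C traceparent header from incoming requests:

from opentelemetry import trace
from opentelemetry.propagate import inject

tracer = trace.get_tracer(__name__)

# Open a client-side span and copy its W3C trace context ("traceparent")
# into a carrier dict, then hand that dict to the OpenAI client via
# extra_headers so the server-side span joins the same trace.
with tracer.start_as_current_span("chat"):
    headers = {}
    inject(headers)
    chat_completion = client.chat.completions.create(
        messages=[{"role": "user", "content": "Say this is a test"}],
        model="meta-llama/Llama-3.2-1B-Instruct",
        extra_headers=headers,
    )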

👍 Expected behavior

I should see one trace with 2 spans (the client span and the vLLM server span) in the Jaeger web UI.

👎 Actual Behavior with Screenshots

I see 2 traces, each with a single span.

http://localhost:16686/

[Screenshots: the Jaeger UI shows two separate traces, each containing a single span]

🤖 Python Version

No response

📃 Provide any additional context for the Bug.

No response

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find a similar issue

Are you willing to submit PR?

Yes I am willing to submit a PR!


dosubot bot commented Nov 13, 2024

The issue you're experiencing, where the OpenAIInstrumentor doesn't propagate the trace context between the OpenAI and vLLM spans, could be related to how the trace context is handled in the instrumentor. The OpenAIInstrumentor is supposed to propagate trace context by including it in the HTTP request headers, as verified by tests that check for the presence of TRACEPARENT_HEADER_NAME in the request headers [1].

To address this, make sure the trace context is correctly injected into and propagated through the request headers. The _OpenAITracingWrapper class is responsible for injecting tracing information into OpenAI client requests by modifying the extra_headers of the request [2]. Verify that this mechanism is functioning as expected in your setup.
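
A minimal sketch of that idea (hypothetical wrapper name and wrapt-style signature, not the actual Traceloop implementation):

from opentelemetry.propagate import inject

def _traced_create(wrapped, instance, args, kwargs):
    # Copy the active span's W3C trace context into the request's
    # extra_headers before delegating to the wrapped OpenAI method.
    headers = kwargs.get("extra_headers") or {}
    inject(headers)
    kwargs["extra_headers"] = headers
    return wrapped(*args, **kwargs)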

Additionally, verify that the TracerProvider and SpanProcessor are correctly configured: the TracerProvider should be set globally, and spans should be processed and exported properly [3].

If the issue persists, check for discrepancies in the OpenAI client library version or in other dependencies that might affect trace context propagation [4].


dosubot bot added the bug label Nov 13, 2024
nirga (Member) commented Nov 13, 2024

Thanks @ronensc! I imagine you plan on doing something similar to what you've done for Langchain?

ronensc (Contributor, Author) commented Nov 13, 2024

Right. I believe the implementation for OpenAIInstrumentor would be simpler than the Langchain one. I'll work on a PR and submit it once it's ready.
