feat: add spanRequestHeaderAttributes and ENV mapping for CLI by codefromthecrypt · Pull Request #1269 · envoyproxy/ai-gateway

codefromthecrypt · 2025-10-03T11:28:26Z

Description

This uses the same header-to-attribute mapping approach for both OpenTelemetry spans and metrics, enabling session tracking and custom attribute propagation without requiring code instrumentation.

Specifically, this deprecates metricsRequestHeaderLabels, which was artificially Prometheus-specific, in favor of:

Kubernetes/extproc

spanRequestHeaderAttributes: "x-session-id:session.id,x-user-id:user.id"
metricsRequestHeaderAttributes: "x-team-id:team.id,x-user-id:user.id"
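To make the mapping format concrete, here is a minimal Python sketch of how a comma-separated list of `header:attribute` pairs could be parsed. The gateway itself is written in Go, so this is not its parser: the function name `parse_header_mapping`, the lowercasing of header names, and the error handling are all assumptions made for illustration.

```python
def parse_header_mapping(spec: str) -> dict[str, str]:
    """Parse "header:attribute,header:attribute" into {header: attribute}.

    Hypothetical helper: header names are lowercased (an assumption, since
    HTTP header names are case-insensitive) and malformed pairs raise.
    """
    mapping: dict[str, str] = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty input and trailing commas
        header, sep, attribute = pair.partition(":")
        header, attribute = header.strip().lower(), attribute.strip()
        if not sep or not header or not attribute:
            raise ValueError(f"invalid mapping {pair!r}: expected header:attribute")
        mapping[header] = attribute
    return mapping


span_mapping = parse_header_mapping("x-session-id:session.id,x-user-id:user.id")
print(span_mapping)
```

With the span example above, the `x-session-id` header would feed the `session.id` attribute and `x-user-id` would feed `user.id`.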

Note: Previously we told people to write label values in lower_snake_case, but that isn't needed because the Prometheus exporter already does that conversion. We shouldn't make mapping decisions like this in aigw, as it interferes with non-Prometheus metrics systems. This is particularly visible with "session.id": it is handled as a session attribute, whereas "session_id" would not be, because the latter isn't an OTel convention.
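As a rough illustration of why the dotted form can stay in the mapping, the sketch below approximates the kind of name sanitization a Prometheus exporter applies to attribute keys. It is a simplification for illustration only, not the exporter's actual code: the function name `to_prometheus_label` is hypothetical, and only the basic character replacement is shown.

```python
import re


def to_prometheus_label(attribute_key: str) -> str:
    """Approximate an OTel attribute key as a Prometheus label name.

    Simplified assumption: every character outside [a-zA-Z0-9_] becomes "_".
    Real exporters may apply additional rules not modeled here.
    """
    return re.sub(r"[^a-zA-Z0-9_]", "_", attribute_key)


for key in ("session.id", "user.id", "team.id"):
    print(key, "->", to_prometheus_label(key))
```

The point of the note is that this conversion belongs in the exporter: the OTel-style key stays intact for tracing backends, and only the Prometheus view gets the underscore form.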

CLI (aigw run)
To match the convention of the existing OTel config, use ENV vars:

OTEL_AIGW_SPAN_REQUEST_HEADER_ATTRIBUTES="x-session-id:session.id,x-user-id:user.id"
OTEL_AIGW_METRICS_REQUEST_HEADER_ATTRIBUTES="x-team-id:team.id,x-user-id:user.id"
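The following Python sketch shows the effect these variables are meant to have: request headers named in the mapping become attributes under the mapped key. It is not the gateway's implementation (which is in Go). The function name `attributes_from_headers` and the choice to skip headers that are absent from the request are assumptions.

```python
import os


def attributes_from_headers(headers: dict[str, str], spec: str) -> dict[str, str]:
    """Return {attribute: header value} for each mapped header present in the request.

    Hypothetical helper; header lookup is case-insensitive and headers that
    are missing from the request are skipped (both are assumptions).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    attributes: dict[str, str] = {}
    for pair in spec.split(","):
        header, _, attribute = pair.strip().partition(":")
        header, attribute = header.strip().lower(), attribute.strip()
        if header and attribute and header in lowered:
            attributes[attribute] = lowered[header]
    return attributes


# Fall back to the example value from this PR when the variable is not set.
spec = os.environ.get(
    "OTEL_AIGW_SPAN_REQUEST_HEADER_ATTRIBUTES",
    "x-session-id:session.id,x-user-id:user.id",
)
print(attributes_from_headers({"X-Session-Id": "abc123"}, spec))
```

A request carrying only `x-session-id` would then yield a single `session.id` attribute, and no `user.id` attribute would be set.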

Example App

I built from scratch like this

make build.aigw GOOS_LIST=linux
cd cmd/aigw
COMPOSE_PROFILES=phoenix docker compose -f docker-compose-otel.yaml up --build --wait -d

Then I ran this

# run like this: uv run --exact -q --env-file .env main.py
#
# # customizing .env like:
# OPENAI_BASE_URL=http://localhost:1975/v1
# OPENAI_API_KEY=unused
# CHAT_MODEL=qwen3:4b
#
# # Soon we can do this maybe...
# MCP_URL=https://localhost:1975/mcp
#
# /// script
# dependencies = [
#     "openai-agents~=0.3.2",
#     "httpx~=0.28.1",
#     "mcp~=1.15.0",
# ]
# ///
import asyncio
import os

import httpx
from openai import AsyncOpenAI

from agents import (
    Agent,
    OpenAIProvider,
    RunConfig,
    Runner,
    Tool,
    get_current_trace,
    set_trace_processors,
)
from agents.mcp import MCPServerStreamableHttp, MCPUtil

# Disable OpenAI Platform trace callbacks to avoid 401s
set_trace_processors([])


async def add_session_id_header(request: httpx.Request):
    """Event hook to add x-session-id header from current trace context"""
    trace = get_current_trace()
    if trace and trace.trace_id:
        request.headers["x-session-id"] = trace.trace_id


async def run_agent(tools: list[Tool]):
    model_name = os.getenv("CHAT_MODEL", "gpt-4o-mini")

    # Create custom HTTP client with session ID header injection
    http_client = httpx.AsyncClient(event_hooks={"request": [add_session_id_header]})

    openai_client = AsyncOpenAI(http_client=http_client)
    provider = OpenAIProvider(openai_client=openai_client, use_responses=False)
    model = provider.get_model(model_name)

    agent = Agent(
        name="code_agent",
        model=model,
        tools=tools,
    )

    result = await Runner.run(
        starting_agent=agent,
        input="Create envoy yaml that changes the admin port to 9999. use context7",
        run_config=RunConfig(workflow_name="postgres"),
    )
    print(result.final_output)


async def main():
    # Connect to an MCP server that has context7 tools registered
    mcp_url = os.getenv("MCP_URL", "https://mcp.context7.com/mcp")
    async with MCPServerStreamableHttp(
        {
            "url": mcp_url,
            "timeout": 30.0,
        },
        cache_tools_list=True,
    ) as server:
        tools = await server.list_tools()
        util = MCPUtil()
        tools = [util.to_function_tool(tool, server, False) for tool in tools]
        await run_agent(tools)


if __name__ == "__main__":
    asyncio.run(main())

Then, my Phoenix session had the expected session ID, mapped to the OpenAI agent trace ID.

This includes all 3 LLM spans, which shows you can group the whole conversation even when normal tracing isn't set up on the client.

Related Issues/PRs

Fixes #1221

Special notes for reviewers

Backward compatibility for the previous metricsRequestHeaderLabels flag:

Old flag: --metricsRequestHeaderLabels / controller.metricsRequestHeaderLabels
New flag: --metricsRequestHeaderAttributes / controller.metricsRequestHeaderAttributes
Fallback logic: If new flag is unset, old flag value is used
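The fallback rule can be sketched in a few lines of Python. This is an illustration of the described behavior, not the actual Go code: the function name `resolve_metrics_mapping`, the logger name, and the warning text are hypothetical, and preferring the new flag when both are set is an assumption consistent with "if new flag is unset, old flag value is used".

```python
import logging

logger = logging.getLogger("aigw.sketch")  # hypothetical logger name


def resolve_metrics_mapping(new_value: str, old_value: str) -> str:
    """Pick the effective metrics header mapping from the new and old flags."""
    if old_value:
        # The PR's commit message says a deprecation warning is logged
        # when the old flag is detected.
        logger.warning(
            "metricsRequestHeaderLabels is deprecated; "
            "use metricsRequestHeaderAttributes instead"
        )
    # New flag wins when set; otherwise fall back to the old flag's value.
    return new_value or old_value


print(resolve_metrics_mapping("", "x-team-id:team.id"))
```

Existing deployments that only set the old flag keep working unchanged, while setting the new flag takes effect immediately.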

**Description**

This adds header-to-attribute mapping for both OpenTelemetry spans and metrics, enabling session tracking and custom attribute propagation without requiring code instrumentation.

Specifically, this deprecates `metricsRequestHeaderLabels` which was artificially Prometheus specific in favor of:

**Kubernetes/extproc**

- `spanRequestHeaderAttributes`: "x-session-id:session.id,x-user-id:user.id"
- `metricsRequestHeaderAttributes`: "x-team-id:team.id,x-user-id:user.id"

**CLI (aigw run)**

To match convention of existing otel config, use ENV vars

- OTEL_AIGW_SPAN_REQUEST_HEADER_ATTRIBUTES="x-session-id:session.id,x-user-id:user.id"
- OTEL_AIGW_METRICS_REQUEST_HEADER_ATTRIBUTES="x-team-id:team.id,x-user-id:user.id"

**Backward Compatibility**

Full backward compatibility for the previous metricsRequestHeaderLabels flag:

- Old flag: --metricsRequestHeaderLabels / controller.metricsRequestHeaderLabels
- New flag: --metricsRequestHeaderAttributes / controller.metricsRequestHeaderAttributes
- Fallback logic: If new flag is unset, old flag value is used
- Deprecation warnings: Logged when old flag is detected
- Removal timeline: Deprecated flag will be removed after v0.4

**Documentation**

- site/docs/capabilities/observability/tracing.md#session-tracking
- site/docs/cli/run.md#header-mapping
- cmd/aigw/docker-compose-otel.yaml

Signed-off-by: Adrian Cole <adrian@tetrate.io>

codecov-commenter · 2025-10-03T11:31:45Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.64%. Comparing base (9425769) to head (57a9542).
⚠️ Report is 1 commits behind head on main.

❌ Your project status has failed because the head coverage (77.64%) is below the target coverage (86.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1269      +/-   ##
==========================================
+ Coverage   77.60%   77.64%   +0.03%     
==========================================
  Files         116      116              
  Lines       15192    15205      +13     
==========================================
+ Hits        11790    11806      +16     
+ Misses       2812     2808       -4     
- Partials      590      591       +1


Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

mathetake

Thanks for the cleanup & improvement as well on the existing header mapping!

…roxy#1269)

**Description**

This uses the same header-to-attribute mapping approach for both OpenTelemetry spans and metrics, enabling session tracking and custom attribute propagation without requiring code instrumentation.

Specifically, this deprecates `metricsRequestHeaderLabels` which was artificially Prometheus specific in favor of:

**Kubernetes/extproc**

- `spanRequestHeaderAttributes`: "x-session-id:session.id,x-user-id:user.id"
- `metricsRequestHeaderAttributes`: "x-team-id:team.id,x-user-id:user.id"

Note: Before we told people to lower_snake_case label values, but you don't need to do that because the prometheus exporter already does that. We shouldn't make mapping decisions like this in aigw as it interferes with non-prometheus metrics systems. This is particularly highlighted in "session.id" which is handled where if you made it "session_id" it wouldn't because the latter isn't an otel convention.

**Related Issues/PRs**

Fixes envoyproxy#1221

---------

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Signed-off-by: Erica Hughberg <erica.sundberg.90@gmail.com>

codefromthecrypt requested a review from a team as a code owner October 3, 2025 11:28

codefromthecrypt mentioned this pull request Oct 3, 2025

Support emitting session.id on LLM spans in Envoy AI Gateway’s OpenTelemetry export for Phoenix session grouping #1221

Closed

mathetake added 2 commits October 3, 2025 11:22

Merge remote-tracking branch 'origin/main' into otel-attributes

fa67670

drift

57a9542

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

mathetake approved these changes Oct 3, 2025

mathetake merged commit 6a2d917 into envoyproxy:main Oct 3, 2025
31 checks passed

mathetake merged 3 commits into envoyproxy:main from codefromthecrypt:otel-attributes
