
feat(telemetry): set authority and enable TLS for telemetry backends #7904

Merged
zirain merged 6 commits into envoyproxy:main from codefromthecrypt:feat/otel-metrics-tls
Jan 15, 2026

Conversation

@codefromthecrypt
Contributor

@codefromthecrypt codefromthecrypt commented Jan 9, 2026

What type of PR is this?
feat(telemetry): set authority and enable TLS for telemetry backends

What this PR does / why we need it:
Sets proper gRPC authority from Service/Backend metadata and enables TLS for Backend resources used by telemetry (metrics, tracing, access logs).

This allows direct use of cloud OTLP endpoints like Elastic Cloud, Datadog, etc. without an intermediate collector.
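As a rough illustration of what "setting the gRPC authority" can mean, the sketch below builds a `host:port` authority string from either an FQDN endpoint or a Kubernetes Service. The helper names `grpcAuthority` and `serviceAuthority` are hypothetical, and the `<name>.<namespace>.svc` form and the inclusion of the port are assumptions for illustration; this is not the PR's `getAuthorityFromDestination`.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// grpcAuthority is a hypothetical helper. It joins a hostname and port into
// the host:port form commonly used for the HTTP/2 :authority pseudo-header.
func grpcAuthority(hostname string, port uint32) string {
	return net.JoinHostPort(hostname, strconv.FormatUint(uint64(port), 10))
}

// serviceAuthority assumes the conventional <name>.<namespace>.svc DNS name
// for a Kubernetes Service (cluster domain suffix omitted).
func serviceAuthority(name, namespace string, port uint32) string {
	return grpcAuthority(name+"."+namespace+".svc", port)
}

func main() {
	fmt.Println(grpcAuthority("otlp.example.com", 443))                // prints otlp.example.com:443
	fmt.Println(serviceAuthority("otel-collector", "monitoring", 4317)) // prints otel-collector.monitoring.svc:4317
}
```

Cloud OTLP endpoints typically route on the authority, which is why a placeholder value would likely be rejected even when the TLS handshake succeeds.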

Which issue(s) this PR fixes:
Fixes #

Release Notes: Yes

Testing

See examples/otel-headers for a working example with commented-out TLS config.

Ran otel-tui on a different port to make sure it can't accidentally receive anything

$ AUTH_TOKEN=fake otel-tui --grpc 14317

Used ngrok to route a public TLS endpoint to it

$ ngrok http --upstream-protocol=http2 14317

Changed the example yaml to point to that instead of plain text and ran all steps

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: otel-collector
spec:
  endpoints:
    - fqdn:
        hostname: unconcentratedly-nonvintage-mirta.ngrok-free.dev
        port: 443
  tls:
    wellKnownCACertificates: System
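For readers more familiar with Go clients than Envoy config, the following is a loose analogy for what `wellKnownCACertificates: System` asks for: verify the server certificate against the host's trust store and use the endpoint hostname for SNI. The helper name `systemTrustTLSConfig` and the hostname are hypothetical; this only sketches the idea with Go's `crypto/tls` and is not how Envoy Gateway translates the Backend.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// systemTrustTLSConfig is a hypothetical helper that mirrors the intent of
// the Backend above: trust the system CA bundle and send the hostname as SNI.
func systemTrustTLSConfig(hostname string) *tls.Config {
	return &tls.Config{
		ServerName: hostname,         // used for SNI and certificate hostname verification
		RootCAs:    nil,              // nil means crypto/tls uses the host's root CA set
		NextProtos: []string{"h2"},   // gRPC runs over HTTP/2
		MinVersion: tls.VersionTLS12, // a conservative floor, chosen for this sketch
	}
}

func main() {
	cfg := systemTrustTLSConfig("otlp.example.com")
	fmt.Println(cfg.ServerName, cfg.NextProtos)
}
```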

Noticed connections coming to ngrok
Screenshot 2026-01-09 at 12 49 38 PM

Verified in otel-tui
Screenshot 2026-01-09 at 12 50 02 PM
Screenshot 2026-01-09 at 12 49 55 PM
Screenshot 2026-01-09 at 12 49 48 PM

@codefromthecrypt codefromthecrypt requested a review from a team as a code owner January 9, 2026 05:29
@netlify

netlify bot commented Jan 9, 2026

Deploy Preview for cerulean-figolla-1f9435 ready!

Name Link
🔨 Latest commit 418664c
🔍 Latest deploy log https://app.netlify.com/projects/cerulean-figolla-1f9435/deploys/6968a0686dca1e00080ec41f
😎 Deploy Preview https://deploy-preview-7904--cerulean-figolla-1f9435.netlify.app

@codefromthecrypt codefromthecrypt changed the title from "feat(telemetry): enable TLS for OTLP exports using system CA trust store" to "feat(telemetry): enable TLS for Backend (telemetry) resources (metrics, tracing, access logs)" Jan 9, 2026
@codefromthecrypt codefromthecrypt marked this pull request as draft January 9, 2026 05:41
@codefromthecrypt
Contributor Author

will buff out some platform differences and then mark this ready for review

@codefromthecrypt codefromthecrypt changed the title from "feat(telemetry): enable TLS for Backend (telemetry) resources (metrics, tracing, access logs)" to "feat(telemetry): enable TLS for Backend resources (metrics, tracing, access logs)" Jan 9, 2026
@codefromthecrypt codefromthecrypt changed the title from "feat(telemetry): enable TLS for Backend resources (metrics, tracing, access logs)" to "feat(telemetry): set authority and enable TLS for Backend resources (metrics, tracing, access logs)" Jan 9, 2026
@codecov

codecov bot commented Jan 9, 2026

Codecov Report

❌ Patch coverage is 90.37037% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.78%. Comparing base (de5d9ac) to head (418664c).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/gatewayapi/listener.go | 89.39% | 4 Missing and 3 partials ⚠️ |
| internal/xds/bootstrap/bootstrap.go | 87.50% | 2 Missing and 2 partials ⚠️ |
| internal/utils/cert/cert.go | 66.66% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7904      +/-   ##
==========================================
+ Coverage   72.75%   72.78%   +0.02%     
==========================================
  Files         235      237       +2     
  Lines       35380    35467      +87     
==========================================
+ Hits        25742    25813      +71     
- Misses       7801     7811      +10     
- Partials     1837     1843       +6     


@codefromthecrypt codefromthecrypt marked this pull request as ready for review January 9, 2026 07:12
@codefromthecrypt
Contributor Author

cc @zirain @arkodg hopefully this is the last big one, at least for gRPC cloud OTel endpoints. Next I will try integrated auth, and look at whatever might not work properly with it, as well as metadata that needs to be consistent throughout, such as service name.

@codefromthecrypt codefromthecrypt changed the title from "feat(telemetry): set authority and enable TLS for Backend resources (metrics, tracing, access logs)" to "feat(telemetry): set authority and enable TLS for telemetry backends" Jan 10, 2026
@arkodg arkodg added this to the v1.7.0-rc.1 Release milestone Jan 10, 2026
@arkodg arkodg requested review from guydc and zhaohuabing January 10, 2026 22:04
@codefromthecrypt
Contributor Author

Hi folks. I am planning to give a talk at ElasticON on Envoy AI Gateway being able to export data directly to a TLS endpoint (in this case Elastic Cloud). However, until I know whether this will merge, I can't finalize my outline for the organizers. Can I get a yes or no on this, so I'm not left unsure about direction? cc @arkodg

zirain
zirain previously approved these changes Jan 14, 2026
	}
	// Set GRPC protocol for OTLP
	for _, d := range ds {
		d.Protocol = ir.GRPC
Member

Nit, non-blocking: it might be helpful to add a comment to the API to clarify that OTLP is only supported over gRPC.

Contributor Author

I adjusted the code comment. I might be wrong, but d.Protocol is used for more than telemetry, so I'm not sure how to phrase this. We can do that in another PR, or feel free to push a commit if you can think of a way!

Contributor Author

Also, once this is released in Envoy, we should be able to implement HTTP transport here. The example YAML there uses logs, metrics and traces, just like the gRPC one here does: envoyproxy/envoy#43001

Member

> I adjusted the code comment. I might be wrong, but d.Protocol is used for more than telemetry, so I'm not sure how to phrase this. We can do that in another PR, or feel free to push a commit if you can think of a way!

I mean in the API, not the internal code. Just nit picking, not blocking.

zhaohuabing
zhaohuabing previously approved these changes Jan 14, 2026
Member

@zhaohuabing zhaohuabing left a comment

LGTM, with some small non-blocking nits. Thanks so much for your patience and for contributing to this essential feature 🙌

…metrics, tracing, access logs)

Sets proper gRPC authority from Service/Backend metadata and enables TLS for Backend resources used by telemetry (metrics, tracing, access logs).

This allows direct use of cloud OTLP endpoints like Elastic Cloud, Datadog, etc. without an intermediate collector.

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Route backends need AutoSNI to use the Host header for SNI. Telemetry
backends (metrics, tracing, access logs) have no Host header and need
SNI inferred from the FQDN endpoint.

Move SNI inference to the telemetry-specific code in processBackendRefs
so route backends get AutoSNI while telemetry backends get inferred SNI.
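The distinction in this commit message can be sketched as follows: a telemetry backend has no Host header to borrow, so the SNI value has to come from the configured endpoints. The `endpoint` type and `inferSNI` function are hypothetical simplifications, and the rule that all endpoints must be FQDN endpoints sharing one hostname is an assumption for this sketch, not necessarily what `processBackendRefs` does.

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified stand-in for a Backend endpoint.
type endpoint struct {
	Hostname string // set for FQDN endpoints, empty for IP endpoints
	IP       string // set for IP endpoints
	Port     uint32
}

// inferSNI returns the hostname to use as SNI when every endpoint is an FQDN
// endpoint with the same hostname. ok is false when there is nothing
// unambiguous to infer (no endpoints, an IP endpoint, or differing hostnames).
func inferSNI(endpoints []endpoint) (sni string, ok bool) {
	for _, ep := range endpoints {
		if ep.Hostname == "" {
			return "", false
		}
		if sni != "" && sni != ep.Hostname {
			return "", false
		}
		sni = ep.Hostname
	}
	return sni, sni != ""
}

func main() {
	sni, ok := inferSNI([]endpoint{{Hostname: "otlp.example.com", Port: 443}})
	fmt.Println(sni, ok) // prints: otlp.example.com true
}
```

A route backend would skip this step and rely on AutoSNI, since the request's Host header is available at connection time.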

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Signed-off-by: Adrian Cole <adrian@tetrate.io>
codefromthecrypt and others added 2 commits January 14, 2026 17:15
Signed-off-by: Adrian Cole <adrian@tetrate.io>
@codefromthecrypt
Contributor Author

Adding a preparatory PR to aigw so that, if we can get this merged, I can at least test off the main branch that everything works: envoyproxy/ai-gateway#1774

@zirain zirain merged commit 001137e into envoyproxy:main Jan 15, 2026
36 checks passed
return rate
}

// getAuthorityFromDestination extracts the gRPC authority from a destination setting.
Contributor

@zhaohuabing @zirain what about a BackendTLSPolicy (BTLSP) attaching to a Service?

SadmiB pushed a commit to SadmiB/gateway that referenced this pull request Jan 30, 2026
…nvoyproxy#7904)

* feat(telemetry): set authority and enable TLS for Backend resources (metrics, tracing, access logs)

Sets proper gRPC authority from Service/Backend metadata and enables TLS for Backend resources used by telemetry (metrics, tracing, access logs).

This allows direct use of cloud OTLP endpoints like Elastic Cloud, Datadog, etc. without an intermediate collector.

Signed-off-by: Adrian Cole <adrian@tetrate.io>

* feedback

Signed-off-by: Adrian Cole <adrian@tetrate.io>

* fix: move SNI inference to telemetry-specific code path

Route backends need AutoSNI to use the Host header for SNI. Telemetry
backends (metrics, tracing, access logs) have no Host header and need
SNI inferred from the FQDN endpoint.

Move SNI inference to the telemetry-specific code in processBackendRefs
so route backends get AutoSNI while telemetry backends get inferred SNI.

Signed-off-by: Adrian Cole <adrian@tetrate.io>

* feedback

Signed-off-by: Adrian Cole <adrian@tetrate.io>

* update-comment

Signed-off-by: Adrian Cole <adrian@tetrate.io>

---------

Signed-off-by: Adrian Cole <adrian@tetrate.io>
Signed-off-by: Sadmi Bouhafs <sadmibouhafs@gmail.com>