
Agent builder exporter #265290

Merged
machadoum merged 21 commits into elastic:main from machadoum:ea-ws-traces-exporter
May 6, 2026


@machadoum machadoum commented Apr 23, 2026

Depends on elastic/elasticsearch#147811

Resolves https://github.com/elastic/search-team/issues/14191
Resolves https://github.com/elastic/search-team/issues/14190

Summary

This PR adds a dedicated OpenTelemetry trace export path for Agent Builder inference spans, so they can land in Elasticsearch under a distinct dataset (agent_builder) while reusing Kibana’s Elasticsearch connection (auth, TLS, transport). Generic tracing can remain sampled down without silently dropping inference work: inference spans identified via kibana.inference.tracing baggage are preserved through sampling so downstream processors can export them on a copy.

Why: Agent Builder observability needs reliable inference-span export and routing into its own traces data stream, aligned with Elasticsearch’s native OTLP traces ingestion (/_otlp/v1/traces from elastic/elasticsearch#147811).

Architecture

  • @kbn/tracing: InferencePreservingSampler wraps the existing ParentBasedSampler in init_tracing.ts. Non-inference spans pass through unchanged. For inference spans, a NOT_RECORD decision is upgraded to RECORD (without forcing SAMPLED) so domain processors can clone the span and set SAMPLED for their own pipeline.
  • @kbn/inference-tracing: ElasticsearchOtlpExporter serializes spans with @opentelemetry/otlp-transformer and POSTs OTLP-protobuf to ES /_otlp/v1/traces via the ES client transport (same connection settings as Kibana).
  • should_track_span.ts / isInferenceSpan(): extracts the "should track" logic from BaseInferenceSpanProcessor.onStart; shared by BaseInferenceSpanProcessor and Agent Builder.
  • agent_builder: AgentBuilderSpanProcessor copies eligible spans, forces SAMPLED on the copy for export, adds data_stream.dataset: agent_builder, and feeds a BatchSpanProcessor. Enabled state comes from an LRU-backed saved-objects check against AGENT_BUILDER_EXPERIMENTAL_FEATURES_SETTING_ID.
  • register_tracing.ts chooses OTLPTraceExporter when a trace URL is configured, otherwise ElasticsearchOtlpExporter; registers via LateBindingSpanProcessor.register().
  • Lifecycle: exporter registration in plugin.ts start(), async teardown in stop().
flowchart LR
  subgraph tracing["Global tracing"]
    S["InferencePreservingSampler"]
    P["Span processors"]
  end
  subgraph ab["Agent Builder"]
    AB["AgentBuilderSpanProcessor"]
    BSP["BatchSpanProcessor"]
    E["OTLP URL exporter OR ElasticsearchOtlpExporter"]
  end
  S --> P
  P --> AB
  AB --> BSP --> E --> ES["Elasticsearch traces"]
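
The Agent Builder leg of the flowchart can be sketched in isolation. This is a minimal illustration, not the code in the PR: `SpanLike`, `ProcessorLike`, and the `isEligible` predicate are hypothetical stand-ins for OpenTelemetry's `ReadableSpan`, `SpanProcessor`, and the shared `isInferenceSpan()` check, and trace flags are modelled as a plain number field rather than through `spanContext()`. The point it shows is that the original span is left untouched while a copy is marked sampled and tagged with the dataset before it reaches the batching delegate.

```typescript
// Same bit value as OpenTelemetry's TraceFlags.SAMPLED.
const SAMPLED_FLAG = 0x01;

// Hypothetical, simplified stand-ins for ReadableSpan and SpanProcessor.
interface SpanLike {
  name: string;
  traceFlags: number;
  attributes: Record<string, unknown>;
}

interface ProcessorLike {
  onEnd(span: SpanLike): void;
}

class AgentBuilderSpanProcessorSketch implements ProcessorLike {
  private readonly delegate: ProcessorLike;
  private readonly isEligible: (span: SpanLike) => boolean;

  constructor(delegate: ProcessorLike, isEligible: (span: SpanLike) => boolean) {
    this.delegate = delegate;
    this.isEligible = isEligible;
  }

  onEnd(span: SpanLike): void {
    if (!this.isEligible(span)) {
      return; // Not an eligible inference span: nothing to export on this path.
    }
    // Export a copy so other processors still see the original sampling decision.
    const copy: SpanLike = {
      ...span,
      traceFlags: span.traceFlags | SAMPLED_FLAG,
      attributes: { ...span.attributes, 'data_stream.dataset': 'agent_builder' },
    };
    this.delegate.onEnd(copy);
  }
}

// Usage: a recording delegate stands in for BatchSpanProcessor.
const exportedSpans: SpanLike[] = [];
const abProcessor = new AgentBuilderSpanProcessorSketch(
  {
    onEnd: (span) => {
      exportedSpans.push(span);
    },
  },
  (span) => span.attributes['inference'] === true // hypothetical eligibility marker
);

const originalSpan: SpanLike = { name: 'chat', traceFlags: 0, attributes: { inference: true } };
abProcessor.onEnd(originalSpan);
abProcessor.onEnd({ name: 'http', traceFlags: 0, attributes: {} });
```

Copying rather than mutating keeps the generic pipeline's sampling decision intact, which is the property the description relies on when it says generic tracing can stay sampled down.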

Package exports

  • @kbn/tracing: InferencePreservingSampler (wired in init_tracing.ts).
  • @kbn/inference-tracing: ElasticsearchOtlpExporter, isInferenceSpan / should_track_span helpers (via new module), existing processors updated to use shared inference detection.
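
The exporter choice described for register_tracing.ts can be pictured as a small pure function. This is a sketch under stated assumptions: the `traceUrl` field name and the `ExporterChoice` type are hypothetical (the PR only says that a trace URL may be configured), and the function returns a description of the choice instead of constructing the real `OTLPTraceExporter` or `ElasticsearchOtlpExporter`.

```typescript
// Hypothetical config shape: the PR only states that a trace URL may be configured.
interface AgentBuilderTracingConfig {
  traceUrl?: string;
}

type ExporterChoice =
  | { kind: 'otlp-url'; url: string }
  | { kind: 'elasticsearch-transport'; path: string };

function chooseExporter(config: AgentBuilderTracingConfig): ExporterChoice {
  if (config.traceUrl) {
    // External OTLP endpoint: the PR uses OTLPTraceExporter for this path.
    return { kind: 'otlp-url', url: config.traceUrl };
  }
  // Fallback: ElasticsearchOtlpExporter posts to the native ES endpoint.
  return { kind: 'elasticsearch-transport', path: '/_otlp/v1/traces' };
}

const withUrl = chooseExporter({ traceUrl: 'http://localhost:4318/v1/traces' });
const withoutUrl = chooseExporter({});
```

Keeping the decision separate from exporter construction would make both paths easy to cover in unit tests, which the risk section asks for.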

How to test

  1. Make sure your ES instance has the OTLP endpoint enabled (see "Enable OTLP logs and traces by default", elastic/elasticsearch#147811).
  2. Enable the Agent Builder experimental setting agentBuilder:experimentalFeatures.
  3. Enable Kibana tracing with telemetry.tracing.enabled: true.
  4. Run a query through Agent Builder.
  5. Confirm Agent Builder spans exist under .ds-traces-agent_builder*.

If the evals plugin is enabled (xpack.evals.enabled: true), you will see a view traces button in the Agent Builder reasoning panel.

[Screenshots: 2026-04-30 at 11 02 04 and 2026-04-30 at 11 02 06]

Checklist

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support (no product UI strings in this PR — server tracing/config only)
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios (not in this PR yet — follow-up)
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list (agent_builder.tracing.* added — cloud/docker follow-up required before merge)
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations. (no breaking public HTTP API changes)
  • Flaky Test Runner was used on any tests changed (no tests changed in this PR yet)
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines
  • Review the backport guidelines and apply applicable backport:* labels. backport:skip — new feature; no backport planned via this PR.

Third-party Dependency

Purpose: @opentelemetry/otlp-transformer serializes ReadableSpan[] into OTLP-protobuf binary format (ProtobufTraceSerializer.serializeRequest()) so the ElasticsearchOtlpExporter can POST spans to ES's /_otlp/v1/traces via the ES client transport, with no separate OTLP collector needed.

Justification: The existing OTLP exporters (exporter-trace-otlp-proto) bundle their own HTTP transport and can't route through the ES client. We need the serialization layer standalone to reuse Kibana's ES connection (auth, TLS).

Alternatives explored:

  • Use exporter-trace-otlp-proto directly: Can't — it owns its HTTP connection and can't use the ES client transport. We do use it for the external OTLP URL path; the ES path needs standalone serialization.
  • Implement serialization manually: OTLP-protobuf encoding is non-trivial (protobuf schema, resource/scope/span mapping, attribute encoding). Fragile and would drift from the spec.
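
The split between serialization and transport that this section argues for can be sketched as follows. Everything here is a stand-in so the example runs on its own: `SerializeFn` plays the role of `ProtobufTraceSerializer.serializeRequest()`, `TransportLike` plays the role of the ES client transport, and the result codes are simplified strings rather than OpenTelemetry's `ExportResultCode`. The fake serializer does not produce real protobuf; the only details taken as given are the `/_otlp/v1/traces` path from the description and the standard OTLP/HTTP protobuf content type.

```typescript
// Hypothetical stand-ins for ReadableSpan, ExportResult, the serializer, and the ES transport.
interface ExportedSpanLike {
  name: string;
}

interface ExportResultLike {
  code: 'success' | 'failed';
  error?: Error;
}

type SerializeFn = (spans: ExportedSpanLike[]) => Uint8Array | undefined;

interface TransportLike {
  request(
    params: { method: string; path: string; body: Uint8Array },
    options: { headers: Record<string, string> }
  ): Promise<unknown>;
}

class ElasticsearchOtlpExporterSketch {
  private readonly serialize: SerializeFn;
  private readonly transport: TransportLike;

  constructor(serialize: SerializeFn, transport: TransportLike) {
    this.serialize = serialize;
    this.transport = transport;
  }

  export(spans: ExportedSpanLike[], done: (result: ExportResultLike) => void): void {
    const body = this.serialize(spans);
    if (!body) {
      done({ code: 'failed', error: new Error('serializer returned no payload') });
      return;
    }
    // Reuse the injected transport so auth and TLS come from the existing connection.
    this.transport
      .request(
        { method: 'POST', path: '/_otlp/v1/traces', body },
        { headers: { 'content-type': 'application/x-protobuf' } }
      )
      .then(
        () => done({ code: 'success' }),
        (error: unknown) =>
          done({ code: 'failed', error: error instanceof Error ? error : new Error(String(error)) })
      );
  }
}

// Usage with fakes: the transport records what it was asked to send.
const recordedRequests: Array<{ method: string; path: string; contentType: string }> = [];
const fakeTransport: TransportLike = {
  request: async (params, options) => {
    recordedRequests.push({
      method: params.method,
      path: params.path,
      contentType: options.headers['content-type'],
    });
    return {};
  },
};
const fakeSerialize: SerializeFn = (spans) => Uint8Array.from([spans.length]);

new ElasticsearchOtlpExporterSketch(fakeSerialize, fakeTransport).export([{ name: 'chat' }], () => {});
```

Injecting both collaborators is what lets the exporter own neither the wire format nor the HTTP connection, which is the reason given above for needing the serialization layer standalone.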

Existing dependencies: Already a direct dep in root package.json (0.214.0) and resolved in yarn.lock (3 versions). Transitively pulled by exporter-trace-otlp-proto, exporter-trace-otlp-http, exporter-logs-otlp-*, otlp-exporter-base, and sdk-node. No new package enters node_modules — this PR just adds a direct import from kbn-inference-tracing.

Identify risks

| Risk | Severity | Mitigation |
| --- | --- | --- |
| Global sampling interaction: InferencePreservingSampler changes when inference spans are recorded vs. dropped relative to parent-based sampling. | Medium | Scoped to baggage-marked inference spans; non-inference spans unchanged. Review trace volume and cardinality in staging; validate alongside inference and platform tracing owners. |
| Elasticsearch OTLP dependency: native /_otlp/v1/traces must be available and compatible for the fallback exporter path; misconfiguration could mean lost or failed exports. | Medium | Depends on elastic/elasticsearch#147811; test both OTLP URL and ES-transport paths; monitor exporter errors and ES responses. |
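
The sampling interaction in the first risk row is easier to review with the decision logic written out. The sketch below is illustrative only: the types are local stand-ins for the OpenTelemetry sampler interfaces, the sampler input is reduced to a plain baggage map, and treating the mere presence of the `kibana.inference.tracing` key as the inference marker is an assumption, since the PR text does not say which value is expected.

```typescript
// Local stand-in mirroring the three OpenTelemetry sampling decisions.
const SamplingDecision = { NOT_RECORD: 0, RECORD: 1, RECORD_AND_SAMPLED: 2 } as const;
type SamplingDecisionValue = (typeof SamplingDecision)[keyof typeof SamplingDecision];

interface SamplingResultLike {
  decision: SamplingDecisionValue;
}

// Hypothetical simplification: a real sampler reads baggage from an OTel Context.
interface SamplingInput {
  baggage: Record<string, string>;
}

interface SamplerLike {
  shouldSample(input: SamplingInput): SamplingResultLike;
}

const INFERENCE_BAGGAGE_KEY = 'kibana.inference.tracing';

class InferencePreservingSamplerSketch implements SamplerLike {
  private readonly delegate: SamplerLike;

  constructor(delegate: SamplerLike) {
    this.delegate = delegate;
  }

  shouldSample(input: SamplingInput): SamplingResultLike {
    const result = this.delegate.shouldSample(input);
    // Assumption: presence of the baggage key marks an inference span.
    const isInference = INFERENCE_BAGGAGE_KEY in input.baggage;
    if (isInference && result.decision === SamplingDecision.NOT_RECORD) {
      // Keep the span recordable without marking it sampled for the generic pipeline.
      return { ...result, decision: SamplingDecision.RECORD };
    }
    return result;
  }
}

// Usage: the wrapped sampler drops everything, as a heavily sampled-down setup would.
const alwaysDrop: SamplerLike = {
  shouldSample: () => ({ decision: SamplingDecision.NOT_RECORD }),
};
const sketchSampler = new InferencePreservingSamplerSketch(alwaysDrop);
const inferenceDecision = sketchSampler.shouldSample({
  baggage: { [INFERENCE_BAGGAGE_KEY]: 'true' },
}).decision;
const genericDecision = sketchSampler.shouldSample({ baggage: {} }).decision;
```

Only the NOT_RECORD case is rewritten, so spans the delegate already records or samples are returned unchanged, which matches the mitigation's claim that non-inference behavior is unaffected.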

@machadoum machadoum force-pushed the ea-ws-traces-exporter branch 2 times, most recently from d744ec0 to b19fad7 Compare April 24, 2026 14:14
@machadoum machadoum force-pushed the ea-ws-traces-exporter branch from b19fad7 to 2d834b8 Compare April 24, 2026 15:06
@machadoum machadoum requested a review from trentm April 28, 2026 08:55
Member

@trentm trentm left a comment

@machadoum From only reading the code (I haven't run this) I think this looks good.

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-proto';
import { isInInferenceContext } from '../is_in_inference_context';

const SHOULD_TRACK_ATTR = '_ab_should_track';

Member

This span attribute is dropped before export for this span processor.
Note that the other span processors (and their exported spans) will have this span attribute.
That's probably fine, but will you want a more self-explanatory attribute name?
Also perhaps name using dot-separators as is more common in OTel usage.
Is there a naming pattern from other span attributes added by other in-Kibana self-instrumentation?

If this attribute really isn't wanted, then you would need something like a custom SpanProcessor that wrapped all the other processors added via LateBindingSpanProcessor.get().register(...) to have them drop the attribute.
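
The wrapping idea in the comment above could look roughly like this. It is a sketch with hypothetical names (`AttributeDroppingProcessor`, `ReadableSpanLike`, `SpanProcessorLike`), not an existing Kibana or OpenTelemetry class, and it only models `onEnd`. A real `ReadableSpan` also exposes methods such as `spanContext()`, which a plain object spread would not carry over, so a production version would need a more careful copy.

```typescript
// Hypothetical, simplified stand-ins for ReadableSpan and SpanProcessor.
interface ReadableSpanLike {
  name: string;
  attributes: Record<string, unknown>;
}

interface SpanProcessorLike {
  onEnd(span: ReadableSpanLike): void;
}

class AttributeDroppingProcessor implements SpanProcessorLike {
  private readonly delegate: SpanProcessorLike;
  private readonly attributeName: string;

  constructor(delegate: SpanProcessorLike, attributeName: string) {
    this.delegate = delegate;
    this.attributeName = attributeName;
  }

  onEnd(span: ReadableSpanLike): void {
    // Build a filtered copy so the span seen by other processors is not mutated.
    const attributes: Record<string, unknown> = {};
    for (const key of Object.keys(span.attributes)) {
      if (key !== this.attributeName) {
        attributes[key] = span.attributes[key];
      }
    }
    this.delegate.onEnd({ ...span, attributes });
  }
}

// Usage: wrap a recording processor and check what it receives.
const receivedSpans: ReadableSpanLike[] = [];
const wrappedProcessor = new AttributeDroppingProcessor(
  {
    onEnd: (span) => {
      receivedSpans.push(span);
    },
  },
  '_ab_should_track'
);
wrappedProcessor.onEnd({ name: 'chat', attributes: { _ab_should_track: true, model: 'example' } });
```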

Member Author

That makes sense. I shamelessly copied this code from the base_inference_span_processor.ts.

if (shouldTrack) {
  span.setAttribute('_should_track', true);
  this.delegate.onStart(span, parentContext);
}

Could we address it in another PR, since it already affects other span processors?

Member

I'm fine with it. But I'm not a privileged reviewer in this repo. :)

Member

@trentm trentm left a comment

The OTel Node.js usage looks sane to me.

Comment thread x-pack/platform/plugins/shared/agent_builder/server/plugin.ts Outdated
@machadoum machadoum added the backport:skip, release_note:feature, Team:agent-builder, and feature:agent-builder labels Apr 30, 2026
@machadoum machadoum self-assigned this Apr 30, 2026
@machadoum machadoum removed the release_note:feature label Apr 30, 2026
Comment thread x-pack/platform/plugins/shared/agent_builder/server/tracing/register_tracing.ts Outdated
@machadoum machadoum marked this pull request as ready for review April 30, 2026 11:58
@machadoum machadoum requested review from a team as code owners April 30, 2026 11:58
Comment thread package.json Outdated
"@opentelemetry/instrumentation-http": "0.214.0",
"@opentelemetry/instrumentation-undici": "0.24.0",
"@opentelemetry/otlp-exporter-base": "0.214.0",
"@opentelemetry/otlp-transformer": "0.214.0",

Contributor

There's a note that this dependency is intended for internal use only, https://www.npmjs.com/package/@opentelemetry/otlp-transformer.

Might be worth probing out whether this is an actual risk or we can expect this to stabilize?

Member Author

@trentm Do you think that using otlp-transformer for serializing the spans is a risk?

    const serialized = ProtobufTraceSerializer.serializeRequest(spans);

https://github.com/elastic/kibana/pull/265290/changes#diff-aebb43311f0beef2700524e5d3da07aeef6950094a687c9a48b1241aa2d1d0beR37-R38

There is no public alternative: OTel JS provides no other way to serialize ReadableSpan[] into OTLP protobuf/JSON. The only options are to use this package or to hand-roll protobuf serialization, which sounds like a much worse alternative. Every OTLP exporter in the OTel JS ecosystem depends on it: exporter-trace-otlp-proto, exporter-trace-otlp-http, exporter-logs-otlp-grpc, otlp-exporter-base, sdk-node.

Member

OTel JS is just not fully at 1.x (aka "stable") yet. One component that isn't yet "stable" is its OTLP exporters (of which the otlp-transformer package is a part).

No, I don't think having otlp-transformer as a transitive dep is a "risk".
These packages are what every user of OTel JS is using to export OTLP data.

However, why was this dep explicitly added? I don't see it used explicitly anywhere.
I see export { ElasticsearchOtlpExporter } from './src/elasticsearch_otlp_exporter'; in the PR. Is there a new "elasticsearch_otlp_exporter.ts" file that hasn't been added to this PR?

Note that the required buildkite check is failing.

Member Author

@machadoum machadoum May 1, 2026

However, why was this dep explicitly added? I don't see it used explicitly anywhere.

Sorry, rookie mistake. I failed to push the file after it was moved during a code review improvement. Sean also caught the same problem here: #265290 (comment)

If you refresh the page it will be there.

Member

Okay, I see.

As @SrdjanLL points out, otlp-transformer is primarily written as an internal package that is shared between OTel JS's OTLP exporters for the different signals (traces, metrics, logs). Its interface won't change without a noted breaking change. Because it is still 0.x, per semver, that "breaking" change version will be a new minor. So, for example, 0.214.0 -> 0.215.0 will potentially be breaking and you'd need to watch for that. Type changes, if any, would most likely automatically point out a breaking issue.
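
The versioning rule described above can be turned into a small check for dependency bumps. This is a hypothetical helper, not part of the PR or of any OTel tooling: it treats any major change as potentially breaking and, for 0.x versions, any minor change as well, which is the convention the comment describes for otlp-transformer.

```typescript
// Hypothetical helper: flags version bumps that may be breaking under the 0.x convention.
function isPotentiallyBreaking(from: string, to: string): boolean {
  const [fromMajor, fromMinor] = from.split('.').map(Number);
  const [toMajor, toMinor] = to.split('.').map(Number);
  if (fromMajor !== toMajor) {
    return true; // A major bump is always treated as potentially breaking.
  }
  // While the major version is 0, a minor bump may carry breaking changes.
  return fromMajor === 0 && fromMinor !== toMinor;
}

const minorBumpOnZero = isPotentiallyBreaking('0.214.0', '0.215.0');
const patchBumpOnZero = isPotentiallyBreaking('0.214.0', '0.214.1');
const minorBumpOnStable = isPotentiallyBreaking('1.2.0', '1.3.0');
```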

@machadoum machadoum requested review from a team as code owners April 30, 2026 15:10
@botelastic botelastic Bot added the Team:One Workflow label Apr 30, 2026
Contributor

@dmlemeshko dmlemeshko left a comment

x-pack/platform/test/tsconfig.json changes LGTM

Comment thread x-pack/platform/plugins/shared/agent_builder/tsconfig.json
Member

@qn895 qn895 left a comment

AI Infra LGTM

Comment thread x-pack/platform/plugins/shared/agent_builder/server/tracing/register_tracing.ts Outdated
Comment thread x-pack/platform/plugins/shared/agent_builder/server/tracing/register_tracing.ts Outdated
Comment thread x-pack/platform/plugins/shared/agent_builder/server/config.ts Outdated
Comment thread src/platform/packages/shared/kbn-tracing/index.ts
@machadoum machadoum removed the request for review from a team May 1, 2026 12:46
Contributor

@elena-shostak elena-shostak left a comment

@opentelemetry/otlp-transformer dependency LGTM

Member

@seanstory seanstory left a comment

LGTM!


/**
* A {@link tracing.SpanExporter} that ships OTLP-protobuf encoded spans
* to Elasticsearch's native `/_otlp/v1/traces` endpoint via the

Member

I knew about /_otlp/v1/metrics. Is traces now supported as well?

Member Author

Yes, it was recently merged together with logs in elastic/elasticsearch#147811

@machadoum machadoum enabled auto-merge (squash) May 6, 2026 12:11
@kibanamachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #2 / useWorkflowExecutions should pass executionTypes filter to query params

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/inference-tracing | 25 | 30 | +5 |
| @kbn/tracing | 21 | 28 | +7 |
| total | | | +12 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/inference-tracing | 29 | 34 | +5 |
| @kbn/tracing | 23 | 31 | +8 |
| total | | | +13 |

History

cc @machadoum

@machadoum machadoum merged commit 5c868d2 into elastic:main May 6, 2026
33 checks passed
ersin-erdal pushed a commit to ersin-erdal/kibana that referenced this pull request May 6, 2026

Labels

backport:skip, feature:agent-builder, release_note:skip, Team:agent-builder, Team:One Workflow, v9.5.0


10 participants