
Migrate server-side apm.addLabels to OTel dual-write helpers#259619

Merged
Bamieh merged 11 commits into main from otel-labels
Apr 2, 2026

@Bamieh
Contributor

@Bamieh Bamieh commented Mar 25, 2026

Summary

Creates dual-write helper functions (addSpanLabels, addTransactionLabels) in @kbn/apm-utils that write labels to both APM and OpenTelemetry simultaneously, then migrates all actionable server-side addLabels call sites to use these helpers.

This ensures contextual metadata (execution context, rule IDs, connector types, feature flags, etc.) is preserved regardless of which telemetry backend is active. The change is purely additive — existing APM labels continue to work unchanged.

What changed

  • New helpers in @kbn/apm-utils: addSpanLabels and addTransactionLabels that dual-write to APM addLabels and OTel trace.getActiveSpan()?.setAttributes(). OTel attributes are auto-prefixed with kibana., with an otelAttributes escape hatch for custom attribute names.
  • withSpan fix: Now dual-writes the labels option to OTel span attributes even when APM IS started (previously OTel labels were only written when APM was off).
  • 18 call sites migrated across core, platform, and x-pack plugins (items 2-20 from the design doc, excluding apm_tracer.ts).
  • root/index.ts unchanged (item 1) — dynamic global labels remain APM-only since OTel resources are immutable after provider creation.
  • telemetry_collection_manager/plugin.ts — labels removed since they were already set via withActiveSpan attributes on span creation.
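
The dual-write pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code merged in @kbn/apm-utils: `ApmLike`, `OtelSpanLike`, `DualWriteOptions`, and `dualWriteLabels` are hypothetical names standing in for the APM agent, the active OTel span, and the real helpers, and the way `otelAttributes` is merged (unprefixed, on top of the prefixed labels) is an assumption. Only `prefixKeys` and the `kibana.` prefix appear in the PR itself.

```typescript
// Hypothetical stand-ins for the two backends, so the sketch runs without
// elastic-apm-node or @opentelemetry/api installed.
type LabelValue = string | number | boolean;
type Labels = Record<string, LabelValue>;

interface ApmLike {
  addLabels(labels: Labels): void;
}

interface OtelSpanLike {
  setAttributes(attributes: Labels): void;
}

// Prefix every key, e.g. { ruleId: 'abc' } -> { 'kibana.ruleId': 'abc' }.
function prefixKeys(labels: Labels, prefix: string): Labels {
  const out: Labels = {};
  for (const [key, value] of Object.entries(labels)) {
    out[`${prefix}${key}`] = value;
  }
  return out;
}

interface DualWriteOptions {
  // Assumed escape hatch: attributes written to OTel as-is, without the prefix.
  otelAttributes?: Labels;
}

// Write the same labels to APM and, if a span is active, to OTel.
function dualWriteLabels(
  apm: ApmLike,
  span: OtelSpanLike | undefined,
  labels: Labels,
  options: DualWriteOptions = {}
): void {
  // APM receives the labels unchanged, so existing APM behavior is untouched.
  apm.addLabels(labels);
  // OTel receives prefixed keys; a missing active span is silently skipped.
  span?.setAttributes({
    ...prefixKeys(labels, 'kibana.'),
    ...(options.otelAttributes ?? {}),
  });
}
```

Passing the backends in as arguments is a choice made for this sketch so that it stays self-contained; the PR description says the real helpers call APM addLabels and trace.getActiveSpan()?.setAttributes() directly.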

Closes #224830

Checklist

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines
  • Review the backport guidelines and apply applicable backport:* labels.

Identify risks

  • Double-write overhead (negligible): setAttributes is a cheap in-memory operation. Both agents are already loaded.

@Bamieh Bamieh requested review from a team as code owners March 25, 2026 16:43
@Bamieh Bamieh added the release_note:skip (Skip the PR/issue when compiling release notes) and backport:skip (This PR does not require backporting) labels Mar 25, 2026
@Bamieh Bamieh requested a review from a team as a code owner March 25, 2026 16:43
@Bamieh Bamieh added the v9.4.0 label Mar 25, 2026
@Bamieh Bamieh requested review from a team as code owners March 25, 2026 16:43
@botelastic botelastic bot added the Team:Fleet (Team label for Observability Data Collection Fleet team), Team:obs-presentation (Focus: APM UI, Infra UI, Hosts UI, Universal Profiling, Obs Overview and left Navigation), and Team:One Workflow (Team label for One Workflow, Workflow automation) labels Mar 25, 2026
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@elasticmachine
Contributor

Pinging @elastic/obs-presentation-team (Team:obs-presentation)

{
  attributes: {
    'span.type': type,
    ...(labels ? prefixKeys(labels, 'kibana.') : {}),
Member

Nice!

@Bamieh Bamieh requested review from a team as code owners March 25, 2026 17:21
Contributor

@nkhristinin nkhristinin left a comment

DE changes LGTM, code review only

@elasticmachine
Contributor

Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

@AlexanderWert
Member

AlexanderWert commented Mar 27, 2026

Just a heads-up about the dual-write helpers.

When a label foo.bar is added as a classic APM label, APM Server rewrites it and stores it in the field labels.foo_bar. As an OTel attribute, foo.bar is instead stored as attributes.foo.bar, with an implicit alias foo.bar.

See also:
https://fictional-chainsaw-gz3er9g.pages.github.io/migration/migration-apm/#migrating-usage-of-labels

So if the goal is 100% backwards compatibility with existing queries and other assets that use these attributes, one option is for the helper to transform foo.bar to labels.foo_bar before writing it to the OTel span attributes.

The alternative is to keep it as is and adjust all the existing assets that rely on labels.
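
To make the naming difference concrete, here is a small sketch of the two field names that a single key ends up under. The function names are hypothetical, and the dot-to-underscore rule is inferred only from the foo.bar example in this comment; APM Server may rewrite other characters as well.

```typescript
// Field a classic APM label is stored under, following the example above:
// dots in the key become underscores and the result sits under "labels.".
function apmLabelField(key: string): string {
  return `labels.${key.replace(/\./g, '_')}`;
}

// Field an OTel span attribute is stored under: the key keeps its dots
// and is nested under "attributes.".
function otelAttributeField(key: string): string {
  return `attributes.${key}`;
}

// The same key resolves to two different fields depending on the backend.
const key = 'foo.bar';
const fields = {
  apm: apmLabelField(key),
  otel: otelAttributeField(key),
};
```

A query written against the first field would not match data stored under the second, which is why existing assets would need either the transformation in the helper or an update.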

Contributor

@maximpn maximpn left a comment

Rule Management changes LGTM

@Bamieh
Contributor Author

Bamieh commented Mar 27, 2026

@AlexanderWert Thanks for the input. We are moving away from APM entirely, so I would much rather keep the nested attributes.foo.bar form that OTel produces than follow APM's implicit transformation. I want to keep the implementation as is and do a little work now to adjust the existing assets that rely on labels as the path forward (we don't have that many anyway, and we'll need to adjust to OTel reports in any case).

Bamieh and others added 4 commits March 27, 2026 19:49
@macroscopeapp
Contributor

macroscopeapp bot commented Mar 27, 2026

Approvability

Verdict: Needs human review

This PR introduces new OTel dual-write helpers and migrates ~15 call sites across multiple team-owned packages. While the migration pattern is consistent, it adds new runtime behavior (OTel span attributes) and touches code owned by ~10 different teams - none of which the author owns. The designated code owners should review changes to their observability instrumentation.

No code changes detected at 1a1a5f5. Prior analysis still applies.

Contributor

@shahargl shahargl left a comment

tested manually, lgtm

Contributor

@rmyz rmyz left a comment

LGTM

@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #58 / Agent Builder sidebar Sidebar Conversation Flow sends a message and receives a response

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/apm-utils | 15 | 22 | +7 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/apm-utils | 16 | 23 | +7 |

Labels

  • backport:skip (This PR does not require backporting)
  • release_note:skip (Skip the PR/issue when compiling release notes)
  • Team:Detection Rule Management (Security Detection Rule Management Team)
  • Team:Fleet (Team label for Observability Data Collection Fleet team)
  • Team:obs-presentation (Focus: APM UI, Infra UI, Hosts UI, Universal Profiling, Obs Overview and left Navigation)
  • Team:One Workflow (Team label for One Workflow, Workflow automation)
  • Team:Security Generative AI (Security Generative AI)
  • v9.4.0

Development

Successfully merging this pull request may close these issues.

[Server-side OTel] Revisit apm.addLabel to use OTel attributes