[Inference] Instrument inference with OpenTelemetry#218694

Merged
dgieselaar merged 18 commits into elastic:main from dgieselaar:inference-output-langfuse
May 7, 2025

Conversation

@dgieselaar
Contributor

@dgieselaar dgieselaar commented Apr 19, 2025

Instrument the inference `chatComplete` API with OpenTelemetry, and export helper functions to create spans with the right semconv attributes. Additionally, optionally export to Langfuse or Phoenix.
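As a rough illustration of what such a helper might produce, here is a self-contained sketch (not the actual plugin code; the helper name and option shape are hypothetical) of the attributes a chat-completion span could carry, using attribute names from the in-flux GenAI semantic conventions:

```typescript
// Hypothetical sketch of the semconv attributes an inference span helper
// might set. The attribute names (gen_ai.*) come from the incubating
// OpenTelemetry GenAI semantic conventions; the helper itself is invented
// here for illustration.
type SpanAttributes = Record<string, string | number>;

function chatCompleteSpanAttributes(opts: {
  system: string; // e.g. "openai"
  model: string;  // e.g. "gpt-4o"
}): SpanAttributes {
  return {
    'gen_ai.operation.name': 'chat',
    'gen_ai.system': opts.system,
    'gen_ai.request.model': opts.model,
  };
}

const attrs = chatCompleteSpanAttributes({ system: 'openai', model: 'gpt-4o' });
console.log(attrs['gen_ai.operation.name']); // 'chat'
```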

Centralizes OpenTelemetry setup

As this is the first instance of OpenTelemetry based tracing (we already have metrics in the MonitoringCollection plugin), some bootstrapping code is necessary to centrally configure OpenTelemetry. To this end, I've added the following config settings:

  • `telemetry.tracing.enabled`: whether OpenTelemetry tracing is enabled (defaults to undefined; if undefined, falls back to `telemetry.enabled`)
  • `telemetry.tracing.sample_rate` (defaults to 1)

The naming of these configuration settings is mostly in-line with the Elasticsearch tracing settings.
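The fallback semantics can be sketched as follows; this is a minimal self-contained illustration of the behavior described above, not actual Kibana code, and the helper names are invented:

```typescript
// Hypothetical helpers illustrating the config fallback described above:
// telemetry.tracing.enabled defaults to undefined and, when undefined,
// falls back to telemetry.enabled; sample_rate defaults to 1.
interface TelemetryConfig {
  enabled: boolean;
  tracing?: { enabled?: boolean; sample_rate?: number };
}

function isTracingEnabled(config: TelemetryConfig): boolean {
  // ?? only triggers on null/undefined, so an explicit `false` is honored.
  return config.tracing?.enabled ?? config.enabled;
}

function tracingSampleRate(config: TelemetryConfig): number {
  return config.tracing?.sample_rate ?? 1;
}

console.log(isTracingEnabled({ enabled: true })); // true (falls back to telemetry.enabled)
console.log(isTracingEnabled({ enabled: true, tracing: { enabled: false } })); // false
```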

The following packages (containing bootstrapping logic, utility functions, types and config schemas) were added:

  • `@kbn/telemetry`
  • `@kbn/telemetry-config`
  • `@kbn/tracing`

The OpenTelemetry bootstrapping depends on @kbn/apm-config-loader, as it has the same constraints - it needs to run before any other code, and it needs to read the raw config.

Additionally, a root `telemetry` logger was added that captures OpenTelemetry logs.

Note that there is no default exporter for spans, which means that although spans are being recorded, they do not get exported.

Instrument chatComplete calls

Calls to `chatComplete` now create OpenTelemetry spans, roughly following semantic conventions (which for GenAI are very much in flux). Some helper functions were added to create other inference spans. These helper functions use baggage to determine whether the created inference span is the "root" of an inference trace. This allows us to export these spans as if they were root spans - something that is needed to easily visualize them in other tools.
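The baggage-based root detection can be sketched as follows. This is a self-contained illustration of the idea, not the actual helper implementation: a real implementation would use `@opentelemetry/api` baggage on the active context, whereas here a plain map stands in for propagated baggage, and the `inference.root` key is invented for the example:

```typescript
// Sketch of root detection via baggage: the first inference span in a
// trace sees no flag and is treated as the root; it then sets the flag
// so that descendant inference spans are not.
type Baggage = Map<string, string>;

function startInferenceSpan(baggage: Baggage): { isRoot: boolean; baggage: Baggage } {
  // No ancestor set the flag, so this span is the inference root.
  const isRoot = !baggage.has('inference.root');
  // Propagate the flag to child contexts.
  const child = new Map(baggage);
  child.set('inference.root', 'true');
  return { isRoot, baggage: child };
}

const outer = startInferenceSpan(new Map());
const inner = startInferenceSpan(outer.baggage);
console.log(outer.isRoot, inner.isRoot); // true false
```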

Leveraging these inference spans, two exporters are added: one for Phoenix and one for Langfuse, two open-source LLM observability suites. This allows engineers who use the Inference plugin to inspect and improve their LLM-based workflows with much less effort.

For Phoenix and Langfuse, service scripts were added. Run `node scripts/phoenix` or `node scripts/langfuse` to get started. Both scripts work with zero config - they will log the generated Kibana config to stdout.

@dgieselaar dgieselaar force-pushed the inference-output-langfuse branch 10 times, most recently from f3ab323 to be2ebc6 on April 26, 2025 09:00
@dgieselaar dgieselaar force-pushed the inference-output-langfuse branch 2 times, most recently from adbf4b7 to 90412a2 on April 27, 2025 14:32
@dgieselaar dgieselaar added the backport:version, v9.1.0, v8.19.0, and release_note:skip labels Apr 27, 2025
@dgieselaar dgieselaar force-pushed the inference-output-langfuse branch from 90412a2 to 9f5c364 on April 27, 2025 14:34
@dgieselaar dgieselaar marked this pull request as ready for review April 27, 2025 14:34
Contributor

@consulthys consulthys left a comment


LGTM for Stack Monitoring

@botelastic botelastic bot added the ci:project-deploy-observability and Team:Obs AI Assistant labels Apr 27, 2025
```typescript
 * @param serviceName The service name used in resource attributes
 * @returns A function that can be called on shutdown and allows exporters to flush their queue.
 */
export const initTelemetry = (
```
Member


@dgieselaar Do you see a problem when both the OTel SDK and the APM Node.js agent are active at the same time? Should we set the `active` flag to false when we enable the OTel SDK?

Contributor Author


All auto-instrumentations for @opentelemetry should be disabled. Only spans in the context of inference calls (`chatComplete`) should be recorded and exported, which are relatively low in volume. The elasticsearch-js client always creates spans via its Transport, so those will be recorded as well - kind of accidentally; I'm not sure that's how it should work, but in this case it's also convenient. So they should be able to run in parallel without any problems, but if we migrate more things to OTel, I think we should consider one or the other.

Member


Gotcha, thanks for the context and agree we need to move towards the OTel SDK soonish.

@dgieselaar
Contributor Author

@elasticmachine merge upstream

Member

@qn895 qn895 left a comment


LGTM 🎉

@elasticmachine
Contributor

⏳ Build in-progress

  • Buildkite Build
  • Commit: 640b290
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-218694-640b290e480d

History

@dgieselaar dgieselaar merged commit 2387e3b into elastic:main May 7, 2025
11 checks passed
@kibanamachine
Contributor

Starting backport for target branches: 8.19

https://github.com/elastic/kibana/actions/runs/14880279077

@kibanamachine
Contributor

💔 Backport failed

The pull request could not be backported due to the following error:
Git clone failed with exit code: 128

Manual backport

To create the backport manually run:

node scripts/backport --pr 218694

Questions ?

Please refer to the Backport tool documentation

dgieselaar added a commit to dgieselaar/kibana that referenced this pull request May 7, 2025
(cherry picked from commit 2387e3b)
dgieselaar added a commit to dgieselaar/kibana that referenced this pull request May 7, 2025
(cherry picked from commit 2387e3b)

# Conflicts:
#	.github/CODEOWNERS
#	renovate.json
#	src/cli/tsconfig.json
#	src/platform/plugins/shared/telemetry/server/config/config.ts
#	x-pack/platform/plugins/shared/observability_ai_assistant/server/service/client/index.ts
#	yarn.lock
@dgieselaar
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

@kibanamachine
Contributor

Looks like this PR has a backport PR but it still hasn't been merged. Please merge it ASAP to keep the branches relatively in sync.
cc: @dgieselaar

dgieselaar added a commit that referenced this pull request May 9, 2025
…220349)

# Backport

This will backport the following commits from `main` to `8.19`:
- [[Inference] Instrument inference with OpenTelemetry
(#218694)](#218694)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
@kibanamachine kibanamachine removed the backport missing label (added to PRs automatically when they are determined to be missing a backport) May 9, 2025
akowalska622 pushed a commit to akowalska622/kibana that referenced this pull request May 29, 2025
qn895 pushed a commit to qn895/kibana that referenced this pull request Jun 3, 2025

Labels

  • backport:version - Backport to applied version labels
  • ci:project-deploy-observability - Create an Observability project
  • release_note:skip - Skip the PR/issue when compiling release notes
  • Team:Obs AI Assistant - Observability AI Assistant
  • v8.19.0
  • v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants