Otel agent distro diferentiation in APM telemetry collection#208770

Closed
marcogavaz wants to merge 9 commits into elastic:main from marcogavaz:update-telemetry-collection

@marcogavaz marcogavaz commented Jan 29, 2025

Summary

As requested in this comment and tracked by issue #186281, this PR introduces the main changes needed to capture open-ended OTel distro agent names that follow the pattern opentelemetry/<LANGUAGE>/<DISTRO_NAME>. These changes ensure that new agent names won’t be dropped from telemetry.
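
To make the naming pattern concrete, here is a minimal sketch of how such an agent name could be split into its parts. The helper `parseOtelDistroAgentName` and its return type are hypothetical and not part of this PR; the sketch only covers names with the `opentelemetry/` prefix described above.

```typescript
// Hypothetical helper for illustration only; it is not part of the PR.
interface ParsedOtelAgentName {
  language: string;
  distro?: string; // absent for names like "opentelemetry/java"
}

function parseOtelDistroAgentName(agentName: string): ParsedOtelAgentName | undefined {
  const parts = agentName.split('/');
  // Accept only "opentelemetry/<LANGUAGE>" or "opentelemetry/<LANGUAGE>/<DISTRO_NAME>".
  if (parts[0] !== 'opentelemetry' || parts.length < 2 || parts.length > 3) {
    return undefined;
  }
  const language = parts[1];
  if (!language) {
    return undefined;
  }
  const distro = parts[2];
  return distro ? { language, distro } : { language };
}
```

With this sketch, `opentelemetry/java/elastic` yields the language `java` and the distro `elastic`, while a non-OTel name such as `java` yields `undefined`.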

Two tasks were changed: the services task and the agents task.

Changes introduced in services task

Previously, the telemetry code enumerated OTel agents using a static array (OPEN_TELEMETRY_AGENT_NAMES) that listed the known agent names (opentelemetry/java, opentelemetry/go, etc.). Now it starts with an empty object and dynamically adds whatever agent names the Elasticsearch query returns. In other words, the results of the terms aggregation define which agent names exist, rather than being forced to match a fixed array.

This is mainly handled in:

const servicesPerOtelAgents = await OPEN_TELEMETRY_BASE_AGENT_NAMES.reduce(
  (prevJob, baseAgentName) => {
    return prevJob.then(async (accData) => {
      const response = await telemetryClient.search({
        // same prefix query...
        aggs: {
          agent_name: {
            terms: {
              field: AGENT_NAME,
              size: 1000,
            },
            aggs: {
              services: {
                cardinality: {
                  field: SERVICE_NAME,
                },
              },
            },
          },
        },
      });

      const aggregatedServices: Record<string, number> = {};

      // For each agent_name bucket we actually find...
      for (const bucket of response.aggregations?.agent_name.buckets ?? []) {
        // could be "opentelemetry/java/elastic"
        aggregatedServices[bucket.key as string] = bucket.services.value || 0;
      }

      // merge into accumulated data
      return { ...accData, ...aggregatedServices };
    });
  },
  Promise.resolve({} as Record<string, number>)
);

For example, if a new agent.name such as opentelemetry/java/elastic appears in the data, it is automatically added to aggregatedServices.
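
The sequential merge above can be tried in isolation with a stubbed search function. The following is a minimal sketch under stated assumptions: `SearchByPrefix`, `bucketsToCounts`, `collectServicesPerOtelAgent`, and the stub data are made up for illustration and stand in for `telemetryClient.search` and the real response shape.

```typescript
// Simplified bucket shape: a terms bucket with a cardinality sub-aggregation.
interface AgentNameBucket {
  key: string;
  services: { value: number };
}

// Hypothetical stand-in for telemetryClient.search: returns the buckets for one prefix.
type SearchByPrefix = (prefix: string) => Promise<AgentNameBucket[]>;

// Turn the buckets of one response into an "agent name -> service count" record.
function bucketsToCounts(buckets: AgentNameBucket[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const bucket of buckets) {
    counts[bucket.key] = bucket.services.value || 0;
  }
  return counts;
}

// Run one search per base name, one after another, merging each result.
function collectServicesPerOtelAgent(
  baseAgentNames: string[],
  search: SearchByPrefix
): Promise<Record<string, number>> {
  return baseAgentNames.reduce(
    (prevJob, baseAgentName) =>
      prevJob.then(async (accData) => ({
        ...accData,
        ...bucketsToCounts(await search(baseAgentName)),
      })),
    Promise.resolve({} as Record<string, number>)
  );
}

// Stub data, made up for this example.
const stubBuckets: AgentNameBucket[] = [
  { key: 'opentelemetry/java/elastic', services: { value: 2 } },
  { key: 'otlp/elastic', services: { value: 1 } },
];
const stubSearch: SearchByPrefix = async (prefix) =>
  stubBuckets.filter((bucket) => bucket.key.startsWith(prefix));
```

Calling `collectServicesPerOtelAgent(['opentelemetry', 'otlp'], stubSearch)` resolves to a record containing both stubbed names, without either of them being listed anywhere in advance.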

Changes introduced in agents task

Similarly, the telemetry code previously started with a large dictionary built from OPEN_TELEMETRY_AGENT_NAMES. Then, for each prefix (opentelemetry or otlp), it aggregated subfields (framework, language, and runtime versions) and tried to map them to the dictionary.
Now, after the prefix-based aggregation runs, the code looks at the returned buckets. For each agentBucket (with a key such as opentelemetry/<LANGUAGE>/<DISTRO_NAME>), an entry is created dynamically:

      const agentDataWithOtel = await OPEN_TELEMETRY_BASE_AGENT_NAMES.reduce(
        async (prevJob, baseAgentName) => {
          const data = await prevJob;
  
          const response = await telemetryClient.search({
            index: [indices.error, indices.metric, indices.transaction],
            body: {
              track_total_hits: false,
              size: 0,
              timeout,
              query: {
                bool: {
                  filter: [{ prefix: { [AGENT_NAME]: baseAgentName } }, range1d],
                },
              },
              sort: {
                '@timestamp': 'desc',
              },
              aggs: {
                agent_name: {
                  terms: {
                    field: AGENT_NAME,
                    size: 1000,
                  },
                  aggs: agentNameAggs,
                },
              },
            },
          });
  
          if (!response.aggregations) {
            return data;
          }
  
          const dynamicAgentData: NonNullable<APMTelemetry['agents']> = {};
  
          for (const agentBucket of response.aggregations.agent_name.buckets) {
            const agentKey = agentBucket.key as string;
  
            dynamicAgentData[agentKey] = {

This PR should be enough to differentiate our distro from any other OTel distro, as requested.
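
The difference between the previous static approach and the dynamic one can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: `KNOWN_AGENT_NAMES` is a made-up subset rather than the real constant, and `collectStatic` and `collectDynamic` are hypothetical names that do not appear in the Kibana code.

```typescript
// Assumed, illustrative subset of a static allowlist (not the real constant).
const KNOWN_AGENT_NAMES = ['opentelemetry/java', 'opentelemetry/go'];

type Counts = Record<string, number>;

// Static approach: only allowlisted names are reported; unknown names are dropped.
function collectStatic(observed: Counts, known: string[]): Counts {
  const result: Counts = {};
  for (const name of known) {
    result[name] = observed[name] ?? 0;
  }
  return result;
}

// Dynamic approach: every observed name is reported as-is.
function collectDynamic(observed: Counts): Counts {
  return { ...observed };
}

// Made-up observation that includes a distro-specific agent name.
const observedCounts: Counts = {
  'opentelemetry/java': 3,
  'opentelemetry/java/elastic': 1,
};
```

In this sketch, `collectStatic(observedCounts, KNOWN_AGENT_NAMES)` loses the `opentelemetry/java/elastic` entry, whereas `collectDynamic(observedCounts)` keeps it, which is the behavior needed to tell distros apart.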

Checklist

Check the PR satisfies following conditions.

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines

Identify risks

Does this PR introduce any risks? For example, consider risks like hard to test bugs, performance regression, potential of data loss.

Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.

@marcogavaz marcogavaz changed the title from "otel agent distro diferentiation" to "Otel agent distro diferentiation in APM telemetry collection" Jan 29, 2025
@kpatticha kpatticha left a comment

The code looks good to me. I tested it locally and it works as expected.

I left a few comments related to the types.

@kpatticha commented

Not related to the work in this PR, but I noticed the following code:

const servicesPerAgents: Record<AgentName, number> = {
  ...servicesPerAgentExcludingOtel,
  ...servicesPerOtelAgents,
};

return {
  has_any_services_per_official_agent: sum(Object.values(servicesPerAgents)) > 0,
  has_any_services: services?.hits?.total?.value > 0,
  services_per_agent: servicesPerAgents,
};

When I ran my local environment with only OTel data (using synthtrace), I got the following response:

"has_any_services_per_official_agent": true,
"has_any_services": true,
"services_per_agent": {
  "dotnet": 0,
  "go": 0,
  "iOS/swift": 0,
  "java": 0,
  "js-base": 0,
  "nodejs": 0,
  "php": 0,
  "python": 0,
  "ruby": 0,
  "rum-js": 0,
  "android/java": 0,
  "ios/swift": 0,
  "otlp/elastic": 1,
  "opentelemetry/java/elastic": 1
},

I would expect has_any_services_per_official_agent to be false.

cc @marcogavaz @basepi
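
One possible way to get the behavior expected in the comment above is to ignore OTel agent names when computing the flag. This is only a sketch, not the fix adopted in Kibana: the helper names are hypothetical, and treating every name that starts with `opentelemetry` or `otlp` as non-official is an assumption.

```typescript
// Assumed OTel base prefixes, taken from the PR description.
const OTEL_BASE_PREFIXES = ['opentelemetry', 'otlp'];

// Hypothetical helper: true for "otlp", "otlp/elastic", "opentelemetry/java/elastic", etc.
function isOtelAgentName(agentName: string): boolean {
  return OTEL_BASE_PREFIXES.some(
    (prefix) => agentName === prefix || agentName.startsWith(`${prefix}/`)
  );
}

// Hypothetical helper: only non-OTel agents with at least one service count.
function hasAnyServicesPerOfficialAgent(servicesPerAgent: Record<string, number>): boolean {
  return Object.entries(servicesPerAgent).some(
    ([agentName, count]) => !isOtelAgentName(agentName) && count > 0
  );
}

// Subset of the response quoted in the comment above (OTel-only data).
const otelOnlyServices: Record<string, number> = {
  dotnet: 0,
  java: 0,
  nodejs: 0,
  'otlp/elastic': 1,
  'opentelemetry/java/elastic': 1,
};
```

For the OTel-only data, `hasAnyServicesPerOfficialAgent(otelOnlyServices)` evaluates to false, while adding a service for a non-OTel agent such as `java` makes it true.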

@marcogavaz commented

> The code looks good to me. I tested it locally and it works as expected.
>
> I left a few comments related to the types.

Great, I'll integrate your suggestions ASAP! Could we maybe have a meeting where you show me how you test locally (which data you use, how you set up the deployment, and so on)? That would be great, so I can also bring this knowledge to the rest of the observability BI team. I'll send you an invitation!

@marcogavaz marcogavaz requested a review from kpatticha February 5, 2025 13:44
@marcogavaz marcogavaz marked this pull request as ready for review February 5, 2025 13:44
@marcogavaz marcogavaz requested a review from a team February 5, 2025 13:44
@botelastic botelastic bot added the Team:obs-ux-infra_services - DEPRECATED DEPRECATED - Use Team:obs-presentation. label Feb 5, 2025
@elasticmachine commented

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

@marcogavaz marcogavaz self-assigned this Feb 6, 2025
@marcogavaz marcogavaz force-pushed the update-telemetry-collection branch from b704464 to 340325a February 10, 2025 15:24
…t --include-path /api/status --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/fleet --include-path /api/dashboards --update'
elasticmachine commented Feb 10, 2025

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #3 / data telemetry collection tasks agents should return agent data per agent name
  • [job] [logs] Jest Tests #3 / data telemetry collection tasks agents should return agent data per agent name
  • [job] [logs] Jest Tests #3 / data telemetry collection tasks services should return services per agent name
  • [job] [logs] Jest Tests #3 / data telemetry collection tasks services should return services per agent name

Metrics [docs]

✅ unchanged

History

cc @marcogavaz

@marcogavaz marcogavaz closed this Feb 12, 2025
marcogavaz added a commit that referenced this pull request Feb 14, 2025
…#210775)

## Summary

This PR follows the [closed PR](#208770) and closes
https://github.com/elastic/observability-bi/issues/489

As requested in [this comment](#186281 (comment)) and tracked by the
issue #186281, this PR introduces the main changes needed to capture
open-ended OTel distro agent names that follow the pattern
`opentelemetry/<LANGUAGE>/<DISTRO_NAME>`. These changes ensure that new
agent names won’t be dropped from telemetry.



### Checklist

Check the PR satisfies following conditions. 

Reviewers should verify this PR satisfies this list as well.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This was checked for breaking HTTP API changes, and any breaking
changes have been approved by the breaking-change committee. The
`release_note:breaking` label should be applied in these situations.
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Mar 22, 2025
…elastic#210775)

trentm added a commit that referenced this pull request Jun 17, 2025
…sue 489 (#210775) (#224160)

# Backport

This will backport the following commits from `main` to `8.19`:
- [APM telemetry collection Otel agent distro diferentiation - issue 489
(#210775)](#210775)

<!--- Backport version: 10.0.1 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Marco Gavazzoni <138492709+marcogavaz@users.noreply.github.com>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>