[processor/elasticapmprocessor] set agentVersion, normalize dataset, service name values#1058

Merged
lanre-ade merged 18 commits into elastic:main from lanre-ade:fix-mis-dataset-discrepancies on Mar 19, 2026

Conversation

@lanre-ade lanre-ade commented Mar 2, 2026

Contributor

Addresses data discrepancies between apm-data and mOTEL apm processor

Changes

• New sanitize package that centralizes dataset sanitization (truncation, restricted-char removal, label-key handling, service-name normalization) into processor/elasticapmprocessor/internal/sanitize, replacing scattered in-file sanitizers.
• Adds a ServiceName resource enrichment step (enabled by default) that normalizes service names to match apm-data conventions.
• Sets deployment.environment (alias for service.environment) to "unset" when absent, matching the apm-data fallback.
• Stops forcibly disabling agent version enrichment so agent.version is populated consistently.
• Replaces hardcoded data_stream.* keys with exported elasticattr constants (adds DataStreamType).
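
As a rough illustration of the service-name normalization described above, the following sketch replaces every character outside an allow-list with an underscore. The function name `cleanServiceName` and the exact allow-list (ASCII letters, digits, space, underscore, hyphen) are assumptions made for this example and may differ from the code in the sanitize package.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidServiceNameChars matches every character outside the assumed
// allow-list: ASCII letters, digits, space, underscore, and hyphen.
var invalidServiceNameChars = regexp.MustCompile(`[^a-zA-Z0-9 _-]`)

// cleanServiceName is a hypothetical stand-in for the PR's service name
// sanitizer. Each disallowed character is replaced by a single underscore.
func cleanServiceName(name string) string {
	return invalidServiceNameChars.ReplaceAllString(name, "_")
}

func main() {
	fmt.Println(cleanServiceName("my.service"))   // my_service
	fmt.Println(cleanServiceName("checkout-api")) // checkout-api
}
```

Truncation, which the sanitize package is also said to handle, is left out of this sketch.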

Comment thread processor/elasticapmprocessor/internal/enrichments/resource.go Outdated
Comment thread processor/elasticapmprocessor/internal/sanitize/sanitize.go
@lanre-ade lanre-ade marked this pull request as ready for review March 10, 2026 19:58
@lanre-ade lanre-ade requested review from a team as code owners March 10, 2026 19:58
@lanre-ade lanre-ade changed the title from "[processor/elasticapmprocessor] normalize dataset values and remove restricted chars" to "[processor/elasticapmprocessor] set agentVersion, normalize dataset, service name values" on Mar 10, 2026
coderabbitai Bot commented Mar 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0cdeca3b-12c2-4969-81a8-b13af10dae73

📥 Commits

Reviewing files that changed from the base of the PR and between 976aec0 and 1b57bea.

📒 Files selected for processing (3)
  • processor/elasticapmprocessor/go.mod
  • processor/elastictraceprocessor/go.mod
  • receiver/elasticapmintakereceiver/go.mod
✅ Files skipped from review due to trivial changes (3)
  • processor/elasticapmprocessor/go.mod
  • receiver/elasticapmintakereceiver/go.mod
  • processor/elastictraceprocessor/go.mod

📝 Walkthrough

Walkthrough

Adds a sanitize package and moves label/attribute sanitization there; introduces service name normalization and truncation utilities. Adds elasticattr constant DataStreamType. Expands resource enrichment config with DefaultDeploymentEnvironment and ServiceName, adds service name sanitization and default deployment.environment handling. Updates data stream routing to use elasticattr constants and sanitize functions. Updates tests and test fixtures to include agent.version: unknown and deployment.environment: unset.
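
The fallback values mentioned in the walkthrough (`agent.version: unknown` and `deployment.environment: unset`) can be pictured with a small sketch. It uses a plain `map[string]string` instead of the collector's attribute types, and the helper names `setDefault` and `applyResourceDefaults` are hypothetical. Treating an empty string as "absent" is also an assumption.

```go
package main

import "fmt"

// setDefault writes value under key only when the key is missing or empty.
// Treating an empty string as absent is an assumption of this sketch.
func setDefault(attrs map[string]string, key, value string) {
	if attrs[key] == "" {
		attrs[key] = value
	}
}

// applyResourceDefaults is a hypothetical helper that fills in the two
// fallback values named in the walkthrough without overwriting real data.
func applyResourceDefaults(attrs map[string]string) {
	setDefault(attrs, "agent.version", "unknown")
	setDefault(attrs, "deployment.environment", "unset")
}

func main() {
	attrs := map[string]string{"service.name": "checkout"}
	applyResourceDefaults(attrs)
	fmt.Println(attrs["agent.version"], attrs["deployment.environment"]) // unknown unset
}
```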


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@processor/elasticapmprocessor/internal/enrichments/logs_test.go`:
- Line 51: The test expectation is wrong because CleanServiceName sanitizes dots
to underscores; update the expected service name string in the logs test to
match CleanServiceName's output (change the expected "my.service" to
"my_service") so the assertion for the service.name field aligns with the
CleanServiceName behavior.

In `@processor/elasticapmprocessor/internal/routing/data_stream.go`:
- Line 68: The dataset name contains dots because the sanitizer doesn't replace
'.'; update the sanitizer so NormalizeServiceName replaces '.' with '_' by
adding '.' to the switch in replaceReservedRune
(processor/elasticapmprocessor/internal/sanitize/sanitize.go) so that the
function returns '_' for '.' (aligning with other reserved rune replacements),
rebuild and ensure attributes.PutStr(elasticattr.DataStreamDataset,
"apm.app."+sanitize.NormalizeServiceName(serviceName.Str())) produces
underscores for dotted service names.

In `@processor/elasticapmprocessor/testdata/elastic_span_http/output.yaml`:
- Around line 13-18: The golden fixture for elasticapmprocessor is out of sync:
the processor output must consistently include the new default resource
attributes "agent.version" and "deployment.environment"; update either the
processor logic in elasticapmprocessor (where resource attributes are populated)
to always set these defaults (e.g., ensure the code path that builds
Resource/Attributes populates keys "agent.version" and "deployment.environment"
with the expected string values instead of omitting them), or reconcile the test
golden file(s) to match the actual default values produced by the functions that
emit resource attributes so the build-and-test for elasticapmprocessor passes.
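
The dataset suggestion above (mapping reserved characters to `_` before building `apm.app.<service>`) could look roughly like the sketch below. The set of reserved runes and the lowercasing step are assumptions for illustration; `'.'` is included only because the review comment proposes it, and this thread does not show whether the merged code does the same. `datasetFor` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceReservedRune maps characters assumed to be unsafe in a data stream
// dataset to '_'. The exact set is an assumption of this sketch; '.' is
// included because the review comment above proposes it.
func replaceReservedRune(r rune) rune {
	switch r {
	case '\\', '/', '*', '?', '"', '<', '>', '|', ' ', ',', '#', ':', '-', '.':
		return '_'
	}
	return r
}

// normalizeServiceName lowercases the name (an assumption) and then
// replaces every reserved rune.
func normalizeServiceName(s string) string {
	return strings.Map(replaceReservedRune, strings.ToLower(s))
}

// datasetFor is a hypothetical helper that builds the dataset value.
func datasetFor(serviceName string) string {
	return "apm.app." + normalizeServiceName(serviceName)
}

func main() {
	fmt.Println(datasetFor("My.Service-API")) // apm.app.my_service_api
}
```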

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bae6ba1c-68e2-46af-8f7c-e7d2ad193cfc

📥 Commits

Reviewing files that changed from the base of the PR and between 09d4d53 and cb596d6.

📒 Files selected for processing (24)
  • internal/elasticattr/attributes.go
  • processor/elasticapmprocessor/internal/ecs/ecs_translation.go
  • processor/elasticapmprocessor/internal/enrichments/config/config.go
  • processor/elasticapmprocessor/internal/enrichments/logs_test.go
  • processor/elasticapmprocessor/internal/enrichments/metric_test.go
  • processor/elasticapmprocessor/internal/enrichments/resource.go
  • processor/elasticapmprocessor/internal/enrichments/resource_test.go
  • processor/elasticapmprocessor/internal/routing/data_stream.go
  • processor/elasticapmprocessor/internal/routing/data_stream_test.go
  • processor/elasticapmprocessor/internal/sanitize/sanitize.go
  • processor/elasticapmprocessor/processor.go
  • processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_otlp_exception_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_servicename_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_hostname/logs_output.yaml
  • processor/elasticapmprocessor/testdata/elastic_span_db/output.yaml
  • processor/elasticapmprocessor/testdata/elastic_span_http/output.yaml
  • processor/elasticapmprocessor/testdata/elastic_span_messaging/output.yaml
  • processor/elasticapmprocessor/testdata/elastic_txn_db/output.yaml
  • processor/elasticapmprocessor/testdata/elastic_txn_messaging/output.yaml
  • processor/elasticapmprocessor/testdata/skip_enrichment/logs_false_ecs_output.yaml
  • processor/elasticapmprocessor/testdata/skip_enrichment/logs_false_output.yaml
  • processor/elasticapmprocessor/testdata/skip_enrichment/logs_true_ecs_output.yaml
  • processor/elasticapmprocessor/testdata/skip_enrichment/metrics_false_output.yaml
💤 Files with no reviewable changes (1)
  • processor/elasticapmprocessor/processor.go

Comment thread processor/elasticapmprocessor/internal/enrichments/logs_test.go Outdated
Comment thread processor/elasticapmprocessor/internal/routing/data_stream.go
Comment thread processor/elasticapmprocessor/testdata/elastic_span_http/output.yaml Outdated

@isaacaflores2 isaacaflores2 left a comment

Contributor

The logic looks fine; we just need to ensure that some of the new changes apply only in ECS mode.

CloudProjectName = "cloud.project.name"
ContainerImageTag = "container.image.tag"
DeviceManufacturer = "device.manufacturer"
DataStreamType = "data_stream.type"

Contributor

Please note this will require some extra coordination to ensure all modules that use this package update the version of elasticattr they rely on and that a new version of elasticattr is tagged

Contributor

Please remember to change the relevant go.mod files to use the next version of the elasticattr package (it should be v0.38.0). Example commit. Without this, external modules will unfortunately face build errors.

}
cleaned := sanitize.CleanServiceName(s.serviceName)
if cleaned != s.serviceName {
resource.Attributes().PutStr(string(semconv.ServiceNameKey), cleaned)

Contributor

nitpick: Can we use attribute.PutStr to be consistent?

@lanre-ade lanre-ade Mar 17, 2026

Contributor Author

attribute.PutStr only inserts the value if one doesn't already exist. Do we not want to always sanitize the service name?

@isaacaflores2 isaacaflores2 Mar 17, 2026

Contributor

We should still sanitize. I meant to just use attribute.PutStr instead of resource.Attributes().PutStr to be consistent

Comment thread processor/elasticapmprocessor/internal/enrichments/resource.go
Comment thread processor/elasticapmprocessor/internal/routing/data_stream.go
Comment thread processor/elasticapmprocessor/testdata/elastic_span_http/output.yaml Outdated
Comment thread processor/elasticapmprocessor/internal/enrichments/resource.go Outdated
Comment thread processor/elasticapmprocessor/internal/enrichments/config/config.go Outdated
Comment thread processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_output.yaml Outdated

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
processor/elasticapmprocessor/internal/enrichments/resource_test.go (1)

35-39: ⚠️ Potential issue | 🔴 Critical

Tests fail due to unexpected deployment.environment attribute.

The ecsResourceConfig() function inherits DefaultDeploymentEnvironment.Enabled = true from config.Enabled().Resource, causing enrichment to add deployment.environment: "unset" to test outputs. Tests like host_os_type_from_os_type_windows, host_os_type_from_os_type_linux, host_os_type_from_os_type_darwin, and host_os_type_from_os_type_aix fail at line 648 because their assertions don't expect this attribute.

Either disable DefaultDeploymentEnvironment in ecsResourceConfig() or update test assertions to include the expected field.
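
The first option (disabling the default inside `ecsResourceConfig()`) might look like the sketch below. The `toggle` and `resourceConfig` types and the `enabledResourceConfig` function are simplified, hypothetical stand-ins for the real config package; only the field names mentioned in this thread are modelled.

```go
package main

import "fmt"

// toggle and resourceConfig are hypothetical, simplified stand-ins for the
// enrichment config types discussed in this review.
type toggle struct{ Enabled bool }

type resourceConfig struct {
	HostOSType                   toggle
	DefaultDeploymentEnvironment toggle
	ServiceName                  toggle
}

// enabledResourceConfig mimics an "everything enabled" default.
func enabledResourceConfig() resourceConfig {
	return resourceConfig{
		HostOSType:                   toggle{Enabled: true},
		DefaultDeploymentEnvironment: toggle{Enabled: true},
		ServiceName:                  toggle{Enabled: true},
	}
}

// ecsResourceConfig copies the defaults by value and then switches off the
// deployment.environment fallback so that tests focused on other
// attributes do not see the extra "unset" value.
func ecsResourceConfig() resourceConfig {
	cfg := enabledResourceConfig()
	cfg.DefaultDeploymentEnvironment.Enabled = false
	return cfg
}

func main() {
	cfg := ecsResourceConfig()
	fmt.Println(cfg.HostOSType.Enabled, cfg.DefaultDeploymentEnvironment.Enabled) // true false
}
```

Because the struct is copied by value, the change stays local to the test configuration and the shared defaults are untouched.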

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@processor/elasticapmprocessor/internal/enrichments/resource_test.go` around
lines 35 - 39, The tests are failing because ecsResourceConfig() inherits
DefaultDeploymentEnvironment.Enabled = true and adds
deployment.environment:"unset"; update ecsResourceConfig (the function that
returns config.ResourceConfig) to explicitly disable the default deployment
environment (set DefaultDeploymentEnvironment.Enabled = false) after copying
config.Enabled().Resource so the generated resource in tests no longer includes
deployment.environment, or alternatively update the affected test assertions to
expect the deployment.environment attribute; prefer changing ecsResourceConfig()
to set DefaultDeploymentEnvironment.Enabled = false to keep tests focused on
HostOSType behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ad5b4a5e-e3ba-4730-ad1d-5b47f748c7d8

📥 Commits

Reviewing files that changed from the base of the PR and between 16a6f68 and e973034.

📒 Files selected for processing (7)
  • internal/elasticattr/attributes.go
  • processor/elasticapmprocessor/internal/ecs/ecs_translation.go
  • processor/elasticapmprocessor/internal/enrichments/config/config.go
  • processor/elasticapmprocessor/internal/enrichments/resource.go
  • processor/elasticapmprocessor/internal/enrichments/resource_test.go
  • processor/elasticapmprocessor/processor.go
  • processor/elasticapmprocessor/testdata/ecs/intake/traces_txn_db_output.yaml

@lanre-ade lanre-ade requested a review from isaacaflores2 March 17, 2026 18:25

@inge4pres inge4pres left a comment

Contributor

LGTM, thanks Lanre 👍🏼

Does this comment still require work to be done here?

Comment thread processor/elasticapmprocessor/internal/enrichments/resource.go

@isaacaflores2 isaacaflores2 left a comment

Contributor

Thanks for the updates. I just had one comment on the elasticattr version


lanre-ade commented Mar 18, 2026

Contributor Author

LGTM, thanks Lanre 👍🏼

Does this comment still require work to be done here?

Thanks @inge4pres, not anymore. I addressed this in the subsequent commits.

@isaacaflores2 isaacaflores2 left a comment

Contributor

Thanks for the updates!

@lanre-ade lanre-ade merged commit 4ecdfd7 into elastic:main Mar 19, 2026
19 checks passed
rogercoll pushed a commit to rogercoll/opentelemetry-collector-components that referenced this pull request Mar 20, 2026
…service name values (elastic#1058)

* feat: normalize dataset values and remove restricted chars

* feat: clean service name

* feat: default agent version to unknown

* fix: apm processor tests agent version

* feat: set service environment in enrichment

* feat: default deployment environment to 'unset'

* feat: clean service name during resource enrichments

* fix: update license

* fix: golden test files, dataset sanitization

* fix: make goporto

* fix: make gogenerate && make license-update

* fix: deploymentEnvironment, serviceName sanitazion only in ecsmode

* fix: update test fixtures

* fix: disable service name and default env configs

* fix: update the failing tests and fixtures to match ECS-only behavior

* fix: bump elasticattr version

3 participants