[processor/elasticapmprocessor] set agentVersion, normalize dataset, service name values#1058
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (3)
📝 WalkthroughWalkthroughAdds a sanitize package and moves label/attribute sanitization there; introduces service name normalization and truncation utilities. Adds elasticattr constant ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment Tip You can customize the high-level summary generated by CodeRabbit.Configure the |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@processor/elasticapmprocessor/internal/enrichments/logs_test.go`:
- Line 51: The test expectation is wrong because CleanServiceName sanitizes dots
to underscores; update the expected service name string in the logs test to
match CleanServiceName's output (change the expected "my.service" to
"my_service") so the assertion for the service.name field aligns with the
CleanServiceName behavior.
In `@processor/elasticapmprocessor/internal/routing/data_stream.go`:
- Line 68: The dataset name contains dots because the sanitizer doesn't replace
'.'; update the sanitizer so NormalizeServiceName replaces '.' with '_' by
adding '.' to the switch in replaceReservedRune
(processor/elasticapmprocessor/internal/sanitize/sanitize.go) so that the
function returns '_' for '.' (aligning with other reserved rune replacements),
rebuild and ensure attributes.PutStr(elasticattr.DataStreamDataset,
"apm.app."+sanitize.NormalizeServiceName(serviceName.Str())) produces
underscores for dotted service names.
In `@processor/elasticapmprocessor/testdata/elastic_span_http/output.yaml`:
- Around line 13-18: The golden fixture for elasticapmprocessor is out of sync:
the processor output must consistently include the new default resource
attributes "agent.version" and "deployment.environment"; update either the
processor logic in elasticapmprocessor (where resource attributes are populated)
to always set these defaults (e.g., ensure the code path that builds
Resource/Attributes populates keys "agent.version" and "deployment.environment"
with the expected string values instead of omitting them), or reconcile the test
golden file(s) to match the actual default values produced by the functions that
emit resource attributes so the build-and-test for elasticapmprocessor passes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bae6ba1c-68e2-46af-8f7c-e7d2ad193cfc
📒 Files selected for processing (24)
internal/elasticattr/attributes.goprocessor/elasticapmprocessor/internal/ecs/ecs_translation.goprocessor/elasticapmprocessor/internal/enrichments/config/config.goprocessor/elasticapmprocessor/internal/enrichments/logs_test.goprocessor/elasticapmprocessor/internal/enrichments/metric_test.goprocessor/elasticapmprocessor/internal/enrichments/resource.goprocessor/elasticapmprocessor/internal/enrichments/resource_test.goprocessor/elasticapmprocessor/internal/routing/data_stream.goprocessor/elasticapmprocessor/internal/routing/data_stream_test.goprocessor/elasticapmprocessor/internal/sanitize/sanitize.goprocessor/elasticapmprocessor/processor.goprocessor/elasticapmprocessor/testdata/ecs/elastic_error/logs_otlp_exception_output.yamlprocessor/elasticapmprocessor/testdata/ecs/elastic_error/logs_output.yamlprocessor/elasticapmprocessor/testdata/ecs/elastic_error/logs_servicename_output.yamlprocessor/elasticapmprocessor/testdata/ecs/elastic_hostname/logs_output.yamlprocessor/elasticapmprocessor/testdata/elastic_span_db/output.yamlprocessor/elasticapmprocessor/testdata/elastic_span_http/output.yamlprocessor/elasticapmprocessor/testdata/elastic_span_messaging/output.yamlprocessor/elasticapmprocessor/testdata/elastic_txn_db/output.yamlprocessor/elasticapmprocessor/testdata/elastic_txn_messaging/output.yamlprocessor/elasticapmprocessor/testdata/skip_enrichment/logs_false_ecs_output.yamlprocessor/elasticapmprocessor/testdata/skip_enrichment/logs_false_output.yamlprocessor/elasticapmprocessor/testdata/skip_enrichment/logs_true_ecs_output.yamlprocessor/elasticapmprocessor/testdata/skip_enrichment/metrics_false_output.yaml
💤 Files with no reviewable changes (1)
- processor/elasticapmprocessor/processor.go
isaacaflores2
left a comment
There was a problem hiding this comment.
The logic looks fine we just need to ensure some of the new changes only apply to ECS mode
| CloudProjectName = "cloud.project.name" | ||
| ContainerImageTag = "container.image.tag" | ||
| DeviceManufacturer = "device.manufacturer" | ||
| DataStreamType = "data_stream.type" |
There was a problem hiding this comment.
Please note this will require some extra coordination to ensure all modules that use this package update the version of elasticattr they rely on and that a new version of elasticattr is tagged
There was a problem hiding this comment.
Please remember to change the relevant go.mod files to use the next version of the elasticattr package (it should be v0.38.0). Example commit. Without this external modules will face build errors unfortunately
| } | ||
| cleaned := sanitize.CleanServiceName(s.serviceName) | ||
| if cleaned != s.serviceName { | ||
| resource.Attributes().PutStr(string(semconv.ServiceNameKey), cleaned) |
There was a problem hiding this comment.
nitpick: Can we use attribute.PutStr to be consistent ?
There was a problem hiding this comment.
attribute.PutStr only inserts the value if one doesn't already exist. Do we not want to always sanitize the service name?
There was a problem hiding this comment.
We should still sanitize. I meant to just use attribute.PutStr instead of resource.Attributes().PutStr to be consistent
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
processor/elasticapmprocessor/internal/enrichments/resource_test.go (1)
35-39:⚠️ Potential issue | 🔴 CriticalTests fail due to unexpected
deployment.environmentattribute.The
ecsResourceConfig()function inheritsDefaultDeploymentEnvironment.Enabled = truefromconfig.Enabled().Resource, causing enrichment to adddeployment.environment: "unset"to test outputs. Tests likehost_os_type_from_os_type_windows,host_os_type_from_os_type_linux,host_os_type_from_os_type_darwin, andhost_os_type_from_os_type_aixfail at line 648 because their assertions don't expect this attribute.Either disable
DefaultDeploymentEnvironmentinecsResourceConfig()or update test assertions to include the expected field.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processor/elasticapmprocessor/internal/enrichments/resource_test.go` around lines 35 - 39, The tests are failing because ecsResourceConfig() inherits DefaultDeploymentEnvironment.Enabled = true and adds deployment.environment:"unset"; update ecsResourceConfig (the function that returns config.ResourceConfig) to explicitly disable the default deployment environment (set DefaultDeploymentEnvironment.Enabled = false) after copying config.Enabled().Resource so the generated resource in tests no longer includes deployment.environment, or alternatively update the affected test assertions to expect the deployment.environment attribute; prefer changing ecsResourceConfig() to set DefaultDeploymentEnvironment.Enabled = false to keep tests focused on HostOSType behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@processor/elasticapmprocessor/internal/enrichments/resource_test.go`:
- Around line 35-39: The tests are failing because ecsResourceConfig() inherits
DefaultDeploymentEnvironment.Enabled = true and adds
deployment.environment:"unset"; update ecsResourceConfig (the function that
returns config.ResourceConfig) to explicitly disable the default deployment
environment (set DefaultDeploymentEnvironment.Enabled = false) after copying
config.Enabled().Resource so the generated resource in tests no longer includes
deployment.environment, or alternatively update the affected test assertions to
expect the deployment.environment attribute; prefer changing ecsResourceConfig()
to set DefaultDeploymentEnvironment.Enabled = false to keep tests focused on
HostOSType behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ad5b4a5e-e3ba-4730-ad1d-5b47f748c7d8
📒 Files selected for processing (7)
internal/elasticattr/attributes.goprocessor/elasticapmprocessor/internal/ecs/ecs_translation.goprocessor/elasticapmprocessor/internal/enrichments/config/config.goprocessor/elasticapmprocessor/internal/enrichments/resource.goprocessor/elasticapmprocessor/internal/enrichments/resource_test.goprocessor/elasticapmprocessor/processor.goprocessor/elasticapmprocessor/testdata/ecs/intake/traces_txn_db_output.yaml
inge4pres
left a comment
There was a problem hiding this comment.
LGTM, thanks Lanre 👍🏼
Does this comment still requires work to be done here?
isaacaflores2
left a comment
There was a problem hiding this comment.
Thanks for the updates. I just had one comment on the elasticattr version
| CloudProjectName = "cloud.project.name" | ||
| ContainerImageTag = "container.image.tag" | ||
| DeviceManufacturer = "device.manufacturer" | ||
| DataStreamType = "data_stream.type" |
There was a problem hiding this comment.
Please remember to change the relevant go.mod files to use the next version of the elasticattr package (it should be v0.38.0). Example commit. Without this external modules will face build errors unfortunately
| } | ||
| cleaned := sanitize.CleanServiceName(s.serviceName) | ||
| if cleaned != s.serviceName { | ||
| resource.Attributes().PutStr(string(semconv.ServiceNameKey), cleaned) |
There was a problem hiding this comment.
We should still sanitize. I meant to just use attribute.PutStr instead of resource.Attributes().PutStr to be consistent
Thanks @inge4pres, not anymore. I addressed this in the subsequent commits. |
isaacaflores2
left a comment
There was a problem hiding this comment.
Thanks for the updates!
…service name values (elastic#1058) * feat: normalize dataset values and remove restricted chars * feat: clean service name * feat: default agent version to unknown * fix: apm processor tests agent version * feat: set service environment in enrichment * feat: default deployment environment to 'unset' * feat: clean service name during resource enrichments * fix: update license * fix: golden test files, dataset sanitization * fix: make goporto * fix: make gogenerate && make license-update * fix: deploymentEnvironment, serviceName sanitazion only in ecsmode * fix: update test fixtures * fix: disable service name and default env configs * fix: update the failing tests and fixtures to match ECS-only behavior * fix: bump elasticattr version
Addresses data discrepancies between apm-data and mOTEL apm processor
Changes
• New
sanitizepackage that centralizes dataset sanitization (truncation, restricted-char removal, label-key handling, service-name normalization) intoprocessor/elasticapmprocessor/internal/sanitize, replacing scattered in-filesanitizers.
• Adds a ServiceName resource enrichment step (enabled by default) that normalizes service names to match apm-data conventions.
• Sets
deployment.environment(alias forservice.environment) to "unset" when absent, matching the apm-data fallback.• Stops forcibly disabling agent version enrichment so
agent.versionis populated consistently.• Replaces hardcoded
data_stream.*keys with exported elasticattr constants (adds DataStreamType).