Fix ECS label attribute handling in TranslateResourceMetadata and Remap* functions #1226

Merged
lahsivjar merged 3 commits into elastic:main from lahsivjar:fix-panic
May 18, 2026
Conversation

@lahsivjar lahsivjar commented May 18, 2026

Contributor

Bugs fixed

1. Panic in Remap* functions during map iteration

RemapLogRecordAttributesToECSLabels (and the equivalent span/metric variants) used Range to iterate the attribute map while inserting new keys into it. This caused a panic for three cases: map/bytes/empty-typed attribute values (removed with no compensating insert), a plain attribute whose sanitized destination key already existed, and two attributes whose keys both sanitize to the same output key. Fixed by switching to RemoveIf with deferred insertions collected into a slice and applied after iteration completes.
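
The deferred-insert pattern described above can be sketched without the pdata dependency. This is a minimal illustration, not the processor's code: `attr`, `removeIf`, `put`, and `remapToLabels` are hypothetical stand-ins for the attribute map and its RemoveIf-style API, and key sanitization is reduced to replacing dots with underscores.

```go
package main

import "strings"

// attr is a hypothetical stand-in for one entry of an attribute map.
type attr struct {
	key, val string
}

// removeIf drops every entry for which pred returns true. pred only
// inspects the entry it is given and never mutates the slice itself.
func removeIf(attrs []attr, pred func(attr) bool) []attr {
	kept := attrs[:0]
	for _, a := range attrs {
		if !pred(a) {
			kept = append(kept, a)
		}
	}
	return kept
}

// put upserts key, so a later write to the same key overwrites the
// earlier one (last writer wins).
func put(attrs []attr, key, val string) []attr {
	for i := range attrs {
		if attrs[i].key == key {
			attrs[i].val = val
			return attrs
		}
	}
	return append(attrs, attr{key, val})
}

// remapToLabels moves every attribute under "labels.". New entries are
// collected in a separate slice during the removal pass and inserted
// only after that pass has finished.
func remapToLabels(attrs []attr) []attr {
	var deferred []attr
	attrs = removeIf(attrs, func(a attr) bool {
		// Assumption: sanitization only replaces dots with underscores.
		key := "labels." + strings.ReplaceAll(a.key, ".", "_")
		deferred = append(deferred, attr{key, a.val})
		return true // the original key is always removed
	})
	for _, d := range deferred {
		attrs = put(attrs, d.key, d.val)
	}
	return attrs
}
```

Because insertions happen only after the removal pass, two source keys that sanitize to the same label simply overwrite each other in insertion order instead of changing the collection while it is being walked.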

2. APM intake path double-prefixing labels.* resource attributes

The APM intake receiver places labels.<key> and numeric_labels.<key> directly on resource attributes (already sanitized). TranslateResourceMetadata was treating these as raw unsupported keys and fully re-normalizing them, producing labels.labels_my_value instead of labels.my_value. Fixed by adding a sanitizeExistingLabels bool parameter: the APM enricher path (telemetry.sdk.name == "ElasticAPM") passes true, which sanitizes only the suffix of existing labels.*/numeric_labels.* keys; the OTel path passes false and retains the existing full re-normalization behaviour.
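
The difference between the two modes can be illustrated with a small key-mapping function. This is a sketch under stated assumptions: `labelKey` and `sanitizeKey` are hypothetical names, sanitization is reduced to replacing dots with underscores, and value types are ignored, so the OTel branch always targets `labels.` here.

```go
package main

import "strings"

// sanitizeKey is a simplified, hypothetical sanitizer: it only replaces
// dots with underscores.
func sanitizeKey(k string) string {
	return strings.ReplaceAll(k, ".", "_")
}

// labelKey returns the output key for an unsupported resource attribute.
// With sanitizeExistingLabels set (APM intake path), a key that already
// carries a labels./numeric_labels. prefix keeps the prefix and only its
// suffix is sanitized. Otherwise the whole key is treated as raw input
// and re-normalized under "labels.".
func labelKey(key string, sanitizeExistingLabels bool) string {
	if sanitizeExistingLabels {
		for _, prefix := range []string{"labels.", "numeric_labels."} {
			if suffix, ok := strings.CutPrefix(key, prefix); ok {
				return prefix + sanitizeKey(suffix)
			}
		}
	}
	return "labels." + sanitizeKey(key)
}
```

Under these assumptions, `labelKey("labels.my_value", true)` keeps `labels.my_value`, while `labelKey("labels.my_value", false)` yields `labels.labels_my_value`, which is the double-prefixing the fix removes from the APM intake path.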

Validation

Both behaviours are validated against apm-data (input/otlp/traces_test.go): apm-data silently drops map-, bytes-, and empty-typed attributes and applies last-writer-wins on key collisions, which matches the processor's behaviour after this fix.

@lahsivjar lahsivjar changed the title Reproduce panic bug Fix ECS label attribute handling in TranslateResourceMetadata and Remap* functions May 18, 2026
Comment on lines +54 to +56
// When false (OTel path), all unsupported attributes — including any that
// already carry a labels.* prefix — are treated as raw keys and re-normalized
// from scratch.

Contributor Author

[For reviewers] This is validated in apm-data, see https://github.com/lahsivjar/apm-data/pull/new/mOTLP-test

- key: labels.unsupported_key
  value:
    stringValue: foo
- key: labels.labels_my_value

Contributor Author

[For reviewers] The double normalization (labels.labels_*) is expected; see the prior comment about apm-data behaviour for MIS OTel mode. I didn't change the input data, so this emulates OTel data sent to MIS with attributes set as labels.xxx.

-	if truncated != value.Str() {
-		attributes.PutStr(key, truncated)
+	if truncated := TruncateToECSMaxLength(value.Str()); truncated != value.Str() {
+		value.SetStr(truncated)

Contributor Author

[For reviewers] We only change the value here, which is safe. The previous code was not, because it inserted a new attribute while iterating over the pcommon.Map.

Comment on lines +391 to +471
{
	// Map-typed attributes have no supported label representation and are silently dropped.
	name: "map-typed attribute dropped",
	setAttrs: func(attrs pcommon.Map) {
		for _, k := range []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"} {
			attrs.PutStr(k, "v")
		}
		attrs.PutEmptyMap("nested.attr")
	},
	want: map[string]any{
		"labels.a0": "v", "labels.a1": "v", "labels.a2": "v",
		"labels.a3": "v", "labels.a4": "v", "labels.a5": "v",
		"labels.a6": "v", "labels.a7": "v", "labels.a8": "v",
		"labels.a9": "v",
	},
	wantAbsent: []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9", "nested.attr"},
},
{
	// Two attributes that reduce to the same sanitized key both map to the same
	// output label; the second insertion overwrites the first (last writer wins).
	name: "key collision last writer wins",
	setAttrs: func(attrs pcommon.Map) {
		for _, k := range []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"} {
			attrs.PutStr(k, "v")
		}
		attrs.PutStr("foo.bar", "first")
		attrs.PutStr("foo_bar", "second") // both sanitize to "labels.foo_bar"
	},
	want: map[string]any{
		"labels.a0": "v", "labels.a1": "v", "labels.a2": "v",
		"labels.a3": "v", "labels.a4": "v", "labels.a5": "v",
		"labels.a6": "v", "labels.a7": "v", "labels.a8": "v",
		"labels.foo_bar": "second",
	},
	wantAbsent: []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "foo.bar", "foo_bar"},
},
{
	// "labels.foo.bar" and "labels.foo_bar" both sanitize to the same output key
	// "labels.labels_foo_bar". Last writer wins.
	name: "labels-prefixed key collision last writer wins",
	setAttrs: func(attrs pcommon.Map) {
		for _, k := range []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"} {
			attrs.PutStr(k, "v")
		}
		attrs.PutStr("labels.foo.bar", "dotted")
		attrs.PutStr("labels.foo_bar", "plain") // both sanitize to "labels.labels_foo_bar"
	},
	want: map[string]any{
		"labels.a0": "v", "labels.a1": "v", "labels.a2": "v",
		"labels.a3": "v", "labels.a4": "v", "labels.a5": "v",
		"labels.a6": "v", "labels.a7": "v", "labels.a8": "v",
		"labels.labels_foo_bar": "plain",
	},
	wantAbsent: []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "labels.foo.bar", "labels.foo_bar"},
},
{
	// Regression: mutating pcommon.Map inside Range caused an index-out-of-bounds panic.
	// With uniform key sanitization, plain "a0" → "labels.a0" and "labels.a0" →
	// "labels.labels_a0": no collision, both produce distinct output keys.
	name: "no panic with 11 attrs including a labels-prefixed key",
	setAttrs: func(attrs pcommon.Map) {
		for _, k := range []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"} {
			attrs.PutStr(k, "v")
		}
		attrs.PutStr("labels.a0", "existing")
	},
	want: map[string]any{
		"labels.a0":        "v",
		"labels.labels_a0": "existing",
		"labels.a1":        "v",
		"labels.a2":        "v",
		"labels.a3":        "v",
		"labels.a4":        "v",
		"labels.a5":        "v",
		"labels.a6":        "v",
		"labels.a7":        "v",
		"labels.a8":        "v",
		"labels.a9":        "v",
	},
	wantAbsent: []string{"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9"},
},

Contributor Author

[For reviewers] These are the test cases that would fail or panic without this fix.

@lahsivjar lahsivjar marked this pull request as ready for review May 18, 2026 20:08
@lahsivjar lahsivjar requested review from a team as code owners May 18, 2026 20:08
coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 8076c8b9-47a0-4640-aa40-de062f58af01

📥 Commits

Reviewing files that changed from the base of the PR and between 1dd8b51 and 3403a93.

📒 Files selected for processing (27)
  • processor/elasticapmprocessor/internal/ecs/ecs_translation.go
  • processor/elasticapmprocessor/internal/ecs/ecs_translation_test.go
  • processor/elasticapmprocessor/internal/ecs/util.go
  • processor/elasticapmprocessor/internal/enrichments/enricher.go
  • processor/elasticapmprocessor/internal/enrichments/log_enricher.go
  • processor/elasticapmprocessor/internal/enrichments/metric_enricher.go
  • processor/elasticapmprocessor/internal/enrichments/metric_test.go
  • processor/elasticapmprocessor/internal/enrichments/resource.go
  • processor/elasticapmprocessor/internal/enrichments/resource_test.go
  • processor/elasticapmprocessor/internal/enrichments/trace_enricher.go
  • processor/elasticapmprocessor/internal/routing/data_stream.go
  • processor/elasticapmprocessor/internal/sanitize/sanitize.go
  • processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_otlp_exception_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_error/logs_servicename_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_log/output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_metric/output.yaml
  • processor/elasticapmprocessor/testdata/ecs/elastic_span_db/output.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/logs_error_input.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/logs_input.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/logs_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/metrics_input.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/metrics_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/traces_input.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/traces_output.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/traces_txn_db_input.yaml
  • processor/elasticapmprocessor/testdata/ecs/intake/traces_txn_db_output.yaml
💤 Files with no reviewable changes (1)
  • processor/elasticapmprocessor/internal/sanitize/sanitize.go

📝 Walkthrough

This PR refactors ECS attribute translation in the APM processor by introducing a deferred-write pattern and adding a conditional label sanitization strategy. The main change adds a sanitizeExistingLabels boolean flag to TranslateResourceMetadata, allowing APM-intake mode to sanitize only the suffixes of existing labels.*/numeric_labels.* keys rather than fully re-normalizing them. The translation pipeline switches from in-place mutation to a RemoveIf predicate + deferred toAppend buffer pattern across all remap functions. Simultaneously, the centralized sanitize package is removed and its logic localized into ecs, resource, and data_stream modules. A new util.go defines TruncateToECSMaxLength with a 1024-rune keyword maximum. All enrichers (log, metric, trace) are updated to carry and propagate the flag, with APM enrichers enabling it and OTel enrichers disabling it. Test fixtures are updated to reflect the label re-prefixing behavior where unsupported attributes become labels.labels_* or numeric_labels.numeric_labels_*.
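
Truncating to a rune limit rather than a byte limit matters for multi-byte UTF-8 values. The following is only a sketch of one way to do this; the body of TruncateToECSMaxLength is not shown in this conversation, and `truncateRunes` and `maxKeywordRunes` are hypothetical names used for illustration.

```go
package main

// maxKeywordRunes mirrors the 1024-rune keyword maximum mentioned above.
const maxKeywordRunes = 1024

// truncateRunes returns s cut to at most limit runes. It never splits a
// multi-byte UTF-8 sequence because it cuts at rune start offsets.
func truncateRunes(s string, limit int) string {
	// A string with at most limit bytes cannot have more than limit runes.
	if len(s) <= limit {
		return s
	}
	n := 0
	for i := range s { // i is the byte offset at which each rune starts
		if n == limit {
			return s[:i]
		}
		n++
	}
	return s
}
```

Returning the original string unchanged when no cut is needed lets a caller compare the result with the input and write back only when they differ, as the diff in this PR does with `value.SetStr(truncated)`.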

🚥 Pre-merge checks: ✅ 2 passed

Check name | Status | Explanation
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"



@vigneshshanmugam vigneshshanmugam left a comment

Member

LGTM, thanks for the detailed test cases; they made the review easier.

)

// kv is a key-value pair for collecting deferred attribute insertions after a RemoveIf pass.
type kv struct {

Member

nit: can't we use otlpcommon.KeyValue? I haven't checked it though, just asking.

Contributor Author

That's from the proto package (and protocommon.KeyValue is also a proto struct), I don't think we need that heavy dependency just for this purpose. Happy to discuss further - can do a followup if we decide later.

return kv{}
}
switch slice.At(0).Type() {
// TODO(lahsivjar): Can we assume all are same type and just use pcommon.Value#CopyTo?

Member

++ was wondering the same as the logic seems to be the same for everything.

Contributor Author

Yeah, I am not sure either. I didn't want to make too drastic a change to start with, so I will review this a bit later.

Member

Sounds good.

@lahsivjar lahsivjar merged commit 8ccca8c into elastic:main May 18, 2026
19 checks passed
@lahsivjar lahsivjar deleted the fix-panic branch May 18, 2026 22:14