[receiver/elasticapmintake]Group events to avoid duplicate resource and scope spans by lahsivjar · Pull Request #1214 · elastic/opentelemetry-collector-components

lahsivjar · 2026-05-08T14:53:28Z

Summary

Group trace and log events that share a resource attribute set into a single
ResourceSpans / ResourceLogs per processBatch call. The grouping key is
an xxhash fingerprint of the event fields that affect the resource map.
Resource-attribute writes and the fingerprint hash are both implemented as
visitors over a single walker (mappers.WalkResourceAttributes) — adding a
new resource field is a one-place edit that both paths pick up
automatically. Metric events are not collapsed, because collapsing them would
risk duplicate metric names within a ScopeMetrics; a follow-up will handle them.
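
As a rough illustration of the grouping described in the summary, the sketch below collapses events that share a resource fingerprint into one group per batch. It is not the receiver's code: `event`, `resourceGroup`, `resourceKey` and `groupByResource` are hypothetical names, the two attribute keys are illustrative, and FNV-1a from the Go standard library stands in for xxhash so the example has no external dependency.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// event is a hypothetical, trimmed-down stand-in for an intake event.
type event struct {
	ServiceName string
	AgentName   string
	SpanName    string
}

// resourceGroup stands in for one ResourceSpans entry: one resource
// attribute map shared by many spans.
type resourceGroup struct {
	Resource map[string]string
	Spans    []string
}

// resourceKey hashes only the fields that feed the resource map.
// Assumption: FNV-1a replaces xxhash purely to keep the sketch stdlib-only.
func resourceKey(e event) uint64 {
	h := fnv.New64a()
	h.Write([]byte("service.name\x00" + e.ServiceName + "\x00"))
	h.Write([]byte("agent.name\x00" + e.AgentName + "\x00"))
	return h.Sum64()
}

// groupByResource returns one group per distinct key, in first-seen order.
// The resource map is built once per group instead of once per event.
func groupByResource(events []event) []*resourceGroup {
	index := make(map[uint64]*resourceGroup)
	var out []*resourceGroup
	for _, e := range events {
		key := resourceKey(e)
		g, ok := index[key]
		if !ok {
			g = &resourceGroup{Resource: map[string]string{
				"service.name": e.ServiceName,
				"agent.name":   e.AgentName,
			}}
			index[key] = g
			out = append(out, g)
		}
		g.Spans = append(g.Spans, e.SpanName)
	}
	return out
}

func main() {
	groups := groupByResource([]event{
		{ServiceName: "svc-a", AgentName: "go", SpanName: "GET /cart"},
		{ServiceName: "svc-a", AgentName: "go", SpanName: "SELECT *"},
		{ServiceName: "svc-b", AgentName: "go", SpanName: "POST /pay"},
	})
	for _, g := range groups {
		fmt.Println(g.Resource["service.name"], len(g.Spans))
	}
}
```

The index map lives only for the duration of one call, which mirrors the per-processBatch scope of the grouping; nothing is cached across batches in this sketch.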

Motivation

Profiling the intake hot path showed every event allocating its own
ResourceSpans/ResourceLogs plus a fresh resource attribute map, even
when consecutive events came from the same agent metadata.
pcommon.Map.PutStr boxing on identical-resource fan-out dominated
per-event allocations.

The walker pattern was added to make the change safe to maintain: keeping a
fingerprint and a resource-attribute writer manually in sync is exactly the
kind of two-list-drift bug that produces silent data loss. Events with
different values for an unhashed field get merged, and the second event's value
is dropped via Map.PutStr's update-on-existing semantics. One walker
makes the field set the single source of truth.
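
The single-walker idea can be sketched as follows. This is a simplified illustration, not the contents of resource_walker.go: `metadata`, `attrVisitor`, `walkResourceAttrs`, `mapWriter` and `hashVisitor` are hypothetical names, the attribute keys are illustrative, and FNV-1a again stands in for xxhash.

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
)

// metadata is a hypothetical subset of the agent metadata on an event.
type metadata struct {
	ServiceName string
	AgentName   string
	Environment string
}

// attrVisitor receives every resource field the walker emits.
type attrVisitor interface {
	PutStr(key, value string)
}

// walkResourceAttrs is the only place that lists resource fields.
// Adding a field here feeds both the writer and the hasher.
func walkResourceAttrs(m metadata, v attrVisitor) {
	v.PutStr("service.name", m.ServiceName)
	v.PutStr("agent.name", m.AgentName)
	v.PutStr("deployment.environment", m.Environment)
}

// mapWriter stores the visited attributes (a stand-in for pcommon.Map).
type mapWriter map[string]string

func (w mapWriter) PutStr(key, value string) { w[key] = value }

// hashVisitor folds the visited attributes into a running hash.
type hashVisitor struct{ h hash.Hash64 }

func (x hashVisitor) PutStr(key, value string) {
	x.h.Write([]byte(key))
	x.h.Write([]byte{0}) // separator so key and value cannot run together
	x.h.Write([]byte(value))
	x.h.Write([]byte{0})
}

// fingerprint runs the walker with the hashing visitor.
func fingerprint(m metadata) uint64 {
	hv := hashVisitor{h: fnv.New64a()}
	walkResourceAttrs(m, hv)
	return hv.h.Sum64()
}

func main() {
	m := metadata{ServiceName: "checkout", AgentName: "go", Environment: "prod"}
	attrs := mapWriter{}
	walkResourceAttrs(m, attrs)
	fmt.Println(len(attrs), fingerprint(m) == fingerprint(m)) // prints "3 true"
}
```

Because both visitors are driven by the same function, a field cannot be written to the resource map without also contributing to the fingerprint, which is the drift the motivation section warns about.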

Benchmark

benchstat of BenchmarkProcessBatch + BenchmarkHandleStream* (the
new direct-path bench suite this PR adds) on origin/main vs this branch.
10 runs each, -benchtime=2s, Apple M4 Pro, Go 1.25.

Allocations per op (geomean −12.20%)

                                        │  main         │  branch                     │
ProcessBatch/global_labels_no_shadow              913.0      793.0   -13.14% (p=0.000)
ProcessBatch/global_labels_with_shadow            975.0      890.0    -8.72% (p=0.000)
HandleStream/transactions                        1.437k     1.402k    -2.44% (p=0.000)
HandleStream/spans                               2.130k     1.926k    -9.58% (p=0.000)
HandleStream/transactions_spans                  1.570k     1.381k   -12.04% (p=0.000)
HandleStream/errors                              1.470k     1.364k    -7.21% (p=0.000)
HandleStream/logs                                1.359k     1.202k   -11.55% (p=0.000)
HandleStream/metricsets                           604.0      609.0    +0.83% (p=0.000)
HandleStream/histograms                           197.0      201.0    +2.03% (p=0.000)
HandleStream/metric_global_label_shadow           323.0      328.0    +1.55% (p=0.000)
HandleStreamGlobalLabels/no_shadow                731.0      610.0   -16.55% (p=0.000)
HandleStreamGlobalLabels/with_shadow              792.0      707.0   -10.73% (p=0.000)
HandleStreamSize/transactions/10                  873.0      671.0   -23.14% (p=0.000)
HandleStreamSize/transactions/100                8.227k     6.207k   -24.55% (p=0.000)
HandleStreamSize/transactions/1000               81.77k     61.57k   -24.71% (p=0.000)
HandleStreamMixed/mixed/50                       3.308k     2.648k   -19.95% (p=0.000)
HandleStreamMixed/mixed/500                      40.72k     32.46k   -20.27% (p=0.000)
geomean                                          1.761k     1.546k   -12.20%

Bytes per op (geomean −3.24%)

                                        │  main         │  branch                     │
ProcessBatch/global_labels_no_shadow             1.098Mi    1.092Mi   -0.57% (p=0.000)
ProcessBatch/global_labels_with_shadow           1.102Mi    1.097Mi   -0.46% (p=0.000)
HandleStream/transactions                        1.102Mi    1.096Mi   -0.53% (p=0.000)
HandleStream/spans                               1.173Mi    1.148Mi   -2.15% (p=0.000)
HandleStream/transactions_spans                  1.115Mi    1.099Mi   -1.46% (p=0.000)
HandleStream/errors                              1.105Mi    1.092Mi   -1.15% (p=0.000)
HandleStream/logs                                1.105Mi    1.086Mi   -1.68% (p=0.000)
HandleStream/metricsets                          1.046Mi    1.047Mi   +0.01% (p=0.000)
HandleStream/histograms                          1.016Mi    1.016Mi   +0.01% (p=0.000)
HandleStream/metric_global_label_shadow          1.027Mi    1.027Mi   +0.03% (p=0.000)
HandleStreamGlobalLabels/no_shadow               1.072Mi    1.064Mi   -0.75% (p=0.000)
HandleStreamGlobalLabels/with_shadow             1.075Mi    1.069Mi   -0.55% (p=0.000)
HandleStreamSize/transactions/10                 1.085Mi    1.069Mi   -1.40% (p=0.000)
HandleStreamSize/transactions/100                1.805Mi    1.653Mi   -8.42% (p=0.000)
HandleStreamSize/transactions/1000               9.005Mi    7.488Mi  -16.85% (p=0.000)
HandleStreamMixed/mixed/50                       1.306Mi    1.256Mi   -3.85% (p=0.000)
HandleStreamMixed/mixed/500                      4.775Mi    4.145Mi  -13.19% (p=0.000)
geomean                                          1.397Mi    1.352Mi   -3.24%

Time per op (geomean −0.92%)

                                        │  main        │  branch                      │
ProcessBatch/global_labels_no_shadow             197.7µ     203.3µ        ~  (p=0.280)
ProcessBatch/global_labels_with_shadow           210.2µ     217.6µ    +3.52% (p=0.000)
HandleStream/transactions                        137.5µ     142.6µ    +3.65% (p=0.000)
HandleStream/spans                               149.0µ     153.7µ    +3.13% (p=0.000)
HandleStream/transactions_spans                  140.5µ     143.1µ    +1.80% (p=0.000)
HandleStream/errors                              134.0µ     135.1µ    +0.83% (p=0.007)
HandleStream/logs                                129.5µ     127.4µ    -1.62% (p=0.002)
HandleStream/metricsets                          89.18µ     93.48µ    +4.82% (p=0.000)
HandleStream/histograms                          57.22µ     56.74µ        ~  (p=0.063)
HandleStream/metric_global_label_shadow          66.83µ     68.37µ    +2.30% (p=0.000)
HandleStreamGlobalLabels/no_shadow               89.93µ     90.25µ        ~  (p=0.363)
HandleStreamGlobalLabels/with_shadow             93.80µ     96.93µ    +3.34% (p=0.002)
HandleStreamSize/transactions/10                 97.25µ     95.97µ    -1.32% (p=0.003)
HandleStreamSize/transactions/100                384.6µ     344.0µ   -10.55% (p=0.000)
HandleStreamSize/transactions/1000               3.190m     2.917m    -8.54% (p=0.000)
HandleStreamMixed/mixed/50                       190.0µ     183.3µ    -3.54% (p=0.000)
HandleStreamMixed/mixed/500                      1.645m     1.425m   -13.36% (p=0.000)
geomean                                          180.2µ     178.6µ    -0.92%

Throughput (B/s) for HandleStream* benches (geomean +1.47%)

                                        │  main         │  branch                     │
HandleStreamSize/transactions/100              58.13Mi/s   64.98Mi/s  +11.79% (p=0.000)
HandleStreamSize/transactions/1000             69.17Mi/s   75.63Mi/s   +9.33% (p=0.000)
HandleStreamMixed/mixed/50                     46.41Mi/s   48.11Mi/s   +3.67% (p=0.000)
HandleStreamMixed/mixed/500                    64.71Mi/s   74.69Mi/s  +15.42% (p=0.000)
geomean                                        34.16Mi/s   34.67Mi/s   +1.47%

Notes:

The largest wins concentrate on size-sweep and mixed workloads (the path
the collapse optimises). On 1000 transactions: −24.7% allocs, −16.9%
B/op, −8.5% ns/op, +9.3% throughput.
Metrics-only benches (metricsets, histograms,
metric_global_label_shadow) sit within roughly ±2% on allocs (+0.83% to
+2.03%), as expected, because the metric path is intentionally not collapsed.
The small ns/op regressions on per-event-type benches (e.g.
HandleStream/transactions +3.65%) come from the walker's visitor
interface dispatch and the labels-sort pass; they're real but small, and
the alloc reduction more than compensates on workloads where multiple
events actually share a resource (i.e. anything beyond a 5-event
fixture).


coderabbitai · 2026-05-08T14:58:32Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 3a49b525-0f75-4452-b8a9-cb74e813232b

📥 Commits

Reviewing files that changed from the base of the PR and between edd2856 and 8ac2ab6.

📒 Files selected for processing (9)

receiver/elasticapmintakereceiver/go.mod
receiver/elasticapmintakereceiver/internal/mappers/resource_walker.go
receiver/elasticapmintakereceiver/resource_grouping.go
receiver/elasticapmintakereceiver/resource_grouping_test.go
receiver/elasticapmintakereceiver/testdata/spans_expected.yaml
receiver/elasticapmintakereceiver/testdata/spans_representative_count_expected.yaml
receiver/elasticapmintakereceiver/testdata/transactions_expected.yaml
receiver/elasticapmintakereceiver/testdata/transactions_spans_expected.yaml
receiver/elasticapmintakereceiver/testdata/unknown-span-type_expected.yaml

💤 Files with no reviewable changes (1)

receiver/elasticapmintakereceiver/testdata/spans_representative_count_expected.yaml

✅ Files skipped from review due to trivial changes (1)

receiver/elasticapmintakereceiver/go.mod

🚧 Files skipped from review as they are similar to previous changes (2)

receiver/elasticapmintakereceiver/resource_grouping.go
receiver/elasticapmintakereceiver/testdata/transactions_expected.yaml


🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


lahsivjar · 2026-05-08T15:41:13Z

[For reviewers] I have intentionally left metrics out of the grouping optimizations, as grouping them is not straightforward.


Copilot

Pull request overview

This PR reduces per-event allocations in the Elastic APM intake receiver by grouping trace and log events that share the same resource attributes into a single ResourceSpans / ResourceLogs per processBatch call, using an xxhash-based resource fingerprint derived from a shared resource-attribute walker.

Changes:

Added per-batch resource grouping (signalGroups) and a stable resource fingerprint, reusing ScopeSpans / ScopeLogs for identical resources.
Introduced mappers.WalkResourceAttributes as the single source of truth for both resource attribute writes and fingerprinting.
Added/updated benchmarks and updated golden testdata YAML outputs to reflect the new grouping behavior.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
receiver/elasticapmintakereceiver/receiver.go	Uses resource fingerprint + per-batch caches to group trace/log events by resource; routes span/log conversion via cached scopes; switches resource attribute mapping to the walker visitor.
receiver/elasticapmintakereceiver/resource_grouping.go	Implements `signalGroups` caches and the xxhash(v2) resource fingerprint visitor.
receiver/elasticapmintakereceiver/resource_grouping_test.go	Adds a unit test ensuring numeric label float values don’t incorrectly merge resources.
receiver/elasticapmintakereceiver/internal/mappers/resource_walker.go	Adds `WalkResourceAttributes` walker + `ResourceAttrVisitor` to drive both hashing and pcommon writes from one field list.
receiver/elasticapmintakereceiver/internal/mappers/intakeV2ToSemConv.go	Removes the old resource-attribute translation function in favor of the new walker-based approach.
receiver/elasticapmintakereceiver/internal/mappers/intakeV2ToElasticSpecificFields.go	Removes elastic-specific resource attribute mapping + label mapping (now handled by the walker).
receiver/elasticapmintakereceiver/internal/mappers/intakeV2ToDerivedFields.go	Removes derived resource attributes for agent name/version (now handled by the walker).
receiver/elasticapmintakereceiver/receiver_bench_test.go	Adds direct-path `HandleStream*` benchmark suite and synthetic payload generators.
receiver/elasticapmintakereceiver/go.mod	Promotes `github.com/cespare/xxhash/v2` to a direct dependency.
receiver/elasticapmintakereceiver/testdata/unknown-span-type_expected.yaml	Updates expected output to reflect grouped resource spans and reordered resource attributes.
receiver/elasticapmintakereceiver/testdata/transactions_spans_expected.yaml	Updates expected output to reflect grouped resource spans and reordered resource attributes.
receiver/elasticapmintakereceiver/testdata/transactions_expected.yaml	Updates expected output to reflect grouped resource spans and reordered resource attributes.
receiver/elasticapmintakereceiver/testdata/spans_representative_count_expected.yaml	Updates expected output to reflect resource grouping (removes duplicated resources).
receiver/elasticapmintakereceiver/testdata/spans_expected.yaml	Updates expected output to reflect grouped resource spans and reordered resource attributes.
receiver/elasticapmintakereceiver/testdata/span-links_expected.yaml	Updates expected output to reflect resource grouping (removes duplicated resources).
receiver/elasticapmintakereceiver/testdata/logs_expected.yaml	Updates expected output to reflect grouped resource logs and reordered resource attributes/log records.
receiver/elasticapmintakereceiver/testdata/invalid_ids_expected.yaml	Updates expected output to reflect resource grouping (removes duplicated resources).
receiver/elasticapmintakereceiver/testdata/hostdata_expected.yaml	Updates expected output to reflect resource grouping/reordering of resource attributes.
receiver/elasticapmintakereceiver/testdata/errors_expected.yaml	Updates expected output to reflect grouped resource logs and reordered resource attributes/log records.


+// Including the key in the hash makes write order irrelevant for fields
+// the visitor sees as Put*(key, value) — re-ordering the walker visits
+// would produce the same fingerprint for the same set of attributes.
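
The excerpt above does not show how the visitor combines fields, so the following is only one way such an order-independence property can be obtained: hash each key/value pair on its own and combine the per-pair hashes with a commutative operation. `fieldHash` and `orderFreeHasher` are hypothetical names, and FNV-1a stands in for xxhash.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fieldHash hashes one key/value pair on its own. The zero-byte separator
// keeps ("ab", "c") distinct from ("a", "bc").
func fieldHash(key, value string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h.Write([]byte{0})
	h.Write([]byte(value))
	return h.Sum64()
}

// orderFreeHasher accumulates per-field hashes with wrapping addition.
// Addition is commutative, so visit order does not change the result.
type orderFreeHasher struct{ sum uint64 }

func (o *orderFreeHasher) PutStr(key, value string) {
	o.sum += fieldHash(key, value)
}

func main() {
	var a, b orderFreeHasher
	a.PutStr("service.name", "checkout")
	a.PutStr("agent.name", "go")

	// Same attributes, visited in the opposite order.
	b.PutStr("agent.name", "go")
	b.PutStr("service.name", "checkout")

	fmt.Println(a.sum == b.sum) // prints "true"
}
```

Including the key in each per-pair hash is what keeps two fields with swapped values from producing the same fingerprint. A commutative combination is weaker than feeding all fields through one sequential hash, so this is a trade-off rather than a free property.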


+				fmt.Sprintf("tx%014x", base+uint64(i)),
+				fmt.Sprintf("tx%014xtx%014x", base+uint64(i), base+uint64(i)),
+				1_000_000+(base+uint64(i))*1_000,
+			)
+		}
+		for i := range 8 {
+			fmt.Fprintf(&buf,
+				`{"span": {"id": %q, "trace_id": %q, "transaction_id": %q, "parent_id": %q, "name": "SELECT *", "type": "db.postgresql.query", "start": 1, "duration": 2, "timestamp": %d}}`+"\n",
+				fmt.Sprintf("sp%014x", base+uint64(i)),
+				fmt.Sprintf("tx%014xtx%014x", base+uint64(i), base+uint64(i)),
+				fmt.Sprintf("tx%014x", base+uint64(i)),
+				fmt.Sprintf("tx%014x", base+uint64(i)),
+				1_000_000+(base+uint64(i))*1_000+1,
+			)
+		}
+		fmt.Fprintf(&buf,
+			`{"error": {"id": %q, "trace_id": %q, "transaction_id": %q, "parent_id": %q, "timestamp": %d, "log": {"message": "boom"}}}`+"\n",
+			fmt.Sprintf("er%014x", base),
+			fmt.Sprintf("tx%014xtx%014x", base, base),
+			fmt.Sprintf("tx%014x", base),
+			fmt.Sprintf("tx%014x", base),


carsonip

lgtm thanks, the approach is sound. There is a risk of hash collision, as discussed during our private sync, but the risk is low.

carsonip · 2026-05-22T11:43:51Z

+			if k == "" || nv == nil {
+				continue
+			}
+			v.PutDouble("numeric_labels."+k, nv.Value)


q: not related to this PR but is it true that numeric labels always only use .Value, not .Values? Asking because there is .Values handling in apm-data for numeric labels.

If this turns out to be a bug I'm happy to defer it in a different PR to keep this PR clean.

lahsivjar · 2026-05-26

Intake v2 will always produce only .Value; input/elasticapm/internal/modeldecoder always decodes into .Value. I think .Values is used only for OTel via APM, which this receiver doesn't need to deal with.
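
For the numeric-label hunk quoted above, a hashing visitor needs to feed the float value into the fingerprint without losing precision. The sketch below is an assumption about how that could look, not the PR's code: `numericLabel` and `hashNumericLabels` are hypothetical names, keys are sorted for a deterministic order, and the exact IEEE 754 bit pattern from `math.Float64bits` is hashed so that 1.0 and 1.5 cannot fold into the same value the way an integer conversion would.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"math"
	"sort"
)

// numericLabel mirrors the shape in the hunk above: a single Value.
type numericLabel struct{ Value float64 }

// hashNumericLabels fingerprints numeric labels. Empty keys and nil
// entries are skipped, matching the guard in the quoted diff.
// Assumption: FNV-1a stands in for xxhash.
func hashNumericLabels(labels map[string]*numericLabel) uint64 {
	keys := make([]string, 0, len(labels))
	for k, nv := range labels {
		if k == "" || nv == nil {
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stability

	h := fnv.New64a()
	var buf [8]byte
	for _, k := range keys {
		h.Write([]byte("numeric_labels." + k))
		h.Write([]byte{0})
		// Hash the raw bits so every distinct float64 hashes distinct input.
		binary.LittleEndian.PutUint64(buf[:], math.Float64bits(labels[k].Value))
		h.Write(buf[:])
	}
	return h.Sum64()
}

func main() {
	a := map[string]*numericLabel{"retries": {Value: 1.0}}
	b := map[string]*numericLabel{"retries": {Value: 1.5}}
	fmt.Println(hashNumericLabels(a) == hashNumericLabels(b)) // prints "false"
}
```

Note that hashing raw bits treats 0.0 and -0.0 as different values and distinguishes NaN payloads; whether that matters for resource grouping is outside what this thread discusses.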

[receiver/elasticapmintake]Group events to avoid duplicate resource a…

b43aaa0

…nd scope spans

lahsivjar requested review from a team as code owners May 8, 2026 14:53

make gotidy

9fcbd2f

make gogenerate && make license-update

9bf3d3f

lahsivjar requested review from axw and vigneshshanmugam May 8, 2026 21:27

lahsivjar added 8 commits May 12, 2026 12:26

WIP

c241d68

Merge branch 'main' into optimize-resource-spans

6042070

Revert "WIP"

962a76c

This reverts commit c241d68.

Merge branch 'main' into optimize-resource-spans

edd2856

Update tests based on grouping changes

d3ac890

handle floats better

a5198d3

Merge remote-tracking branch 'origin/main' into optimize-resource-spans

1180891

minor label priority

8ac2ab6

carsonip requested a review from Copilot May 22, 2026 11:37

Copilot started reviewing on behalf of carsonip May 22, 2026 11:37

Copilot AI reviewed May 22, 2026


carsonip approved these changes May 22, 2026


lahsivjar merged commit 6937934 into elastic:main May 26, 2026
19 checks passed

lahsivjar deleted the optimize-resource-spans branch May 26, 2026 17:10

lahsivjar mentioned this pull request May 26, 2026

[receiver/elasticapm]Fix benchmarks to produce correct IDs #1235

Merged

lahsivjar merged 11 commits into elastic:main from lahsivjar:optimize-resource-spans
