[processor/elasticapmprocessor] Fix otlp span discrepancies#1163
[processor/elasticapmprocessor] Fix otlp span discrepancies#1163lanre-ade wants to merge 18 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new exported function Possibly related PRs
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
processor/elasticapmprocessor/internal/enrichments/span.go (1)
560-608:⚠️ Potential issue | 🟠 MajorDo not override explicit
peer.servicewith URL-derived HTTP values.Both
setServiceTargetandsetDestinationServiceseed their output froms.peerService, but the new HTTP path overwrites that wheneverhttpDestinationDetails()succeeds. This regresses the existing precedence for spans that already providepeer.service, and will rewrite service target / destination resource unexpectedly.Suggested fix
case s.isHTTP: targetType = "http" - if details, ok := s.httpDestinationDetails(); ok && details.serviceTargetName != "" { + if targetName == "" { + if details, ok := s.httpDestinationDetails(); ok && details.serviceTargetName != "" { + targetName = details.serviceTargetName + } else if resource := getHostPort( + s.urlFull, s.urlDomain, s.urlPort, + s.serverAddress, s.serverPort, // fallback + ); resource != "" { + targetName = resource + } + } - targetName = details.serviceTargetName - } else if resource := getHostPort( - s.urlFull, s.urlDomain, s.urlPort, - s.serverAddress, s.serverPort, // fallback - ); resource != "" { - targetName = resource - } }case s.isRPC, s.isHTTP: if s.isHTTP { if details, ok := s.httpDestinationDetails(); ok { attribute.PutNonEmptyStr(span.Attributes(), "destination.address", details.destinationAddress) if details.destinationPort > 0 { attribute.PutInt(span.Attributes(), "destination.port", details.destinationPort) } attribute.PutNonEmptyStr(span.Attributes(), "url.original", details.urlOriginal) attribute.PutNonEmptyStr(span.Attributes(), elasticattr.SpanDestinationServiceName, details.spanDestinationServiceName) attribute.PutStr(span.Attributes(), elasticattr.SpanDestinationServiceType, "external") - destnResource = details.spanDestinationServiceResource + if destnResource == "" { + destnResource = details.spanDestinationServiceResource + } break } }Also applies to: 611-661
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processor/elasticapmprocessor/internal/enrichments/span.go` around lines 560 - 608, The HTTP branch in spanEnrichmentContext.setServiceTarget currently overwrites an explicitly provided s.peerService with URL-derived values; update the logic so targetName is only taken from httpDestinationDetails() or getHostPort(...) when s.peerService is empty (i.e., preserve s.peerService precedence). Apply the same change to the analogous logic in setDestinationService (the code around httpDestinationDetails(), getHostPort(), and assignments to targetName/resource) so neither method overwrites an existing peer.service when populating service target or destination attributes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@processor/elasticapmprocessor/internal/enrichments/span.go`:
- Around line 127-129: The span attribute translator is removing keys needed by
enrichment (processor.event and event.outcome) before enrichment checks; either
move the call to ecs.TranslateSpanAttributes(span.Attributes()) so it runs after
enrichment logic that calls isElasticTransaction and setEventOutcome, or update
the translator to whitelist/skip those keys (processor.event, event.outcome) so
they remain available for enrichment; change the code around
cfg.Span.TranslateUnsupportedAttributes.Enabled to defer translation until after
enrichment or adjust ecs.TranslateSpanAttributes to ignore those specific
attribute names.
---
Outside diff comments:
In `@processor/elasticapmprocessor/internal/enrichments/span.go`:
- Around line 560-608: The HTTP branch in spanEnrichmentContext.setServiceTarget
currently overwrites an explicitly provided s.peerService with URL-derived
values; update the logic so targetName is only taken from
httpDestinationDetails() or getHostPort(...) when s.peerService is empty (i.e.,
preserve s.peerService precedence). Apply the same change to the analogous logic
in setDestinationService (the code around httpDestinationDetails(),
getHostPort(), and assignments to targetName/resource) so neither method
overwrites an existing peer.service when populating service target or
destination attributes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 7a7b9cec-2954-494e-8eed-7750f91e4ea3
📒 Files selected for processing (10)
processor/elasticapmprocessor/internal/ecs/ecs_translation.goprocessor/elasticapmprocessor/internal/ecs/ecs_translation_test.goprocessor/elasticapmprocessor/internal/enrichments/config/config.goprocessor/elasticapmprocessor/internal/enrichments/enricher.goprocessor/elasticapmprocessor/internal/enrichments/span.goprocessor/elasticapmprocessor/internal/enrichments/span_test.goprocessor/elasticapmprocessor/processor.goprocessor/elasticapmprocessor/processor_test.goprocessor/elasticapmprocessor/testdata/ecs/elastic_hostname/spans_output.yamlprocessor/elasticapmprocessor/testdata/elastic_span_http/output.yaml
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
processor/elasticapmprocessor/internal/enrichments/span.go (1)
127-129:⚠️ Potential issue | 🟠 MajorPreserve
processor.eventandevent.outcomeuntil enrichment finishes.
ecs.TranslateSpanAttributesstill runs beforeisElasticTransactionandsetEventOutcome, but the translator does not preserve those keys. With translation enabled, explicitly marked transactions can be reclassified as spans, and a pre-setevent.outcome=unknownis lost before Line 467 checks it. Please move translation after enrichment or whitelist those attrs in the translator.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processor/elasticapmprocessor/internal/enrichments/span.go` around lines 127 - 129, The translation of unsupported attributes (cfg.Span.TranslateUnsupportedAttributes / ecs.TranslateSpanAttributes(span.Attributes())) runs too early and can overwrite processor.event and event.outcome before enrichment (isElasticTransaction and setEventOutcome) finishes; move the ecs.TranslateSpanAttributes call so it executes after enrichment and after isElasticTransaction and setEventOutcome have run, or alternatively update the translator to whitelist/preserve the keys "processor.event" and "event.outcome" so they are not removed—locate the span handling flow where isElasticTransaction and setEventOutcome are invoked and relocate or adjust the translation step accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@processor/elasticapmprocessor/internal/enrichments/span.go`:
- Around line 247-249: The current logic unconditionally calls
s.removeHTTPSourceAttributes when
cfg.Span.TranslateUnsupportedAttributes.Enabled and s.isHTTP, which can delete
http.url/url.* before any fallback fields are written; change the flow so
removal only happens when the replacement fields will actually be emitted—either
by checking cfg.Span.DestinationService.Enabled before calling
s.removeHTTPSourceAttributes, or by moving the call to
removeHTTPSourceAttributes to after setDestinationService has run and confirmed
it emitted url.original/destination fields; update the conditions around
cfg.Span.TranslateUnsupportedAttributes.Enabled, s.isHTTP,
s.removeHTTPSourceAttributes, setDestinationService and
cfg.Span.DestinationService.Enabled so we never drop source URL attributes
unless replacements are guaranteed to be created.
---
Duplicate comments:
In `@processor/elasticapmprocessor/internal/enrichments/span.go`:
- Around line 127-129: The translation of unsupported attributes
(cfg.Span.TranslateUnsupportedAttributes /
ecs.TranslateSpanAttributes(span.Attributes())) runs too early and can overwrite
processor.event and event.outcome before enrichment (isElasticTransaction and
setEventOutcome) finishes; move the ecs.TranslateSpanAttributes call so it
executes after enrichment and after isElasticTransaction and setEventOutcome
have run, or alternatively update the translator to whitelist/preserve the keys
"processor.event" and "event.outcome" so they are not removed—locate the span
handling flow where isElasticTransaction and setEventOutcome are invoked and
relocate or adjust the translation step accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 6589773b-6b91-442f-9905-76b28d70d572
📒 Files selected for processing (2)
processor/elasticapmprocessor/internal/enrichments/span.goprocessor/elasticapmprocessor/internal/enrichments/span_test.go
lahsivjar
left a comment
There was a problem hiding this comment.
Have reviewed it briefly but I don't think I will be able to review this completely today - don't get blocked on my review if you get enough to merge.
Can you also run benchmarks to compare how apm processor is doing now vs main?
| UserAgent AttributeConfig `mapstructure:"user_agent"` | ||
| RemoveMessaging AttributeConfig `mapstructure:"remove_messaging"` | ||
| MessageQueueName AttributeConfig `mapstructure:"message_queue_name"` | ||
| TranslateUnsupportedAttributes AttributeConfig `mapstructure:"translate_unsupported_attributes"` |
There was a problem hiding this comment.
I am not sure about these configs, I think we can be a bit opinionated about how translation works because that is the whole purpose of this processor. I have started a slack thread to discuss this: https://elastic.slack.com/archives/C0AED3Y25PU/p1776158286258689
That being said, this is NOT a blocker and PR can be merged without resolving this.
There was a problem hiding this comment.
I agree, it would bring more clarity the ecs paths. I can clean them up in a follow up PR
aelnahas
left a comment
There was a problem hiding this comment.
it is looking good, I had some questions / comments
| return details, true | ||
| } | ||
|
|
||
| func (s *spanEnrichmentContext) buildURLFromComponents() *url.URL { |
There was a problem hiding this comment.
I am not sure if I completely understand it but from what I have been reading in apm-data it seems like https://github.com/elastic/apm-data/blob/26adeeef7f92ba5e01e59fb9e4c735fb8c31b58e/input/otlp/traces.go#L848 there is a fall back on net.peer.port / peer.port, net.peer.name, peer.address, but that logic seems missing here
There was a problem hiding this comment.
net.peer.* have been deprecated since semconv v1.12 and peer.* are even older. I would be in favor of not handling them.
| } | ||
| case string(semconv27.UserAgentOriginalKey): | ||
| s.userAgentOriginal = v.Str() | ||
| s.userAgentOriginalSet = true |
There was a problem hiding this comment.
I think we may also want to handle destination.service.resource, I get the feeling it is being treated as a custom label right but would be worth testing this out
mis read the code its span.destination.service.resource which is already been handled here
There was a problem hiding this comment.
Also it seems like in apm-data https://github.com/elastic/apm-data/blob/26adeeef7f92ba5e01e59fb9e4c735fb8c31b58e/input/otlp/traces.go#L816 span.kind is dropped / skipped . I might be misreading the code but I think this will end up being populated as a custom label label.span_kind?
There was a problem hiding this comment.
Do you know if we also need to populate message.queue.name? I noticed that in apm-data we populate fields like https://github.com/elastic/apm-data/blob/26adeeef7f92ba5e01e59fb9e4c735fb8c31b58e/input/otlp/traces.go#L431 and also :
message_bus.destinationmessaging.operation.type
I think I see them in translation key but just maybe it's worth having a processor test to confirm it is being processed correctly.
There was a problem hiding this comment.
I think we may also want to handle destination.service.resource,
I am not sure I understand this point, it is not an attribute. Can you explain a bit more?
There was a problem hiding this comment.
I misread the code I thought there should be an attributed called destination.service.resource but actually its span.destinat.service.resource . This PR already addresses this.
There was a problem hiding this comment.
@aelnahas regarding your question about message.queue.name, the current implementation uses the MessagingDestinationNameKey (and related semconv attributes) to set the corresponding queue name for transactions and spans in setMessageQueue.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
processor/elasticapmprocessor/internal/enrichments/span_test.go (1)
780-813: Duplicate test case name.
processor_event_transaction_unknown_outcome_preserved_with_translationappears twice (lines 286-319 and 780-813) with identical content. Not a blocker—tests pass—but consider removing the duplicate.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processor/elasticapmprocessor/internal/enrichments/span_test.go` around lines 780 - 813, Remove the duplicate table entry for the test case named "processor_event_transaction_unknown_outcome_preserved_with_translation" in the span_test.go test cases; locate the second identical test case (the one that sets span attributes including elasticattr.ProcessorEvent="transaction" and EventOutcome=outcomeUnknown and has spanConfig.TranslateUnsupportedAttributes.Enabled=true) and either delete that duplicate entry or rename it if a distinct scenario is intended so the test names remain unique.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@processor/elasticapmprocessor/internal/enrichments/span_test.go`:
- Around line 780-813: Remove the duplicate table entry for the test case named
"processor_event_transaction_unknown_outcome_preserved_with_translation" in the
span_test.go test cases; locate the second identical test case (the one that
sets span attributes including elasticattr.ProcessorEvent="transaction" and
EventOutcome=outcomeUnknown and has
spanConfig.TranslateUnsupportedAttributes.Enabled=true) and either delete that
duplicate entry or rename it if a distinct scenario is intended so the test
names remain unique.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 28cf985f-3432-485e-9487-3874975ac4a4
📒 Files selected for processing (3)
processor/elasticapmprocessor/internal/ecs/ecs_translation.goprocessor/elasticapmprocessor/internal/enrichments/span.goprocessor/elasticapmprocessor/internal/enrichments/span_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- processor/elasticapmprocessor/internal/ecs/ecs_translation.go
|
Closing this in favour of #1168 |
Summary
Align ECS OTLP span handling in
elasticapmprocessormore closely withapm-databy translating unsupported span attrs tolabels.*/numeric_labels.*, improving HTTP destination and URL derivation, and preserving the right enrichment inputs during ECS processing.What changed
url.original, and default URL schemes more consistentlypeer.serviceforspan.destination.service.*while keeping URL-derivedservice.target.*parityhttp.hostandhttp.user_agentinputs in the ECS span pathNote: branched off #1122 so carried over some unrelated commit history
Test plan
go test ./...fromprocessor/elasticapmprocessor