[processor/elasticapmprocessor] align ECS OTLP metrics with apm-data#1122
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughReworks ECS attribute translation from predicate/allowlist helpers to explicit per-key switch logic for resources and logs, and adds exported TranslateMetricDataPointAttributes for metric datapoints. Adds EnrichMetricDataPoints and updates EnrichMetrics to apply datapoint-level enrichment. Introduces config field Metric.TranslateUnsupportedAttributes and enables it by default for the non-intake ECS metric enricher (disabled for intake). Changes sanitize existing label keys in-place, truncate selected preserved string attributes, and move remaining keys into ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@processor/elasticapmprocessor/processor.go`:
- Around line 216-222: The code creates two different configs (ecsEnricherConfig
and intakeECSEnricherConfig) but still selects a single enricher from
resourceMetrics.At(0) and applies it to the whole pmetric.Metrics batch; change
the enrichment flow so you either (A) select and apply the correct enricher per
ResourceMetrics (iterate over each ResourceMetrics and pick the enricher using
its Resource/Schema/Origin) or (B) partition the incoming pmetric.Metrics by
origin (OTLP vs intake), build an enricher per partition from
ecsEnricherConfig/intakeECSEnricherConfig, and run enrichment separately on each
partition; update the code paths that currently call the single-batch enricher
(the selection that reads resourceMetrics.At(0) and the apply step that operates
on pmetric.Metrics) so enrichment is done per-resource or per-partition,
ensuring enrichments/enricher.go is invoked with the matching config for each
ResourceMetrics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5c4554d2-48c3-4c1b-a05d-957225a3b617
📒 Files selected for processing (12)
processor/elasticapmprocessor/internal/ecs/ecs_translation.goprocessor/elasticapmprocessor/internal/ecs/ecs_translation_test.goprocessor/elasticapmprocessor/internal/enrichments/config/config.goprocessor/elasticapmprocessor/internal/enrichments/enricher.goprocessor/elasticapmprocessor/internal/enrichments/metric.goprocessor/elasticapmprocessor/internal/enrichments/metric_test.goprocessor/elasticapmprocessor/internal/enrichments/resource.goprocessor/elasticapmprocessor/processor.goprocessor/elasticapmprocessor/processor_test.goprocessor/elasticapmprocessor/testdata/ecs/elastic_hostname/metrics_output.yamlprocessor/elasticapmprocessor/testdata/skip_enrichment/metrics_false_ecs_output.yamlprocessor/elasticapmprocessor/testdata/skip_enrichment/metrics_true_ecs_output.yaml
| ecsEnricherConfig.Resource.DefaultServiceLanguage.Enabled = true | ||
| ecsEnricherConfig.Metric.TranslateUnsupportedAttributes.Enabled = true | ||
|
|
||
| intakeECSEnricherConfig := ecsEnricherConfig | ||
| intakeECSEnricherConfig.Resource.HostOSType.Enabled = false | ||
| intakeECSEnricherConfig.Resource.DefaultServiceLanguage.Enabled = false | ||
| intakeECSEnricherConfig.Metric.TranslateUnsupportedAttributes.Enabled = false |
There was a problem hiding this comment.
The new ECS metric config split is unsafe with batch-wide enricher selection.
With Lines 216-222, OTLP and intake ECS metrics now use different enrichment behavior. Lines 241-247 still choose one enricher from resourceMetrics.At(0), and Lines 271-272 apply it to the entire pmetric.Metrics; processor/elasticapmprocessor/internal/enrichments/enricher.go:92-112 confirms the full batch is processed with that single config. Any mixed ECS batch will therefore mis-enrich later resources (OTLP fallback skipped, or intake datapoints relabeled). Please split by origin or enrich per ResourceMetrics.
As per coding guidelines, "Highlight only issues that could cause runtime errors, data loss, or severe maintainability issues."
Also applies to: 241-247, 271-272
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@processor/elasticapmprocessor/processor.go` around lines 216 - 222, The code
creates two different configs (ecsEnricherConfig and intakeECSEnricherConfig)
but still selects a single enricher from resourceMetrics.At(0) and applies it to
the whole pmetric.Metrics batch; change the enrichment flow so you either (A)
select and apply the correct enricher per ResourceMetrics (iterate over each
ResourceMetrics and pick the enricher using its Resource/Schema/Origin) or (B)
partition the incoming pmetric.Metrics by origin (OTLP vs intake), build an
enricher per partition from ecsEnricherConfig/intakeECSEnricherConfig, and run
enrichment separately on each partition; update the code paths that currently
call the single-batch enricher (the selection that reads resourceMetrics.At(0)
and the apply step that operates on pmetric.Metrics) so enrichment is done
per-resource or per-partition, ensuring enrichments/enricher.go is invoked with
the matching config for each ResourceMetrics.
| if isAggregatedMetricDataPointAttributes(attributes) { | ||
| return | ||
| } | ||
| if cfg.Metric.TranslateUnsupportedAttributes.Enabled { |
There was a problem hiding this comment.
What's the purpose of TranslateUnsupportedAttributes, when will we use this?
| return true | ||
| } | ||
|
|
||
| switch k { |
There was a problem hiding this comment.
I don't see telemetry.sdk.elastic_export_timestamp (ref), is it intentionally not handled?
| string(semconv.NetworkCarrierNameKey), | ||
| string(semconv.NetworkConnectionSubtypeKey), | ||
| string(semconv.NetworkConnectionTypeKey), | ||
| string(semconv.ProcessExecutableNameKey), |
There was a problem hiding this comment.
apm-data doesn't seem to handle process executable name and process parent pid, any reason we are handling them here?
There was a problem hiding this comment.
this was already preserved by the previous ECS allowlist so I kept that behavior
| string(semconv.ProcessExecutableNameKey), | ||
| string(semconv.ProcessParentPIDKey), | ||
| string(semconv.ProcessPIDKey), | ||
| string(semconv.ServiceNameKey), |
There was a problem hiding this comment.
Hmm, I think this is also handled differently here, shouldn't we be sanitizing this as we do in apm-data: https://github.com/elastic/apm-data/blob/7da222dcc0320f9c812c5d72f65f830c838aae11/input/otlp/metadata.go#L54
There was a problem hiding this comment.
The sanitization still happens, but in EnrichResource() (with sanitizeServiceName) rather than in TranslateResourceMetadata()
A lot of the implementation differences are because we mutate the original OTLP resource in place, instead of translating it into a separate APM model like apm-data does. Using config flags to gate the behaviour also makes the location of the logic different even though the outcome for service.name is still the same.
| elasticattr.CloudProjectName, | ||
| elasticattr.DataStreamDataset, | ||
| elasticattr.DataStreamNamespace, | ||
| elasticattr.DataStreamType, |
There was a problem hiding this comment.
Datastream type doesn't seem to be handled in apm-data(ref), any reason we are handling it here?
There was a problem hiding this comment.
this predates my refactor, it was already part of the existing ECS allowlist so I kept that behavior but I think it's inclusion is intentional.apm-data can derive stream type later in its event pipeline but for the collector/exporter ECS path we rely on data_stream.* in the attributes for routing/index selection before the final document is emitted.
lahsivjar
left a comment
There was a problem hiding this comment.
Approving it to unblock. We might need to revisit some stuff later.
Summary
elasticapmprocessorECS OTLP metric handling withapm-databy translating datapoint attributes tolabels.*/numeric_labels.*, defaulting missing attributes (e.g telemetry.sdk.language), matching truncation behavior for preserved resource and metric fields etcservice.framework.nameandservice.framework.versionfor raw OTLP ECS metricsTest plan
go test ./... in processor/elasticapmprocessor