feat: sanitize utf8 strings with binary in spans #2485

Merged
SkArchon merged 15 commits into main from milinda/eng-8726-otel-error-traces-export-invalid-utf-8-in-string-field
Feb 23, 2026

Conversation

@SkArchon
Contributor

@SkArchon SkArchon commented Feb 2, 2026

Spans can sometimes contain strings with bytes that are not valid UTF-8. When such spans are serialized to protobuf for transport to collectors, protobuf returns an error because it cannot serialize strings that are not valid UTF-8.

This means that users lose these spans. To handle this, we need to sanitize strings. We do this in two ways.

For statuses and names, we sanitize the string by default. However, as there are a lot of attributes, we let the user specify whether they want span attributes sanitized. To do this, we refactor the current OnEnd function, which until now handled redaction: redact moves into its own transformer, and sanitizeutf8 is added as another transformer.
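
To make the sanitization step concrete, here is a minimal sketch using only the Go standard library. It is not the code from this PR: the helper name `sanitizeString` and the choice of U+FFFD as the replacement are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitizeString is a hypothetical helper (not the PR's implementation).
// It returns s unchanged when it is already valid UTF-8; otherwise it
// replaces each run of invalid bytes with U+FFFD (an assumed replacement)
// and reports that a change was made.
func sanitizeString(s string) (string, bool) {
	if utf8.ValidString(s) {
		return s, false
	}
	return strings.ToValidUTF8(s, "\uFFFD"), true
}

func main() {
	// Two raw bytes that are not valid UTF-8, embedded in ASCII text.
	bad := "hello\xff\xfeworld"
	out, changed := sanitizeString(bad)
	fmt.Println(utf8.ValidString(out), changed)
}
```

Checking `utf8.ValidString` first keeps the common case (already valid strings) free of allocations, which matters when every attribute of every span passes through the check.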

Summary by CodeRabbit

  • New Features

    • Configurable attribute-processing pipeline for traces.
    • Optional UTF‑8 sanitization for span attributes with toggleable logging.
    • IP anonymization for traces with selectable redact or hash modes.
    • Span error/status descriptions are sanitized before recording.
  • Configuration

    • New tracing options to enable UTF‑8 sanitization and IP anonymization.
  • Tests

    • Extensive unit and integration tests for attribute processing, sanitization, anonymization, and status sanitization.

Checklist

@github-actions github-actions Bot added the router label Feb 2, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Feb 2, 2026

Walkthrough

Adds a composable OpenTelemetry attribute processor (transformers for redaction, IP anonymization, and UTF‑8 sanitization), wires it into tracer initialization and the test environment, replaces the legacy redact package, and ensures span status descriptions are UTF‑8 sanitized before being recorded.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Attribute processor core & tests: router/pkg/trace/attributeprocessor/attributes.go, router/pkg/trace/attributeprocessor/attributes_test.go | New AttributeProcessor SpanProcessor implementation and tests that apply ordered, per-attribute transformers on span end. |
| IP anonymization transformer & tests: router/pkg/trace/attributeprocessor/redact.go, router/pkg/trace/attributeprocessor/redact_test.go | Adds IPAnonymizationConfig, methods (Hash/Redact) and RedactKeys transformer with unit tests and benchmarks. |
| UTF-8 sanitization transformer & tests: router/pkg/trace/attributeprocessor/sanitizeutf8.go, router/pkg/trace/attributeprocessor/sanitizeutf8_test.go | Adds SanitizeUTF8 transformer with configurable logging and extensive tests/benchmarks. |
| Tracer config and wiring: router/pkg/trace/config.go, router/pkg/trace/meter.go | Exposes SanitizeUTF8 and switches IPAnonymization to attributeprocessor types; refactors NewTracerProvider to compose transformers and register AttributeProcessor. |
| Span status sanitization & tests: router/pkg/trace/utils.go, router/pkg/trace/utils_test.go | Introduces SetSanitizedSpanStatus to sanitize non-UTF8 descriptions before calling span.SetStatus and adds tests. |
| Removed legacy redact package: router/pkg/trace/redact/attributes.go, router/pkg/trace/redact/attributes_test.go | Deletes the old redact package and its tests in favor of the new attributeprocessor implementation. |
| Router core updates: router/core/engine_loader_hooks.go, router/core/graphql_prehandler.go, router/core/router.go | Replaces direct span status calls with sanitized helper and updates tracer initialization to use attributeprocessor IPAnonymization and SanitizeUTF8 wiring. |
| Config schema & defaults: router/pkg/config/config.go, router/pkg/config/config.schema.json, router/pkg/config/testdata/config_defaults.json, router/pkg/config/testdata/config_full.json | Adds SanitizeUTF8Config to tracing config, updates schema and testdata defaults/full config. |
| Test environment & integration tests: router-tests/testenv/testenv.go, router-tests/telemetry/attribute_processor_test.go | Adds TracingSanitizeUTF8 and IPAnonymization hooks to testenv and an integration test covering sanitization, logging controls, and IP anonymization behaviors. |
| Module / dependencies: go.mod | Module changes implied by new imports (attributeprocessor, zap, etc.). |
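
The "ordered, per-attribute transformers" idea from the walkthrough can be sketched roughly as follows. This is an illustration only: `KeyValue`, `Transformer`, `applyTransformers`, `redactKeys` and `sanitizeUTF8` are dependency-free stand-ins invented here, not the types from `router/pkg/trace/attributeprocessor`, and the chaining behavior (each transformer sees the output of the previous one) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// KeyValue is a hypothetical stand-in for an OpenTelemetry attribute.
type KeyValue struct {
	Key   string
	Value string
}

// Transformer returns the (possibly rewritten) value and whether it changed.
type Transformer func(kv KeyValue) (string, bool)

// applyTransformers runs every transformer, in order, on every attribute.
// Assumption: a later transformer sees the value produced by an earlier one.
func applyTransformers(attrs []KeyValue, ts ...Transformer) {
	for i := range attrs {
		for _, t := range ts {
			if v, changed := t(attrs[i]); changed {
				attrs[i].Value = v
			}
		}
	}
}

// redactKeys replaces the value of any listed key with a fixed marker.
func redactKeys(keys ...string) Transformer {
	set := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		set[k] = struct{}{}
	}
	return func(kv KeyValue) (string, bool) {
		if _, ok := set[kv.Key]; ok {
			return "[REDACTED]", true
		}
		return kv.Value, false
	}
}

// sanitizeUTF8 rewrites values that are not valid UTF-8.
func sanitizeUTF8() Transformer {
	return func(kv KeyValue) (string, bool) {
		if utf8.ValidString(kv.Value) {
			return kv.Value, false
		}
		return strings.ToValidUTF8(kv.Value, "\uFFFD"), true
	}
}

func main() {
	attrs := []KeyValue{
		{Key: "http.client_ip", Value: "10.0.0.1"},
		{Key: "custom.note", Value: "bad\xffbytes"},
	}
	applyTransformers(attrs, redactKeys("http.client_ip"), sanitizeUTF8())
	fmt.Printf("%q\n", attrs)
}
```

Keeping each concern in its own transformer means redaction and sanitization can be enabled independently from configuration, which matches the opt-in behavior described for attributes.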

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding UTF-8 sanitization for spans. It references the specific problem (binary/invalid UTF-8 strings) being solved and identifies where the solution applies (spans). |

@github-actions

github-actions Bot commented Feb 2, 2026

Router-nonroot image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-69e3d97ac38118c83cf4fc0802f44a115567dd53-nonroot

@codecov

codecov Bot commented Feb 2, 2026

Codecov Report

❌ Patch coverage is 90.12346% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.21%. Comparing base (602e0ab) to head (8ac3639).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...outer/pkg/trace/attributeprocessor/sanitizeutf8.go | 72.22% | 3 Missing and 2 partials ⚠️ |
| router/pkg/trace/meter.go | 80.00% | 1 Missing and 1 partial ⚠️ |
| router/pkg/trace/attributeprocessor/attributes.go | 92.85% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2485      +/-   ##
==========================================
- Coverage   62.32%   62.21%   -0.11%     
==========================================
  Files         239      241       +2     
  Lines       25456    25499      +43     
==========================================
- Hits        15865    15864       -1     
- Misses       8241     8297      +56     
+ Partials     1350     1338      -12     
| Files with missing lines | Coverage Δ |
|---|---|
| router/core/engine_loader_hooks.go | 88.95% <100.00%> (-1.23%) ⬇️ |
| router/core/graphql_prehandler.go | 83.18% <100.00%> (-0.58%) ⬇️ |
| router/core/router.go | 69.50% <100.00%> (+1.02%) ⬆️ |
| router/pkg/config/config.go | 80.51% <ø> (ø) |
| router/pkg/trace/attributeprocessor/redact.go | 100.00% <100.00%> (ø) |
| router/pkg/trace/config.go | 75.00% <100.00%> (+2.50%) ⬆️ |
| router/pkg/trace/utils.go | 92.85% <100.00%> (+0.54%) ⬆️ |
| router/pkg/trace/attributeprocessor/attributes.go | 92.85% <92.85%> (ø) |
| router/pkg/trace/meter.go | 41.32% <80.00%> (+2.29%) ⬆️ |
| ...outer/pkg/trace/attributeprocessor/sanitizeutf8.go | 72.22% <72.22%> (ø) |

... and 20 files with indirect coverage changes

@StarpTech
Contributor

Please create a proper PR description that explains the situation and solution.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@router-tests/telemetry/attribute_processor_test.go`:
- Around line 149-156: The test currently asserts attributes by indexing the
first snapshot (using exporter.GetSpans().Snapshots() and sn[0].Attributes()),
which is order-dependent and flaky; change the assertions to search across all
snapshots and assert that any span contains the expected attribute (e.g.,
iterate over sn := exporter.GetSpans().Snapshots() and check each snapshot's
Attributes() for sanitizedAttr/unsanitizedAttr using attrKey and sanitizedValue)
or add a small helper (spanHasAttribute) to encapsulate the search; apply the
same change to the other occurrences that reference sn[0] for attribute
assertions (the similar checks around the other test assertions).
- Around line 296-324: The test "IPAnonymization redacts http.client_ip
attribute" currently only checks redaction when http.client_ip exists and can
silently pass if the attribute is missing; update the check to assert that
http.client_ip is present at least once: when iterating over
exporter.GetSpans().Snapshots() and span.Attributes(), track a counter/flag for
occurrences of attribute.Key("http.client_ip"), assert the count is > 0 with
require.Greater/True, and continue to assert each found attr.Value.AsString()
equals "[REDACTED]"; apply the same presence-assertion change to the other test
block (lines 327-357) that verifies hashing so both tests fail if the attribute
is absent.
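
The two test suggestions above (search all spans instead of indexing `sn[0]`, and assert the attribute is actually present) could look roughly like this. The sketch uses hypothetical stand-in types (`KeyValue`, `span`) rather than the real exporter snapshots, and `countKey` is a helper name invented here; only `spanHasAttribute` is named in the review.

```go
package main

import "fmt"

// KeyValue and span are hypothetical stand-ins for OpenTelemetry attribute
// and span-snapshot types, so the sketch has no external dependencies.
type KeyValue struct{ Key, Value string }

type span struct {
	Name  string
	Attrs []KeyValue
}

// spanHasAttribute reports whether any span carries the key/value pair,
// regardless of the order in which spans were exported.
func spanHasAttribute(spans []span, key, value string) bool {
	for _, s := range spans {
		for _, kv := range s.Attrs {
			if kv.Key == key && kv.Value == value {
				return true
			}
		}
	}
	return false
}

// countKey returns how many attributes across all spans use key, so a test
// can fail when the attribute is missing instead of passing vacuously.
func countKey(spans []span, key string) int {
	n := 0
	for _, s := range spans {
		for _, kv := range s.Attrs {
			if kv.Key == key {
				n++
			}
		}
	}
	return n
}

func main() {
	spans := []span{
		{Name: "a", Attrs: []KeyValue{{"x", "1"}}},
		{Name: "b", Attrs: []KeyValue{{"http.client_ip", "[REDACTED]"}}},
	}
	fmt.Println(spanHasAttribute(spans, "http.client_ip", "[REDACTED]"))
	fmt.Println(countKey(spans, "http.client_ip"))
}
```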

In `@router/pkg/trace/attributeprocessor/redact.go`:
- Around line 24-52: RedactKeys can leave rFunc nil for unsupported
IPAnonymizationMethod, causing a panic when used; ensure you guard against that
by assigning a safe default (e.g., set rFunc to the Redact behavior before the
switch or add a switch default case that sets rFunc to return "[REDACTED]") or
by returning a no-op transformer when the method is unrecognized; update the
RedactKeys function (symbols: RedactKeys, rFunc, IPAnonymizationMethod, Hash,
Redact, attribute.KeyValue) so rFunc is always non-nil (or short-circuit and
return kv.Value,false) to avoid runtime nil dereference.
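
One way to rule out the nil function described above is to fail fast on an unknown method. The sketch below is an assumption-laden illustration: `Method` and `redactFunc` are hypothetical names, and SHA-256 is an assumed hash, since the thread does not say which algorithm the Hash method uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Method is a hypothetical stand-in for IPAnonymizationMethod.
type Method string

const (
	Hash   Method = "hash"
	Redact Method = "redact"
)

// redactFunc returns the value-rewriting function for m, or an error for an
// unsupported method, so callers can never end up holding a nil function.
func redactFunc(m Method) (func(string) string, error) {
	switch m {
	case Hash:
		return func(v string) string {
			// SHA-256 is an assumption made for this sketch.
			sum := sha256.Sum256([]byte(v))
			return hex.EncodeToString(sum[:])
		}, nil
	case Redact:
		return func(string) string { return "[REDACTED]" }, nil
	default:
		return nil, fmt.Errorf("unsupported IP anonymization method %q", m)
	}
}

func main() {
	if _, err := redactFunc("bogus"); err != nil {
		fmt.Println("rejected:", err)
	}
	f, _ := redactFunc(Redact)
	fmt.Println(f("10.0.0.1"))
}
```

Returning an error surfaces a misconfiguration at startup; the alternative mentioned in the comment, silently defaulting to redaction, trades that visibility for leniency.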

In `@router/pkg/trace/attributeprocessor/sanitizeutf8.go`:
- Around line 16-40: SanitizeUTF8 currently dereferences config and runs even
when config is nil or disabled; update SanitizeUTF8 to short‑circuit when config
== nil or config.Enabled == false by returning an identity AttributeTransformer
that returns (kv.Value, false) for all inputs, and only access
config.LogSanitizations after that guard; also only create a nop logger when you
actually need logging (i.e., config.LogSanitizations == true and logger == nil)
to avoid nil derefs and unintended sanitization.
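
The short-circuit described above might be sketched like this. The `Enabled` and `LogSanitizations` field names come from the review comment; everything else (the `Transformer` shape, the `logf` callback standing in for a logger, the U+FFFD replacement) is assumed for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// SanitizeUTF8Config mirrors the two fields mentioned in the review.
type SanitizeUTF8Config struct {
	Enabled          bool
	LogSanitizations bool
}

// Transformer is a hypothetical per-attribute transformer signature.
type Transformer func(key, value string) (string, bool)

// sanitizeUTF8 returns an identity transformer when cfg is nil or disabled,
// and only touches the logging callback when logging was requested.
func sanitizeUTF8(cfg *SanitizeUTF8Config, logf func(format string, args ...any)) Transformer {
	if cfg == nil || !cfg.Enabled {
		return func(_, v string) (string, bool) { return v, false }
	}
	logEnabled := cfg.LogSanitizations && logf != nil
	return func(k, v string) (string, bool) {
		if utf8.ValidString(v) {
			return v, false
		}
		if logEnabled {
			logf("sanitized invalid UTF-8 in attribute %q", k)
		}
		return strings.ToValidUTF8(v, "\uFFFD"), true
	}
}

func main() {
	off := sanitizeUTF8(nil, nil)
	v, changed := off("k", "bad\xff")
	fmt.Println(utf8.ValidString(v), changed)

	on := sanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil)
	v, changed = on("k", "bad\xff")
	fmt.Println(utf8.ValidString(v), changed)
}
```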

In `@router/pkg/trace/meter.go`:
- Around line 149-162: The IP anonymization branch can pass an empty/unknown
Method into attributeprocessor.RedactKeys which may cause a nil-function panic;
before calling attributeprocessor.RedactKeys when config.IPAnonymization.Enabled
is true, validate config.IPAnonymization.Method and default it to the safe
redact option (e.g., the attributeprocessor redact method constant) if
empty/unknown, then call attributeprocessor.RedactKeys(SensitiveAttributes,
validatedMethod); keep the rest of the transformer-building logic (including
SensitiveAttributes, SanitizeUTF8 and NewAttributeProcessorOption) unchanged.

Comment thread router-tests/telemetry/attribute_processor_test.go
Comment thread router-tests/telemetry/attribute_processor_test.go Outdated
Comment thread router/pkg/trace/attributeprocessor/redact.go Outdated
Comment thread router/pkg/trace/attributeprocessor/sanitizeutf8.go
Comment thread router/pkg/trace/meter.go
@SkArchon SkArchon changed the title from "feat: refactor and add sanitizeUtf8 processor" to "feat: sanitize utf8 strings with binary in spans" Feb 13, 2026
@SkArchon SkArchon marked this pull request as ready for review February 13, 2026 11:12
Comment thread router/pkg/config/config.go Outdated
Comment thread router/pkg/trace/attributeprocessor/attributes_test.go Outdated
Comment thread router/pkg/trace/attributeprocessor/attributes_test.go Outdated
Comment thread router/pkg/trace/attributeprocessor/redact.go
Comment thread router/pkg/trace/attributeprocessor/redact_test.go Outdated
…-error-traces-export-invalid-utf-8-in-string-field
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (5)
router/pkg/trace/attributeprocessor/sanitizeutf8.go (1)

11-14: Note: Type duplication with config.SanitizeUTF8Config.

This struct duplicates config.SanitizeUTF8Config from router/pkg/config/config.go. While this separation may be intentional (config layer vs. runtime layer), consider whether a shared type or type alias could reduce maintenance burden.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/sanitizeutf8.go` around lines 11 - 14,
The SanitizeUTF8Config struct in sanitizeutf8.go duplicates
config.SanitizeUTF8Config; replace the local definition by reusing the config
package type (either by importing router/pkg/config and referencing
config.SanitizeUTF8Config or by creating a type alias like type
SanitizeUTF8Config = config.SanitizeUTF8Config) and update all uses in this
package (references to SanitizeUTF8Config, constructors or function signatures)
to the single shared type so maintenance stays centralized.
router/pkg/trace/attributeprocessor/redact.go (1)

11-23: Note: Type duplication with router/core/router.go.

The IPAnonymizationMethod type and constants (Hash, Redact) are also defined in router/core/router.go per the relevant code snippets. Consider consolidating these definitions to avoid drift and ensure consistency.

#!/bin/bash
# Verify the duplication and check for any differences
rg -n "IPAnonymizationMethod" --type=go -C2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/redact.go` around lines 11 - 23,
Duplicate type/const definitions for IPAnonymizationMethod (and its values Hash,
Redact) and IPAnonymizationConfig exist; remove the duplicate by centralizing
the type/const in a single package and have the other code import and use that
central definition. Specifically, keep one canonical definition of
IPAnonymizationMethod, IPAnonymizationConfig, Hash and Redact (where they
already exist), delete the duplicate declarations from the other file (remove
the block containing IPAnonymizationMethod, IPAnonymizationConfig, Hash, Redact
in redact.go) and update any references to use the centralized symbol; if
necessary, create or use an existing shared package export for these symbols so
both consumers compile against the single source of truth.
router/pkg/trace/attributeprocessor/sanitizeutf8_test.go (1)

132-174: Benchmarks mutate attributes in-place, so later iterations don’t exercise sanitization.
Consider resetting/copying the base attributes each iteration to keep results representative.

♻️ Proposed benchmark fix (reset attributes each iteration)
-	s := sanitizeSpan{attrs: attrs}
-	ac := NewAttributeProcessor(SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil))
-	ctx := context.Background()
+	scratch := make([]attribute.KeyValue, len(attrs))
+	ac := NewAttributeProcessor(SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil))
+	ctx := context.Background()

 	return func(b *testing.B) {
 		b.ReportAllocs()
 		b.ResetTimer()
 		for i := 0; i < b.N; i++ {
-			ac.OnStart(ctx, s)
-			ac.OnEnd(s)
+			copy(scratch, attrs)
+			s := sanitizeSpan{attrs: scratch}
+			ac.OnStart(ctx, s)
+			ac.OnEnd(s)
 		}
 	}
-	s := sanitizeSpan{attrs: attrs}
-	ac := NewAttributeProcessor(SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil))
-	ctx := context.Background()
+	scratch := make([]attribute.KeyValue, len(attrs))
+	ac := NewAttributeProcessor(SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil))
+	ctx := context.Background()

 	b.ReportAllocs()
 	b.ResetTimer()
 	for i := 0; i < b.N; i++ {
-		ac.OnStart(ctx, s)
-		ac.OnEnd(s)
+		copy(scratch, attrs)
+		s := sanitizeSpan{attrs: scratch}
+		ac.OnStart(ctx, s)
+		ac.OnEnd(s)
 	}

Also applies to: 177-203

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/sanitizeutf8_test.go` around lines 132 -
174, The benchmark currently mutates the shared attrs slice (created in
benchSanitizeUTF8OnEnd) so later iterations skip real work; modify the returned
closure from benchSanitizeUTF8OnEnd (used by BenchmarkSanitizeUTF8OnEnd) to
reset or copy the base attributes each iteration before calling
ac.OnStart/ac.OnEnd: e.g., keep an immutable template slice or recreate
sanitizeSpan with a fresh attrs slice inside the loop, or perform a
per-iteration deep copy of attrs into a new sanitizeSpan so sanitization runs
every iteration; update both benchSanitizeUTF8OnEnd and the analogous bench
function around lines 177-203.
router/pkg/trace/attributeprocessor/redact_test.go (1)

233-290: Benchmarks reuse mutated attributes, skewing redaction timing.
Reset/copy the base attributes each iteration to keep the benchmark honest.

♻️ Proposed benchmark fix (reset attributes each iteration)
 	return func(b *testing.B) {
 		transformer, err := RedactKeys(keys, method)
 		if err != nil {
 			b.Fatal(err)
 		}
-		s := rwSpan{attrs: attrs}
+		scratch := make([]attribute.KeyValue, len(attrs))
 		ac := NewAttributeProcessor(transformer)
 		ctx := context.Background()

 		b.ReportAllocs()
 		b.ResetTimer()
 		for i := 0; i < b.N; i++ {
-			ac.OnStart(ctx, s)
-			ac.OnEnd(s)
+			copy(scratch, attrs)
+			s := rwSpan{attrs: scratch}
+			ac.OnStart(ctx, s)
+			ac.OnEnd(s)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/redact_test.go` around lines 233 - 290,
The benchmark benchRedactOnEnd reuses and mutates the same attrs slice across
iterations which skews timing; fix it by creating a fresh copy of the base
attributes for each iteration (e.g., clone attrs into a new slice and construct
a new rwSpan with that slice) before calling ac.OnStart and ac.OnEnd inside the
for i := 0; i < b.N loop so each iteration operates on unmodified data; keep the
transformer creation (RedactKeys) and processor (NewAttributeProcessor) outside
the loop to avoid measuring setup time.
router/pkg/trace/attributeprocessor/attributes_test.go (1)

185-245: Benchmark reuses mutated attributes, so later iterations don’t exercise transformations.
Consider resetting/copying the base attributes per-iteration.

♻️ Proposed benchmark fix (reset attributes each iteration)
-	s := benchSpan{attrs: attrs}
+	scratch := make([]attribute.KeyValue, len(attrs))
 	redactTransformer, err := RedactKeys(keys, Redact)
 	if err != nil {
 		panic(err)
 	}
 	ac := NewAttributeProcessor(
 		redactTransformer,
 		SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil),
 	)
 	ctx := context.Background()

 	return func(b *testing.B) {
 		b.ReportAllocs()
 		b.ResetTimer()
 		for i := 0; i < b.N; i++ {
-			ac.OnStart(ctx, s)
-			ac.OnEnd(s)
+			copy(scratch, attrs)
+			s := benchSpan{attrs: scratch}
+			ac.OnStart(ctx, s)
+			ac.OnEnd(s)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/attributes_test.go` around lines 185 -
245, The benchmark reuses and mutates the same attrs slice across iterations so
later runs don't exercise transformations; inside benchCombinedTransformers (the
returned func) copy or reset the original attrs into s.attrs each iteration
before calling ac.OnStart/ac.OnEnd (e.g., keep a baseline slice created outside
the returned func and shallow-copy its elements into s.attrs inside the loop) so
each iteration gets fresh attributes to transform; ensure you copy the
attribute.KeyValue slice (and values if you mutate them) rather than reusing the
same slice referenced by benchSpan, and then run ac.OnStart(ctx, s) /
ac.OnEnd(s) as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@router-tests/telemetry/attribute_processor_test.go`:
- Around line 57-60: The test currently checks sn[0].Attributes() for the
sanitized attribute (sanitizedAttr), which assumes span order; instead iterate
over the snapshot slice sn and assert that at least one span's Attributes()
contains attribute.String(attrKey, sanitizedValue). Update the assertions in the
cases referencing sn[0] (including the other instances around lines noted) to
loop through sn, break on first match, and fail the test if no span contains the
expected attribute.

@SkArchon SkArchon requested a review from endigma February 19, 2026 08:40
@SkArchon
Contributor Author

@endigma let me know if all good

Member

@endigma endigma left a comment

looks good, modernize nit: the benchmarks should use b.Loop now
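
For readers unfamiliar with the `b.Loop` style mentioned here, a rough sketch follows. It assumes Go 1.24 or newer, where `testing.B.Loop` is available; `benchSanitize` and its workload are made up for illustration and are not the PR's benchmarks. The per-iteration `copy` also reflects the earlier review point about not reusing already-mutated input.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// benchSanitize is a hypothetical benchmark body using b.Loop (Go 1.24+)
// instead of the classic `for i := 0; i < b.N; i++` loop.
func benchSanitize(b *testing.B) {
	template := []string{"ok", "bad\xffvalue"}
	scratch := make([]string, len(template))
	b.ReportAllocs()
	for b.Loop() {
		// Restore the unsanitized input so every iteration does real work.
		copy(scratch, template)
		for i, s := range scratch {
			scratch[i] = strings.ToValidUTF8(s, "\uFFFD")
		}
	}
}

func main() {
	// testing.Benchmark lets the sketch run outside of `go test`.
	res := testing.Benchmark(benchSanitize)
	fmt.Println(res.N > 0)
}
```

With `b.Loop`, setup placed before the loop is excluded from timing without an explicit `b.ResetTimer()` call.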

Comment thread router/pkg/trace/attributeprocessor/attributes_test.go
Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
router/pkg/trace/attributeprocessor/sanitizeutf8_test.go (1)

197-201: Same testing.B.Loop() compatibility check applies here.

This loop has the same version dependency as the previous benchmark.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/sanitizeutf8_test.go` around lines 197 -
201, The benchmark uses testing.B.Loop(), which is only available in newer Go
versions; replace the Loop() usage with the standard for i := 0; i < b.N; i++
pattern so the benchmark remains compatible across Go versions — update the loop
that calls ac.OnStart(ctx, s) and ac.OnEnd(s) to run inside for i := 0; i < b.N;
i++ while keeping b.ReportAllocs() and the same ctx, s, and ac calls.
🧹 Nitpick comments (2)
router/pkg/trace/attributeprocessor/attributes_test.go (1)

38-49: Unused test helper function.

testAttributesAfterCreation is defined but not used anywhere in this test file. Consider removing it if it's not needed, or add a test case that exercises the "set attributes after creation" scenario to ensure that path is also covered.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/attributes_test.go` around lines 38 - 49,
The helper function testAttributesAfterCreation is unused; either remove this
function entirely or add a test that calls testAttributesAfterCreation to
exercise setting attributes after span creation and assert the recorded
attributes via attrRecorder (it uses trace.NewTracerProvider with opt,
trace.WithSpanProcessor(r), tracer := tp.Tracer("testAttributes"), tracer.Start
and s.SetAttributes); choose one of these fixes so the file no longer contains
dead code or so the "set attributes after creation" path is covered by a test.
router/pkg/trace/attributeprocessor/sanitizeutf8_test.go (1)

13-119: Strengthen sanitization tests by asserting removal of invalid values.

require.Contains alone would pass even if the sanitizer keeps the original invalid attribute (or duplicates it). Add require.NotContains to ensure the invalid value is removed/replaced.

Proposed tightening of assertions
@@
 		attributes := testAttributes(t.Context(), NewAttributeProcessorOption(SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil)), invalidStr)
 		require.Contains(t, attributes, expected)
+		require.NotContains(t, attributes, invalidStr)
@@
 		attributes := testAttributes(t.Context(), NewAttributeProcessorOption(SanitizeUTF8(&SanitizeUTF8Config{Enabled: true}, nil)), mixedStr)
 		require.Contains(t, attributes, expected)
+		require.NotContains(t, attributes, mixedStr)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/pkg/trace/attributeprocessor/sanitizeutf8_test.go` around lines 13 -
119, The tests currently only use require.Contains which would still pass if the
sanitizer left the original invalid attribute; update the relevant subtests in
TestSanitizeUTF8 to also assert the invalid original is removed by adding
require.NotContains checks: for the "invalid UTF-8 bytes" case assert
NotContains(attributes, invalidStr); for "mixed valid and invalid UTF-8" assert
NotContains(attributes, mixedStr); for "multiple separate invalid sequences"
assert NotContains(attributes, mixedStr) (the mixedBytes/mixedStr variable
there); keep the "without sanitizer" case as-is to assert the original remains.
Use the existing variable names (invalidStr, mixedStr, etc.) so the assertions
locate the exact original attribute to confirm replacement.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1adc217 and 5a3ec21.

📒 Files selected for processing (2)
  • router/pkg/trace/attributeprocessor/attributes_test.go
  • router/pkg/trace/attributeprocessor/sanitizeutf8_test.go

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
router/core/router.go (1)

829-833: Type conversion assumes matching enum values.

The cast from core.IPAnonymizationMethod to attributeprocessor.IPAnonymizationMethod at line 831 relies on both types using identical underlying string values ("hash", "redact"). This is acceptable but creates an implicit coupling between the two packages.

Consider adding a compile-time assertion or a brief comment noting this dependency to prevent future divergence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/core/router.go` around lines 829 - 833, The code currently casts
core.IPAnonymizationMethod to attributeprocessor.IPAnonymizationMethod (used
when constructing IPAnonymization) which implicitly assumes both enum string
values stay identical; replace the direct cast with a small explicit converter
function (e.g., convertIPAnonymizationMethod(core.IPAnonymizationMethod)
attributeprocessor.IPAnonymizationMethod) that switches on known cases ("hash",
"redact") and returns a safe default for unknown values, and add a short comment
near the IPAnonymization / IPAnonymizationMethod usage documenting the coupling
to prevent silent divergence between the two packages.
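
The explicit converter suggested above could be sketched as follows. `CoreMethod` and `ProcessorMethod` are hypothetical stand-ins for `core.IPAnonymizationMethod` and `attributeprocessor.IPAnonymizationMethod`; the function name comes from the suggestion, and falling back to redaction for unknown values is one possible reading of "safe default".

```go
package main

import "fmt"

// CoreMethod and ProcessorMethod stand in for the two package-level enum
// types; both are assumed to be string-backed.
type CoreMethod string
type ProcessorMethod string

const (
	CoreHash   CoreMethod = "hash"
	CoreRedact CoreMethod = "redact"

	ProcessorHash   ProcessorMethod = "hash"
	ProcessorRedact ProcessorMethod = "redact"
)

// convertIPAnonymizationMethod maps each known value explicitly, so the two
// enums can diverge without a plain type cast silently passing bad values.
func convertIPAnonymizationMethod(m CoreMethod) ProcessorMethod {
	switch m {
	case CoreHash:
		return ProcessorHash
	case CoreRedact:
		return ProcessorRedact
	default:
		// Assumed safe default: redact rather than leak the raw value.
		return ProcessorRedact
	}
}

func main() {
	fmt.Println(convertIPAnonymizationMethod(CoreHash))
	fmt.Println(convertIPAnonymizationMethod("unknown"))
}
```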

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a3ec21 and 035a5fd.

📒 Files selected for processing (2)
  • router-tests/testenv/testenv.go
  • router/core/router.go

@SkArchon SkArchon merged commit 43aa77e into main Feb 23, 2026
29 checks passed
@SkArchon SkArchon deleted the milinda/eng-8726-otel-error-traces-export-invalid-utf-8-in-string-field branch February 23, 2026 17:03

3 participants