[clickhouse] Add handling for complex attributes to ClickHouse storage#7627
Conversation
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
| if err != nil { | ||
| break | ||
| } |
There was a problem hiding this comment.
@yurishkuro How should we handle this? It's on the write path. Should we add a warning for this?
There was a problem hiding this comment.
yes I would still add the warning, it's better than silently ignoring the error. another alternative is to log it, but I prefer storing it int the span.
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (86.95%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #7627 +/- ##
==========================================
- Coverage 96.59% 96.53% -0.07%
==========================================
Files 384 384
Lines 19404 19464 +60
==========================================
+ Hits 18744 18790 +46
- Misses 477 489 +12
- Partials 183 185 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Metrics Comparison SummaryTotal changes across all snapshots: 53 Detailed changes per snapshotsummary_metrics_snapshot_cassandra📊 Metrics Diff SummaryTotal Changes: 53
🆕 Added Metrics
View diff sample+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="+Inf",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="0",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="100",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="1000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="25",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
...View diff sample+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="+Inf",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.005",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.01",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.025",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.05",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.075",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_request_duration_seconds{http_request_method="GET",http_response_status_code="503",le="0.1",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
...View diff sample+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="+Inf",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="0",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="100",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="1000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="10000",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
+http_server_response_body_size_bytes{http_request_method="GET",http_response_status_code="503",le="25",network_protocol_name="http",network_protocol_version="1.1",otel_scope_name="go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",otel_scope_schema_url="",otel_scope_version="0.63.0",server_address="localhost",server_port="13133",url_scheme="http"}
... |
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
| case pcommon.ValueTypeSlice: | ||
| key := "@slice@" + k | ||
| m := &xpdata.JSONMarshaler{} | ||
| b, err := m.MarshalValue(v) | ||
| if err != nil { | ||
| break | ||
| } | ||
| encoded := base64.StdEncoding.EncodeToString(b) | ||
| out.ComplexKeys = append(out.ComplexKeys, key) | ||
| out.ComplexValues = append(out.ComplexValues, encoded) |
There was a problem hiding this comment.
Silent data loss on marshal failure. When m.MarshalValue(v) returns an error, the code uses break which only exits the switch statement but continues the Range loop. This causes the slice attribute to be silently dropped with no error logging or warning added to the span.
Impact: Attributes will be lost in production without any indication to users or operators.
Fix: Add error handling similar to the from.go pattern:
case pcommon.ValueTypeSlice:
key := "@slice@" + k
m := &xpdata.JSONMarshaler{}
b, err := m.MarshalValue(v)
if err != nil {
// Log error or add warning to span
// For now, skip this attribute
return true
}
encoded := base64.StdEncoding.EncodeToString(b)
out.ComplexKeys = append(out.ComplexKeys, key)
out.ComplexValues = append(out.ComplexValues, encoded)Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
| case pcommon.ValueTypeMap: | ||
| key := "@map@" + k | ||
| m := &xpdata.JSONMarshaler{} | ||
| b, err := m.MarshalValue(v) | ||
| if err != nil { | ||
| break | ||
| } | ||
| encoded := base64.StdEncoding.EncodeToString(b) | ||
| out.ComplexKeys = append(out.ComplexKeys, key) | ||
| out.ComplexValues = append(out.ComplexValues, encoded) |
There was a problem hiding this comment.
Silent data loss on marshal failure. When m.MarshalValue(v) returns an error for map attributes, the code uses break which only exits the switch statement but continues the Range loop. This causes the map attribute to be silently dropped with no error logging or warning.
Impact: Map attributes will be lost in production without any indication to users or operators.
Fix: Add error handling similar to the from.go pattern:
case pcommon.ValueTypeMap:
key := "@map@" + k
m := &xpdata.JSONMarshaler{}
b, err := m.MarshalValue(v)
if err != nil {
// Log error or add warning to span
// For now, skip this attribute
return true
}
encoded := base64.StdEncoding.EncodeToString(b)
out.ComplexKeys = append(out.ComplexKeys, key)
out.ComplexValues = append(out.ComplexValues, encoded)| case pcommon.ValueTypeMap: | |
| key := "@map@" + k | |
| m := &xpdata.JSONMarshaler{} | |
| b, err := m.MarshalValue(v) | |
| if err != nil { | |
| break | |
| } | |
| encoded := base64.StdEncoding.EncodeToString(b) | |
| out.ComplexKeys = append(out.ComplexKeys, key) | |
| out.ComplexValues = append(out.ComplexValues, encoded) | |
| case pcommon.ValueTypeMap: | |
| key := "@map@" + k | |
| m := &xpdata.JSONMarshaler{} | |
| b, err := m.MarshalValue(v) | |
| if err != nil { | |
| // Log error or add warning to span | |
| // For now, skip this attribute | |
| return true | |
| } | |
| encoded := base64.StdEncoding.EncodeToString(b) | |
| out.ComplexKeys = append(out.ComplexKeys, key) | |
| out.ComplexValues = append(out.ComplexValues, encoded) |
Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
| vm := pcommon.NewValueMap() | ||
| vm.Map().PutStr("key", "value") | ||
| m := &xpdata.JSONMarshaler{} | ||
| buf, err := m.MarshalValue(vm) | ||
| require.NoError(t, err) | ||
| encodedMap := base64.StdEncoding.EncodeToString(buf) | ||
|
|
||
| vs := pcommon.NewValueSlice() | ||
| vs.Slice().AppendEmpty().SetInt(1) | ||
| vs.Slice().AppendEmpty().SetInt(2) | ||
| vs.Slice().AppendEmpty().SetInt(3) | ||
| buf, err = m.MarshalValue(vs) | ||
| require.NoError(t, err) | ||
| encodedSlice := base64.StdEncoding.EncodeToString(buf) |
There was a problem hiding this comment.
@yurishkuro would it better to test using the raw json string here?
| } | ||
| k := strings.TrimPrefix(storedAttrs.ComplexKeys[i], "@bytes@") | ||
| attrs.PutEmptyBytes(k).FromRaw(decoded) | ||
| case strings.HasPrefix(storedAttrs.ComplexKeys[i], "@slice@"): |
There was a problem hiding this comment.
Aside from prefix the code is identical to next case:, can move to shared helper
There was a problem hiding this comment.
@yurishkuro The slice and map are added differently
attrs.PutEmptySlice(k).FromRaw(val.Slice().AsRaw())vs.
attrs.PutEmptyMap(k).FromRaw(val.Map().AsRaw())The warning messages are slightly different as well.
| // | ||
| // - pcommon.ValueTypeSlice: | ||
| // Represents an OTLP slice (array). The value is first serialized to JSON, then | ||
| // Base64-encoded before storage. Keys for this type are prefixed with `@slice@`. |
There was a problem hiding this comment.
Why do we need to encode json as base64? I understand we need it for bytes as they may be unprintable, but json can be stored directly as string.
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>
jaegertracing#7627) ## Which problem is this PR solving? - Towards jaegertracing#7134 and jaegertracing#7135 ## Description of the changes - This PR adds support for map and slice attributes in ClickHouse storage. It does so by leveraging the new API added in open-telemetry/opentelemetry-collector#13945 to Marshal the attributes to JSON and then storing them as Base64 encoded strings ## How was this change tested? - CI / unit tests ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `npm run lint` and `npm run test` --------- Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com> Signed-off-by: SoumyaRaikwar <somuraik@gmail.com>
Which problem is this PR solving?
Description of the changes
How was this change tested?
Checklist
jaeger:make lint testjaeger-ui:npm run lintandnpm run test