-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add a metric ingestion time SM sanitization #15222
Conversation
normalized = otlptranslate.NormalizeLabel(structuredMetadata[i].Name) | ||
if normalized != structuredMetadata[i].Name { | ||
structuredMetadata[i].Name = normalized | ||
d.tenantPushSanitizedStructuredMetadata.WithLabelValues(tenantID).Inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also add this for value sanitisation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, this just got missed in my git add --patch
and then I saw the test failing as you posted your comment, pushed the value inc() call
ingestion time. Signed-off-by: Callum Styan <[email protected]>
56f2cf7
to
91cabf9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Signed-off-by: Callum Styan <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Not sure we need a metrics for that but why not.
Adding a per-tenant metric for when a tenant has structured metadata sanitized at ingestion time. This would be less spammy than a log line, and while the
tenant
label could be high cardinality it's unlikely to be as the % of users using SM and sending SM that has invalid characters is likely low.Some of the discussion around #15113 has been that the query time sanitization of the SM name and value (value would be sanitized in that PR, name sanitization is already in place) is that query time sanitization is expensive, and we'd like to move that functionality to be on a per-tenant basis.
The goal here is to inform us which users have been sending invalid SM so that we modify the query time SM sanitization code blocks so that they're only run on a per-tenant basis based on per-tenant overrides. With this metric in place we would be able to have a good idea ahead of time of which users have been sending invalid SM before we make the swap over so that we can set the per-tenant config in advance of the rollout.