Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add a metric ingestion time SM sanitization #15222

Merged
merged 2 commits into from
Dec 4, 2024

Conversation

cstyan
Copy link
Contributor

@cstyan cstyan commented Dec 3, 2024

Adding a per-tenant metric for when a tenant has structured metadata sanitized at ingestion time. This would be less spammy than a log line, and while the tenant label could be high cardinality it's unlikely to be as the % of users using SM and sending SM that has invalid characters is likely low.

Some of the discussion around #15113 has been that the query time sanitization of the SM name and value (value would be sanitized in that PR, name sanitization is already in place) is that query time sanitization is expensive, and we'd like to move that functionality to be on a per-tenant basis.

The goal here is to inform us which users have been sending invalid SM so that we modify the query time SM sanitization code blocks so that they're only run on a per-tenant basis based on per-tenant overrides. With this metric in place we would be able to have a good idea ahead of time of which users have been sending invalid SM before we make the swap over so that we can set the per-tenant config in advance of the rollout.

@cstyan cstyan requested a review from a team as a code owner December 3, 2024 04:12
normalized = otlptranslate.NormalizeLabel(structuredMetadata[i].Name)
if normalized != structuredMetadata[i].Name {
structuredMetadata[i].Name = normalized
d.tenantPushSanitizedStructuredMetadata.WithLabelValues(tenantID).Inc()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also add this for value sanitisation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, this just got missed in my git add --patch and then I saw the test failing as you posted your comment, pushed the value inc() call

@cstyan cstyan force-pushed the callum-sm-sanitize-metric branch from 56f2cf7 to 91cabf9 Compare December 3, 2024 04:19
Copy link
Contributor

@ashwanthgoli ashwanthgoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

pkg/distributor/distributor_test.go Outdated Show resolved Hide resolved
Signed-off-by: Callum Styan <[email protected]>
@pull-request-size pull-request-size bot added size/M and removed size/S labels Dec 3, 2024
Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Not sure we need a metrics for that but why not.

@cstyan cstyan merged commit e9d0c3e into main Dec 4, 2024
60 checks passed
@cstyan cstyan deleted the callum-sm-sanitize-metric branch December 4, 2024 03:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants