feat: sanitize structured metadata during ingestion in the distributor #15141
Signed-off-by: Callum Styan <[email protected]>
Some better benchmark results:
To break things down a bit more: first, compare the current code on a request that has structured metadata (SM) against the new code with the checks, where the request has SM but nothing in it is invalid.
Next, no checks with SM vs. the various cases of checks + invalid SM.
Signed-off-by: Callum Styan <[email protected]>
we end up adding the detected log level on every iteration to the one original write request and end up with N-1 detected log level labels in the SM
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
LGTM
Did you find out why we have multiple detected_level labels?
I did; it was just related to reuse of the same write request. I modified the benchmark to create a new write request for each iteration.
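The effect described here can be reproduced with a minimal sketch. The types and functions below (label, entry, writeRequest, push, runReused, runFresh) are hypothetical stand-ins written for illustration, not Loki's actual types or the PR's benchmark code; the only assumption taken from the discussion is that the push path appends a detected_level label to the request it is given.

```go
package main

import "fmt"

// label is a hypothetical stand-in for one structured metadata pair.
type label struct {
	Name, Value string
}

// entry is a hypothetical stand-in for a log entry in a write request.
type entry struct {
	Line               string
	StructuredMetadata []label
}

// writeRequest is a hypothetical stand-in for the request passed to Push.
type writeRequest struct {
	Entries []entry
}

// newRequest builds a fresh request with a single entry and no metadata.
func newRequest() *writeRequest {
	return &writeRequest{Entries: []entry{{Line: "hello"}}}
}

// push mimics a push path that mutates the request in place by
// appending a detected_level label to every entry.
func push(req *writeRequest) {
	for i := range req.Entries {
		req.Entries[i].StructuredMetadata = append(
			req.Entries[i].StructuredMetadata,
			label{Name: "detected_level", Value: "unknown"},
		)
	}
}

// runReused pushes the same request n times, as a benchmark loop does
// when the request is built once outside the loop.
func runReused(n int) int {
	req := newRequest()
	for i := 0; i < n; i++ {
		push(req)
	}
	return len(req.Entries[0].StructuredMetadata)
}

// runFresh builds a new request on every iteration and reports the
// label count of the last one.
func runFresh(n int) int {
	req := newRequest()
	for i := 0; i < n; i++ {
		req = newRequest()
		push(req)
	}
	return len(req.Entries[0].StructuredMetadata)
}

func main() {
	fmt.Println("reused:", runReused(4)) // labels accumulate across iterations
	fmt.Println("fresh:", runFresh(4))   // one label per request
}
```

Building the request inside the loop adds allocation cost to each iteration, so in a real Go benchmark that setup would typically be excluded from the measurement, for example with b.StopTimer and b.StartTimer.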
Still working here, want to see if I can get the benchmark results to be any better.
The benchmark itself needs some love as well: if you use it with -count you can get a segmentation fault/nil pointer dereference panic in the mock ingester code.
Just the addition of the two checks slows down distributor.Push significantly, and it also looks to me like when the log line doesn't contain a log field for otlp, somehow we might be adding the unknown value multiple times?
[{detected_level unknown} {detected_level unknown} {detected_level unknown} {detected_level unknown}]
Very basic benchmark results so far:
f65ab130725dc25c9d546fa4d5fb1e4a6d26009e
makeWriteRequestWithLabels to add structured metadata
distributor.Push to check and sanitize the label value
Note that this is the worst-case scenario, since every entry in the write request has structured metadata that needs to be checked.
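For readers who want a picture of what a check-and-sanitize pass over structured metadata could look like, here is a rough sketch. It is an assumption-laden illustration, not the code in this PR: the label type, validName, sanitizeName, and sanitizeLabels are hypothetical, and the two rules used (label names restricted to [a-zA-Z_][a-zA-Z0-9_]*, label values required to be valid UTF-8) are assumed for the example. Only strings.ToValidUTF8 and utf8.ValidString are real standard library calls.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// label is a hypothetical stand-in for one structured metadata pair.
type label struct {
	Name, Value string
}

// validName reports whether name matches [a-zA-Z_][a-zA-Z0-9_]*.
// Assumption: this is the naming rule being enforced. Empty names are
// treated as valid here and left untouched.
func validName(name string) bool {
	for i, r := range name {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r == '_':
		case r >= '0' && r <= '9' && i > 0:
		default:
			return false
		}
	}
	return true
}

// sanitizeName replaces disallowed characters with '_' and prefixes a
// leading digit with '_'.
func sanitizeName(name string) string {
	var b strings.Builder
	for i, r := range name {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r == '_':
			b.WriteRune(r)
		case r >= '0' && r <= '9':
			if i == 0 {
				b.WriteByte('_')
			}
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

// sanitizeLabels fixes names and values in place and reports whether
// anything changed. The cheap validity checks run first so that
// metadata with nothing invalid avoids any allocation.
func sanitizeLabels(labels []label) bool {
	changed := false
	for i := range labels {
		if !validName(labels[i].Name) {
			labels[i].Name = sanitizeName(labels[i].Name)
			changed = true
		}
		if !utf8.ValidString(labels[i].Value) {
			labels[i].Value = strings.ToValidUTF8(labels[i].Value, "\uFFFD")
			changed = true
		}
	}
	return changed
}

func main() {
	labels := []label{
		{Name: "trace.id", Value: "abc"},
		{Name: "ok_name", Value: "bad\xffbyte"},
	}
	fmt.Println(sanitizeLabels(labels), labels)
}
```

Checking validity before rewriting keeps the common case (metadata present, nothing invalid) down to a scan per label, which is the case the benchmark comparison above separates from the invalid-metadata cases.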