feat(loki.secretfilter): Add sampling for secretfilter entries #5663
Is this two PRs? One to refactor the unit tests, and a second to add the sampling feature?
Good point, thanks for flagging. I refactored the unit tests first because, as I explained in the description, they could be inconsistent, and adding the sampling first would have made the refactor more annoying.
kalleep left a comment
Nice. Since this shares a lot of similarities with `stage.sampling`, it would be useful if the sampling logic were moved to a shared library, WDYT?
To keep the PR small and localized, I will create a follow-up issue for this and address it after we merge this PR 👍
Co-authored-by: Clayton Cornell <131809008+clayton-cornell@users.noreply.github.com>
…ocess stage (#5778)

### Brief description of Pull Request

Based on a comment [here](#5663 (review)), this PR refactors the code used for rate-based sampling and organizes it into an internal shared library. This cuts down on code duplication and creates one source of truth for the logic. The library is currently used by `loki.secretfilter` and `loki.process stage.sampling`.

### Pull Request Details

Changes:
- `internal/sampling`: New package with a `Sampler` struct. Implements the same probabilistic sampling as before (Jaeger-style algorithm) using the stdlib only.
- `loki.secretfilter`: Replaces local sampling (boundary/source, `shouldProcessEntry`) with `*sampling.Sampler`; `Validate` uses `sampling.ValidateRate`.
- `loki.process stage.sampling`: Replaces local sampling and the `jaeger-client-go` dependency with `*sampling.Sampler`; config validation uses `sampling.ValidateRate`. Rate remains a required attribute (no default).
- Tests: `internal/sampling` has new unit tests.

### Notes to the Reviewer

### PR Checklist

- [x] Tests updated
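For readers unfamiliar with the boundary-based approach mentioned above, here is a minimal sketch of a rate-based sampler built on the standard library only. It is not the code in `internal/sampling`: only the names `Sampler` and `ValidateRate` come from the commit message, while `NewSampler`, `ShouldSample`, the `seed` parameter, and the field layout are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sync"
)

// maxRandom is the exclusive upper bound of values returned by rand.Int63.
const maxRandom = uint64(1) << 63

// ValidateRate returns an error unless rate lies in [0.0, 1.0].
func ValidateRate(rate float64) error {
	if math.IsNaN(rate) || rate < 0.0 || rate > 1.0 {
		return fmt.Errorf("sampling rate must be between 0.0 and 1.0, got %v", rate)
	}
	return nil
}

// Sampler keeps roughly `rate` of all decisions (hypothetical layout).
type Sampler struct {
	mu       sync.Mutex
	rng      *rand.Rand
	boundary uint64 // draws strictly below this value are sampled
}

// NewSampler is a hypothetical constructor; the seed is exposed only so that
// the sketch is reproducible.
func NewSampler(rate float64, seed int64) (*Sampler, error) {
	if err := ValidateRate(rate); err != nil {
		return nil, err
	}
	return &Sampler{
		rng:      rand.New(rand.NewSource(seed)),
		boundary: uint64(rate * float64(maxRandom)),
	}, nil
}

// ShouldSample draws a 63-bit random number and compares it to the boundary.
// With rate 1.0 every draw is below the boundary; with rate 0.0 none is.
func (s *Sampler) ShouldSample() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return uint64(s.rng.Int63()) < s.boundary
}
```

Precomputing the boundary once means each decision costs one random draw and one integer comparison, which is presumably why this style of sampler suits a per-log-entry hot path.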
Brief description of Pull Request
This PR introduces the argument `rate` for `loki.secretfilter` to sample which log entries are processed by the secret filter; non-sampled entries are forwarded unchanged.
Pull Request Details
- `rate` (optional, `float`, default `1.0`): Sampling rate in `[0.0, 1.0]`. Fraction of entries that are run through detection/redaction; the rest are passed through unmodified. `1.0` = process all (current behavior).
- `rate` must be in `[0.0, 1.0]`.
- […] `loki.process` sampling stage for consistency.
- `loki_secretfilter_entries_bypassed_total`: count of entries forwarded without processing due to sampling.
- […] `viper`), I've refactored the unit tests so that we can consistently test the cases with a custom config file versus the default config. This was a long tangent, but the gist is that multiple reloads of the config or extensions to the default config were causing inconsistencies in the tests due to gitleaks using global variables to track state.
PR Checklist
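The pass-through behavior described in the Pull Request Details above can be illustrated with a small sketch. Everything here is hypothetical: `entryFilter`, `handle`, and `redactToken` are illustrative names rather than the component's real API, and the `bypassed` field merely stands in for the `loki_secretfilter_entries_bypassed_total` counter.

```go
package main

import "strings"

// entryFilter is a hypothetical stand-in for the secret filter component.
type entryFilter struct {
	shouldProcess func() bool         // sampling decision for the next entry
	redact        func(string) string // detection/redaction step
	bypassed      int                 // stands in for loki_secretfilter_entries_bypassed_total
}

// handle runs sampled entries through redaction and forwards the rest
// unchanged, counting every entry that skips processing.
func (f *entryFilter) handle(line string) string {
	if !f.shouldProcess() {
		f.bypassed++
		return line
	}
	return f.redact(line)
}

// redactToken is a toy redactor that masks one hard-coded secret.
func redactToken(line string) string {
	return strings.ReplaceAll(line, "hunter2", "<REDACTED>")
}
```

With a decision function that always returns true (the equivalent of `rate = 1.0`), `handle` never increments `bypassed`, which matches the "process all (current behavior)" default.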