Conversation
Force-pushed from 37e0277 to 981e1b3.
Implementation of a new API Security sampler that aims at providing
better information: the sampling decision is made per endpoint (method
plus route) and response status code, instead of being made globally
across all endpoints.
This implementation is based on a ccache-style bucketed LRU cache that
allows distributing lock contention over several buckets, 16 by default,
which can optionally be scaled to achieve better concurrency.
Here is the result of running the benchmark on my machine with
`1,000,000,000` operations distributed over `concurrency * 10` goroutines
for each configuration:
```
goos: darwin
goarch: arm64
pkg: github.com/DataDog/appsec-internal-go/apisec
cpu: Apple M1 Max
│ 1 │ 1000 │
│ sec/op │ sec/op vs base │
Sampler/keySpaceSize=2048/interval=1s-10 79.74n ± 3% 94.39n ± 0% +18.37% (p=0.000 n=10)
Sampler/keySpaceSize=8192/interval=1s-10 88.58n ± 1% 94.81n ± 0% +7.03% (p=0.000 n=10)
geomean 84.04n 94.60n +12.56%
│ 1 │ 1000 │
│ keep% │ keep% vs base │
Sampler/keySpaceSize=2048/interval=1s-10 16.38m ± 3% 19.41m ± 1% +18.50% (p=0.000 n=10)
Sampler/keySpaceSize=8192/interval=1s-10 780.60m ± 6% 73.97m ± 1% -90.52% (p=0.000 n=10)
geomean 113.1m 37.89m -66.49%
│ 1 │ 1000 │
│ B/op │ B/op vs base │
Sampler/keySpaceSize=2048/interval=1s-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Sampler/keySpaceSize=8192/interval=1s-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
geomean ² +0.00% ²
¹ all samples are equal
² summaries must be >0 to compute geomean
│ 1 │ 1000 │
│ allocs/op │ allocs/op vs base │
Sampler/keySpaceSize=2048/interval=1s-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Sampler/keySpaceSize=8192/interval=1s-10 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
geomean ² +0.00% ²
¹ all samples are equal
² summaries must be >0 to compute geomean
```
Force-pushed from 981e1b3 to 48f9926.
Force-pushed from 6354163 to 64eeac0.
Implementation of the new API Security sampler defined by the [latest RFC](https://docs.google.com/document/d/1PYoHms9PPXR8V_5_T5-KXAhoFDKQYA8mTnmS12xkGOE/edit?tab=t.0). This change makes the API Security sampler make decisions specific to a given endpoint (method + route + response status code) instead of using a simplistic sampling rate. This allows for improved coverage and accuracy of schema extraction as part of API Security. This change uses the sampler implementation from github.com/DataDog/appsec-internal-go#39.
```go
// We're already holding the maximum number of items, so we will rebuild
// in order to perform an eviction pass. Updates made in the meantime will
// be lost.
go m.rebuild(table, threshold)
```
are we sure about this unbounded goroutine creation?
This can only happen if `m.rebuilding.CompareAndSwap(false, true)` swaps... so there can be only one of these at a time.
```go
	})
}

func BenchmarkSampler(b *testing.B) {
```
can we add it to the benchmarking platform?
We could... Would have to check how it's done...
IIRC, you just need to add the benchmark name in the config file at the root of the repo (gitlab file IIRC)
In dd-trace-go, yes... but in other repos we have to create a whole new pipeline in the benchmarking-platform repository, full of shell scripts...
If you really feel like it, there is an attempt at doing this for go-libddwaf as a branch in there.
```go
for {
	var exists bool
	entry, exists = table.FindEntry(key)
	if exists {
		// The entry already exists, so we can proceed...
		break
	}

	// We're adding a new entry to the table, so we need to:
	// 1. Ensure we have capacity (possibly trigger an eviction rebuild)
	// 2. Claim the slot (or look for another slot if it's already claimed)
	newCount := table.count.Add(1)
	if newCount > config.MaxItemCount && m.rebuilding.CompareAndSwap(false, true) {
		// We're already holding the maximum number of items, so we will rebuild
		// in order to perform an eviction pass. Updates made in the meantime will
		// be lost.
		go m.rebuild(table, threshold)
	}
	if newCount > capacity {
		// We don't have space to add any new item, so we'll ignore this and
		// decide to DROP it (we may otherwise cause a surge of unconditional
		// keep decisions, which is not desirable). This only happens in the most
		// dire of circumstances (a table rebuild did not complete fast enough
		// to free up space).
		table.count.Add(-1)
		return false
	}

	if entry.Key.CompareAndSwap(0, key) {
		// We have successfully claimed the slot, so now we can proceed to set it
		// up. If we fail to swap, another goroutine has sampled this slot just
		// before this one, so we can DROP the sample.
		return entry.Data.CompareAndSwap(0, newEntryData(now, now))
	}

	if entry.Key.Load() == key {
		// Another goroutine has already used this slot for this key, and very
		// little time has passed since then, so we can DROP this sample... This
		// is extremely unlikely to happen (and nearly impossible to reliably
		// cover in unit tests).
		return false
	}

	// Another goroutine has already used this slot for another key... We
	// will try to find another slot then...
	table.count.Add(-1)
}
```
This for loop can definitely be made into a "LoadOrStore" sub-function or equivalent.
I tried twice and it made things worse from a "following what's going on" standpoint...
Co-authored-by: Eliott Bouhana <47679741+eliottness@users.noreply.github.com>
Force-pushed from 392cd58 to 9ff0f53.
Implementation of the new API Security sampler defined by the [latest RFC](https://docs.google.com/document/d/1PYoHms9PPXR8V_5_T5-KXAhoFDKQYA8mTnmS12xkGOE/edit?tab=t.0). This change makes the API Security sampler make decisions specific to a given endpoint (method + route + response status code) instead of using a simplistic sampling rate. This allows for improved coverage and accuracy of schema extraction as part of API Security. This change uses the sampler implementation from DataDog/appsec-internal-go#39.
Implementation of a new API Security sampler that aims at providing
better information: the sampling decision is made per endpoint (method
plus route) and response status code, instead of being made globally
across all endpoints.
This implementation is based on an open-addressing hash table using
copy-update-replace semantics, so that it can operate completely
lock-free. This is described by RFC-1021.
Here is the result of running the benchmark on my machine with
`1,000,000,000` operations distributed over `concurrency * 10` goroutines
for each configuration: