Improve performance of ListResources #23534
Merged
rosstimothy merged 6 commits into master on Mar 24, 2023
fspmarshall (Contributor) reviewed on Mar 23, 2023 and left a comment:
nit: the old `CombineLabels` strategy performed the combination such that command labels took precedence over static labels (i.e. if a command label and a static label exist for the same key, only the command label would be observed). As implemented, these `GetLabel` methods give precedence to static labels. I actually prefer this strategy, but it would technically be a breaking change in RBAC behavior, so it's probably best to change these methods to match the old precedence.
64cf613 to ec0860c
strideynet (Contributor) approved these changes on Mar 24, 2023 and left a comment:
Great work, awesome to see benchmarks being used to ensure improvements are worthwhile.
fspmarshall approved these changes on Mar 24, 2023
fc4dc2b to 623322a
`BenchmarkListNodes` is twice as slow when RBAC logging is enabled. Switching RBAC logging from debug to trace eliminates the performance hit while still letting users opt in to the behavior when they need to debug RBAC.
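A rough sketch of why moving a message to a more verbose level helps: when the level check fails, the message is never formatted. The `logger` type, the level constants, and `checkAccess` below are hypothetical and only illustrate the idea. They are not the logging library or RBAC code used by the project.

```go
package main

import "fmt"

type level int

const (
	levelInfo level = iota
	levelDebug
	levelTrace
)

// logger is a hypothetical leveled logger that records lines in memory.
type logger struct {
	level level
	lines []string
}

// tracef formats the message only when trace logging is enabled, so the
// formatting cost is skipped at the info and debug levels.
func (l *logger) tracef(format string, args ...interface{}) {
	if l.level < levelTrace {
		return
	}
	l.lines = append(l.lines, fmt.Sprintf(format, args...))
}

// checkAccess stands in for an RBAC check that explains its decision.
func checkAccess(l *logger, role, resource string, allowed bool) bool {
	l.tracef("role %q access to %q: allowed=%t", role, resource, allowed)
	return allowed
}

func main() {
	debug := &logger{level: levelDebug}
	trace := &logger{level: levelTrace}
	checkAccess(debug, "dev", "node-1", true)
	checkAccess(trace, "dev", "node-1", true)
	fmt.Println(len(debug.lines), len(trace.lines))
}
```

With this arrangement a deployment running at debug level pays only for the level comparison, while trace level still produces the full explanation.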
Profiles of the benchmark test revealed that the `regexp.Compile` call within `utils.matchString` was the most CPU- and memory-intensive portion of the tests. By leveraging an `lru.Cache` to intern the compiled regular expressions we get quite a performance improvement.
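The interning idea can be sketched with the standard library alone. The `regexCache` type below is a hypothetical, hand-rolled LRU built on `container/list`; it stands in for the `lru.Cache` mentioned above and is not the code from this change. The `compiles` counter exists only to make cache hits visible.

```go
package main

import (
	"container/list"
	"fmt"
	"regexp"
	"sync"
)

// regexCache is a small, hypothetical LRU cache of compiled expressions.
type regexCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	entries  map[string]*list.Element // expression -> element in order
	compiles int                      // successful regexp.Compile calls
}

type cacheEntry struct {
	expr string
	re   *regexp.Regexp
}

func newRegexCache(capacity int) *regexCache {
	return &regexCache{
		capacity: capacity,
		order:    list.New(),
		entries:  make(map[string]*list.Element),
	}
}

// get returns the compiled form of expr, compiling it only on a cache miss.
func (c *regexCache) get(expr string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if el, ok := c.entries[expr]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*cacheEntry).re, nil
	}

	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	c.compiles++
	c.entries[expr] = c.order.PushFront(&cacheEntry{expr: expr, re: re})

	// Evict the least recently used entry once over capacity.
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.entries, oldest.Value.(*cacheEntry).expr)
	}
	return re, nil
}

func main() {
	cache := newRegexCache(2)
	for i := 0; i < 3; i++ {
		re, err := cache.get(`^node-\d+$`)
		if err != nil {
			panic(err)
		}
		fmt.Println(re.MatchString("node-42"))
	}
	fmt.Println("compiles:", cache.compiles)
}
```

A bounded cache is a reasonable fit here because role definitions tend to reuse a small set of expressions, while an unbounded map could grow with user-supplied patterns.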
This increases the request limit prior to loading the resources from the cache so that we load enough items in a single page to determine the start key of the next page.
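One common way to realize this is to read one item more than the caller asked for and use the extra item as the next start key. The sketch below assumes that reading (limit plus one) and uses a hypothetical `listPage` helper over a sorted slice of keys in place of the real cache.

```go
package main

import "fmt"

// listPage returns up to limit keys starting at startKey, plus the key at
// which the next page begins ("" when there are no more items).
// sortedKeys stands in for a cache that is iterated in key order.
func listPage(sortedKeys []string, startKey string, limit int) (page []string, nextKey string) {
	// Ask for one extra item so a single read reveals whether a next page
	// exists and where it starts.
	fetch := limit + 1
	for _, k := range sortedKeys {
		if k < startKey {
			continue
		}
		page = append(page, k)
		if len(page) == fetch {
			break
		}
	}
	if len(page) > limit {
		nextKey = page[limit]
		page = page[:limit]
	}
	return page, nextKey
}

func main() {
	keys := []string{"a", "b", "c", "d", "e"}
	start := ""
	for {
		page, next := listPage(keys, start, 2)
		fmt.Println(page, "next:", next)
		if next == "" {
			break
		}
		start = next
	}
}
```

The extra item is never returned to the caller, so the page size the client sees is unchanged.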
Unmarshal directly into a `types.ServerV2` instead of first creating a `types.ResourceHeader` to inspect the version. There is only a single version of `types.ServerV2`, making the check unnecessary.
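The saving comes from decoding the payload once instead of twice. The sketch below uses simplified, hypothetical `resourceHeader` and `serverV2` structs with `encoding/json`; the real types and the real `services.UnmarshalServer` differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resourceHeader and serverV2 are simplified, hypothetical stand-ins.
type resourceHeader struct {
	Kind    string `json:"kind"`
	Version string `json:"version"`
}

type serverV2 struct {
	Kind     string `json:"kind"`
	Version  string `json:"version"`
	Metadata struct {
		Name string `json:"name"`
	} `json:"metadata"`
}

// unmarshalWithVersionCheck decodes the payload twice: once to read the
// version and once more into the concrete type.
func unmarshalWithVersionCheck(data []byte) (*serverV2, error) {
	var h resourceHeader
	if err := json.Unmarshal(data, &h); err != nil {
		return nil, err
	}
	if h.Version != "v2" {
		return nil, fmt.Errorf("unsupported server version %q", h.Version)
	}
	var s serverV2
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}

// unmarshalDirect decodes once, which is sufficient when only one version
// of the type exists.
func unmarshalDirect(data []byte) (*serverV2, error) {
	var s serverV2
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}

func main() {
	data := []byte(`{"kind":"node","version":"v2","metadata":{"name":"node-1"}}`)
	a, errA := unmarshalWithVersionCheck(data)
	b, errB := unmarshalDirect(data)
	if errA != nil || errB != nil {
		panic("unexpected decode error")
	}
	fmt.Println(a.Metadata.Name == b.Metadata.Name)
}
```

The trade-off is that the direct path no longer rejects an unknown version string, which is acceptable only while a single version exists.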
`GetAllLabels` can be overkill if one simply needs to look up the value for a particular label. It creates a new `map[string]string` and copies all of a resource's existing labels. RBAC decisions driven by labels incurred the penalty of the copy each time access was checked. The impact of the copy is much more noticeable when a resource has several labels or really long strings in the key or value. By leveraging `GetLabel`, RBAC can avoid copying the labels altogether and simply look up each label key when required.
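The difference between the two access patterns can be illustrated with a hypothetical `node` type and two matchers. Exact string comparison is assumed here for brevity; real label matching also supports wildcards and regular expressions, which this sketch omits.

```go
package main

import "fmt"

// node is a hypothetical labeled resource.
type node struct {
	labels map[string]string
}

// GetAllLabels returns a copy so callers cannot mutate the resource.
func (n *node) GetAllLabels() map[string]string {
	out := make(map[string]string, len(n.labels))
	for k, v := range n.labels {
		out[k] = v
	}
	return out
}

// GetLabel reads a single key without allocating.
func (n *node) GetLabel(key string) (string, bool) {
	v, ok := n.labels[key]
	return v, ok
}

// matchByCopy copies every label on each call, even when the selector
// names only one key.
func matchByCopy(selector map[string]string, n *node) bool {
	all := n.GetAllLabels()
	for k, want := range selector {
		if got, ok := all[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// matchByLookup touches only the keys the selector names.
func matchByLookup(selector map[string]string, n *node) bool {
	for k, want := range selector {
		if got, ok := n.GetLabel(k); !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	n := &node{labels: map[string]string{"env": "prod", "team": "db", "region": "us-east-1"}}
	selector := map[string]string{"env": "prod"}
	fmt.Println(matchByCopy(selector, n), matchByLookup(selector, n))
}
```

Both matchers return the same answer; the lookup version simply does work proportional to the selector rather than to the resource's label count.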
623322a to 7f74697
This was referenced Mar 24, 2023
rosstimothy added a commit that referenced this pull request on Mar 25, 2023

* Add benchmark for ListNodes

* Move RBAC logging to trace level

  `BenchmarkListNodes` is twice as slow when RBAC logging is enabled. By switching RBAC logging from debug to trace we can eliminate the performance hit while still providing a way for users to opt in to the behavior if they need to debug RBAC.

* Intern compiled regular expressions

  Profiles of the benchmark test revealed that the `regexp.Compile` call within `utils.matchString` was the most CPU- and memory-intensive portion of the tests. By leveraging an `lru.Cache` to intern the compiled regular expressions we get quite a performance improvement.

* Only fetch a single page of resources

  Increases the request limit prior to loading the resources from the cache so that we load enough items in a single page to determine the start key of the next page.

* Remove version checking from `services.UnmarshalServer`

  Unmarshal directly into a `types.ServerV2` instead of first creating a `types.ResourceHeader` to inspect the version. There is only a single version of `types.ServerV2`, making the check unnecessary.

* Add `GetLabel` to `types.ResourceWithLabels`

  `GetAllLabels` can be overkill if one simply needs to look up the value for a particular label. It creates a new `map[string]string` and copies all of a resource's existing labels. RBAC decisions driven by labels incurred the penalty of the copy each time access was checked. The impact of the copy is much more noticeable when a resource has several labels or really long strings in the key or value. By leveraging `GetLabel`, RBAC can avoid copying the labels altogether and simply look up each label key when required.
rosstimothy added a commit that referenced this pull request on Mar 25, 2023
rosstimothy added a commit that referenced this pull request on Mar 25, 2023
r0mant pushed a commit that referenced this pull request on Mar 28, 2023
nklaassen pushed a commit that referenced this pull request on Mar 28, 2023
espadolini pushed a commit that referenced this pull request on Mar 28, 2023
rosstimothy added a commit that referenced this pull request on Feb 5, 2025
rosstimothy added a commit that referenced this pull request on Feb 5, 2025

The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by `trace.NotFound`/`trace.Wrap` when there are no matches.

This was heavily inspired by #23534 which made similar changes to improve the performance of `utils.MatchString`.

The main beneficiaries of this change are `services.MapRoles` and `services.TraitsToRoles` which rely heavily on `utils.ReplaceRegexp`, `utils.RegexpWithConfig`, or `ReplaceRegexpWith`.

`BenchmarkReplaceRegexp` was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change, as reported by `benchstat old.txt new.txt`:

goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro

| Benchmark | old.txt sec/op | new.txt sec/op | vs base |
|---|---|---|---|
| ReplaceRegexp/same_expression-12 | 29.527µ ± 12% | 6.837µ ± 1% | -76.84% (p=0.000 n=10) |
| ReplaceRegexp/unique_expressions-12 | 22.94µ ± 1% | 25.10µ ± 1% | +9.41% (p=0.002 n=10) |
| ReplaceRegexp/no_matches-12 | 24.692µ ± 3% | 2.861µ ± 1% | -88.41% (p=0.000 n=10) |
| geomean | 25.57µ | 7.889µ | -69.15% |

| Benchmark | old.txt B/op | new.txt B/op | vs base |
|---|---|---|---|
| ReplaceRegexp/same_expression-12 | 22071.5 ± 1% | 164.0 ± 0% | -99.26% (p=0.000 n=10) |
| ReplaceRegexp/unique_expressions-12 | 12.90Ki ± 1% | 12.87Ki ± 0% | ~ (p=0.165 n=10) |
| ReplaceRegexp/no_matches-12 | 14257.00 ± 1% | 15.00 ± 0% | -99.89% (p=0.000 n=10) |
| geomean | 15.70Ki | 318.8 | -98.02% |

| Benchmark | old.txt allocs/op | new.txt allocs/op | vs base |
|---|---|---|---|
| ReplaceRegexp/same_expression-12 | 58.000 ± 0% | 6.000 ± 0% | -89.66% (p=0.000 n=10) |
| ReplaceRegexp/unique_expressions-12 | 55.00 ± 0% | 55.00 ± 0% | ~ (p=1.000 n=10) ¹ |
| ReplaceRegexp/no_matches-12 | 58.00 ± 0% | 0.00 ± 0% | -100.00% (p=0.000 n=10) |
| geomean | 56.98 | ? ² ³ | |

¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
rosstimothy
added a commit
that referenced
this pull request
Feb 5, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
rosstimothy
added a commit
that referenced
this pull request
Feb 5, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
rosstimothy
added a commit
that referenced
this pull request
Feb 6, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Feb 6, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
github-actions Bot
pushed a commit
that referenced
this pull request
Feb 6, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
rosstimothy
added a commit
that referenced
this pull request
Feb 6, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
rosstimothy
added a commit
that referenced
this pull request
Feb 6, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
rosstimothy
added a commit
that referenced
this pull request
Feb 7, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Feb 7, 2025
The main changes here are: - Using an LRU cache to store compiled regular expressions. - Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches. This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString. The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith. BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change: benchstat old.txt new.txt goos: darwin goarch: arm64 pkg: github.com/gravitational/teleport/lib/utils cpu: Apple M2 Pro │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ ReplaceRegexp/same_expression-12 29.527µ ± 12% 6.837µ ± 1% -76.84% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 22.94µ ± 1% 25.10µ ± 1% +9.41% (p=0.002 n=10) ReplaceRegexp/no_matches-12 24.692µ ± 3% 2.861µ ± 1% -88.41% (p=0.000 n=10) geomean 25.57µ 7.889µ -69.15% │ old.txt │ new.txt │ │ B/op │ B/op vs base │ ReplaceRegexp/same_expression-12 22071.5 ± 1% 164.0 ± 0% -99.26% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 12.90Ki ± 1% 12.87Ki ± 0% ~ (p=0.165 n=10) ReplaceRegexp/no_matches-12 14257.00 ± 1% 15.00 ± 0% -99.89% (p=0.000 n=10) geomean 15.70Ki 318.8 -98.02% │ old.txt │ new.txt │ │ allocs/op │ allocs/op vs base │ ReplaceRegexp/same_expression-12 58.000 ± 0% 6.000 ± 0% -89.66% (p=0.000 n=10) ReplaceRegexp/unique_expressions-12 55.00 ± 0% 55.00 ± 0% ~ (p=1.000 n=10) ¹ ReplaceRegexp/no_matches-12 58.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10) geomean 56.98 ? ² ³ ¹ all samples are equal ² summaries must be >0 to compute geomean ³ ratios must be >0 to compute geomean
carloscastrojumo pushed a commit to carloscastrojumo/teleport that referenced this pull request Feb 19, 2025
Reduces latency of `proto.AuthService/ListResources` in a variety of ways:

- Stores compiled `*regexp.Regexp` in an LRU cache so that they can be reused during RBAC
- Adds `GetLabel(key string) (value string, ok bool)` to `types.ResourcesWithLabels` to prevent copying the entire label set when we just need to look up keys
- […] `auth.ServerWithRoles.ListResources`
- Updates `services.UnmarshalServer` to unmarshal directly into a `types.ServerV2` instead of first into a `types.ResourceHeader` to check that the version is `types.V2`

Comparison of `BenchmarkListNodes` from b1715a5 to ec0860c: