Skip to content

Improve performance of utils.ReplaceRegexp#51872

Merged
rosstimothy merged 1 commit intomasterfrom
tross/regex_replace_cpu
Feb 6, 2025
Merged

Improve performance of utils.ReplaceRegexp#51872
rosstimothy merged 1 commit intomasterfrom
tross/regex_replace_cpu

Conversation

@rosstimothy
Copy link
Copy Markdown
Contributor

@rosstimothy rosstimothy commented Feb 5, 2025

The main changes here are:

  • Using an LRU cache to store compiled regular expressions.
  • Removing stack traces captured by trace.NotFound/trace.Wrap when there are no matches.

This was heavily inspired by #23534 which made similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in this change and prevent regressions in the future. Results from before and after this change:

$ benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

Changelog: Reduce CPU consumption required to map roles between clusters and perform trait to role resolution.

@rosstimothy
Copy link
Copy Markdown
Contributor Author

Flame graph from a CPU profile that inspired this change:

image

@rosstimothy rosstimothy force-pushed the tross/regex_replace_cpu branch from 4cb540f to dd8a903 Compare February 5, 2025 16:39
Comment thread lib/utils/replace.go Outdated
@rosstimothy rosstimothy requested review from espadolini and removed request for tcsc February 6, 2025 12:16
Comment thread lib/utils/replace.go
Comment on lines 108 to 115
index := expr.FindStringIndex(input)
if index == nil {
return "", trace.NotFound("no match found")
// The returned error is intentionally not wrapped to avoid
// capturing stack traces. This method is used by authorization
// logic and the additional overhead of strack trace capturing
// is a performance bottleneck.
return "", ErrReplaceRegexNotFound
}
return expr.ReplaceAllString(input, replaceWith), nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this applying the regexp twice on the same input? Is there a way to use the regexp API to avoid that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing an obvious way, though the ReplaceRegexpWith and RegexpWithConfig APIs seem to come from a previous iteration of improving performance of trait resolution (#19097). I've not investigated fully but I wonder if they are even needed anymore?

Comment thread lib/utils/replace.go Outdated
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
@rosstimothy rosstimothy force-pushed the tross/regex_replace_cpu branch from 385973a to 338b107 Compare February 6, 2025 19:59
@rosstimothy rosstimothy added this pull request to the merge queue Feb 6, 2025
Merged via the queue into master with commit 591a085 Feb 6, 2025
@rosstimothy rosstimothy deleted the tross/regex_replace_cpu branch February 6, 2025 20:55
@public-teleport-github-review-bot
Copy link
Copy Markdown

@rosstimothy See the table below for backport results.

Branch Result
branch/v15 Failed
branch/v16 Failed
branch/v17 Create PR

rosstimothy added a commit that referenced this pull request Feb 6, 2025
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
rosstimothy added a commit that referenced this pull request Feb 6, 2025
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
rosstimothy added a commit that referenced this pull request Feb 7, 2025
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
github-merge-queue Bot pushed a commit that referenced this pull request Feb 7, 2025
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
github-merge-queue Bot pushed a commit that referenced this pull request Feb 7, 2025
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
carloscastrojumo pushed a commit to carloscastrojumo/teleport that referenced this pull request Feb 19, 2025
The main changes here are:

- Using an LRU cache to store compiled regular expressions.
- Removing stack traces captured by trace.NotFound/trace.Wrap
  when there are no matches.

This was heavily inspired by
gravitational#23534 which made
similar changes to improve the performance of utils.MatchString.

The main beneficiaries of this change are services.MapRoles and
services.TraitsToRoles which rely heavily on
utils.ReplaceRegexp, utils.RegexpWithConfig, or ReplaceRegexpWith.

BenchmarkReplaceRegexp was added to validate the improvements in
this change and prevent regressions in the future. Results from
before and after this change:

benchstat old.txt new.txt
goos: darwin
goarch: arm64
pkg: github.com/gravitational/teleport/lib/utils
cpu: Apple M2 Pro
                                    │    old.txt    │               new.txt               │
                                    │    sec/op     │   sec/op     vs base                │
ReplaceRegexp/same_expression-12      29.527µ ± 12%   6.837µ ± 1%  -76.84% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    22.94µ ±  1%   25.10µ ± 1%   +9.41% (p=0.002 n=10)
ReplaceRegexp/no_matches-12           24.692µ ±  3%   2.861µ ± 1%  -88.41% (p=0.000 n=10)
geomean                                25.57µ         7.889µ       -69.15%

                                    │    old.txt    │               new.txt                │
                                    │     B/op      │     B/op      vs base                │
ReplaceRegexp/same_expression-12       22071.5 ± 1%     164.0 ± 0%  -99.26% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    12.90Ki ± 1%   12.87Ki ± 0%        ~ (p=0.165 n=10)
ReplaceRegexp/no_matches-12           14257.00 ± 1%     15.00 ± 0%  -99.89% (p=0.000 n=10)
geomean                                15.70Ki          318.8       -98.02%

                                    │   old.txt   │                 new.txt                 │
                                    │  allocs/op  │ allocs/op   vs base                     │
ReplaceRegexp/same_expression-12      58.000 ± 0%   6.000 ± 0%   -89.66% (p=0.000 n=10)
ReplaceRegexp/unique_expressions-12    55.00 ± 0%   55.00 ± 0%         ~ (p=1.000 n=10) ¹
ReplaceRegexp/no_matches-12            58.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
geomean                                56.98                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants