Improve performance for LIKE patterns involving % #16167
Merged
7d05135 to
5bd7d4a
Compare
f1c4371 to
90dde8d
Compare
phd3
reviewed
Apr 5, 2023
core/trino-main/src/main/java/io/trino/likematcher/LikeMatcher.java
core/trino-main/src/main/java/io/trino/likematcher/FjsMatcher.java
When the pattern contains only literals and %, use substring search for each of the tokens, via an implementation of the FJS algorithm: https://cgjennings.ca/articles/fjs/

Benchmark results follow:

* dynamicXXX measures the end-to-end performance of compiling the matcher and calling it.
* matchXXX measures the performance of the match call after the matcher has been compiled.
* xxxNonOptimized vs xxxOptimized measures the performance when LikeMatcher is constructed with optimize = false vs optimize = true.

| Benchmark | (case) | Before (ns/op) | After (ns/op) |
|---|---|---|---|
| dynamicNonOptimized | SHORT_TOKENS_1 | 3206.181 ± 16.858 | 1301.583 ± 6.762 |
| dynamicNonOptimized | SHORT_TOKENS_2 | 3534.404 ± 20.939 | 2073.400 ± 17.597 |
| dynamicNonOptimized | SHORT_TOKEN | 2568.900 ± 24.562 | 582.184 ± 2.452 |
| dynamicNonOptimized | LONG_TOKENS_1 | 12055.974 ± 72.518 | 1594.760 ± 8.006 |
| dynamicNonOptimized | LONG_TOKENS_2 | 17133.678 ± 119.793 | 700.485 ± 3.883 |
| dynamicNonOptimized | LONG_TOKEN_1 | 7152.323 ± 54.488 | 451.341 ± 2.386 |
| dynamicNonOptimized | LONG_TOKEN_2 | 2852.432 ± 29.256 | 342.418 ± 3.757 |
| dynamicNonOptimized | LONG_TOKEN_3 | 5238.197 ± 46.751 | 933.180 ± 5.290 |
| dynamicNonOptimized | SHORT_TOKENS_WITH_LONG_SKIP | 3063.792 ± 37.088 | 833.256 ± 26.775 |
| dynamicOptimized | SHORT_TOKENS_1 | 283428.816 ± 1611.467 | 1305.750 ± 9.497 |
| dynamicOptimized | SHORT_TOKENS_2 | 10059684.325 ± 44593.208 | 2013.463 ± 15.444 |
| dynamicOptimized | SHORT_TOKEN | 81244.561 ± 339.620 | 586.187 ± 2.540 |
| dynamicOptimized | LONG_TOKENS_1 | 4733209.512 ± 30825.948 | 1603.712 ± 15.636 |
| dynamicOptimized | LONG_TOKENS_2 | 6875531.823 ± 33728.556 | 707.062 ± 3.214 |
| dynamicOptimized | LONG_TOKEN_1 | 665877.955 ± 30123.355 | 453.508 ± 2.343 |
| dynamicOptimized | LONG_TOKEN_2 | 370405.576 ± 2891.106 | 342.558 ± 2.781 |
| dynamicOptimized | LONG_TOKEN_3 | 402514.307 ± 1920.966 | 932.587 ± 4.264 |
| dynamicOptimized | SHORT_TOKENS_WITH_LONG_SKIP | 254232.154 ± 1114.968 | 821.808 ± 4.116 |
| matchNonOptimized | SHORT_TOKENS_1 | 2833.111 ± 13.485 | 701.785 ± 3.181 |
| matchNonOptimized | SHORT_TOKENS_2 | 3221.687 ± 20.231 | 543.724 ± 2.822 |
| matchNonOptimized | SHORT_TOKEN | 2311.488 ± 11.088 | 458.462 ± 1.643 |
| matchNonOptimized | LONG_TOKENS_1 | 11778.521 ± 52.387 | 865.535 ± 3.973 |
| matchNonOptimized | LONG_TOKENS_2 | 16922.399 ± 72.356 | 193.247 ± 0.574 |
| matchNonOptimized | LONG_TOKEN_1 | 6871.454 ± 35.185 | 259.938 ± 1.161 |
| matchNonOptimized | LONG_TOKEN_2 | 2517.248 ± 13.335 | 151.030 ± 0.579 |
| matchNonOptimized | LONG_TOKEN_3 | 5021.075 ± 39.784 | 709.089 ± 3.854 |
| matchNonOptimized | SHORT_TOKENS_WITH_LONG_SKIP | 2757.342 ± 16.299 | 504.451 ± 1.964 |
| matchOptimized | SHORT_TOKENS_1 | 783.268 ± 3.646 | 702.478 ± 3.716 |
| matchOptimized | SHORT_TOKENS_2 | 1147.895 ± 4.307 | 543.043 ± 2.447 |
| matchOptimized | SHORT_TOKEN | 1044.000 ± 4.159 | 458.934 ± 2.049 |
| matchOptimized | LONG_TOKENS_1 | 1044.809 ± 5.375 | 867.075 ± 4.226 |
| matchOptimized | LONG_TOKENS_2 | 1062.192 ± 5.323 | 193.253 ± 0.678 |
| matchOptimized | LONG_TOKEN_1 | 1045.351 ± 4.702 | 259.962 ± 1.199 |
| matchOptimized | LONG_TOKEN_2 | 1084.966 ± 3.921 | 150.928 ± 0.652 |
| matchOptimized | LONG_TOKEN_3 | 1061.450 ± 3.678 | 707.735 ± 3.565 |
| matchOptimized | SHORT_TOKENS_WITH_LONG_SKIP | 1148.827 ± 8.071 | 504.854 ± 2.521 |
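To make the approach in the description concrete, here is a minimal sketch of matching a pattern that contains only literals and % by searching for each literal token in order. It is not the code from this PR: the class name `PercentLikeSketch` and its methods are hypothetical, it works on `String` for brevity, it ignores `_` and escape characters, and it uses `String.indexOf` as a stand-in for the FJS substring search.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the LikeMatcher/FjsMatcher code from the PR.
final class PercentLikeSketch
{
    private PercentLikeSketch() {}

    // Splits a pattern made only of literal characters and '%' into its literal tokens.
    static List<String> tokens(String pattern)
    {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                if (current.length() > 0) {
                    result.add(current.toString());
                    current.setLength(0);
                }
            }
            else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    static boolean matches(String input, String pattern)
    {
        // Without any '%', the pattern is a plain literal and must match exactly.
        if (pattern.indexOf('%') < 0) {
            return input.equals(pattern);
        }

        List<String> tokens = tokens(pattern);
        int first = 0;
        int last = tokens.size();
        int position = 0;
        int end = input.length();

        // A pattern that does not start with '%' anchors its first token at the start.
        if (!pattern.startsWith("%")) {
            String prefix = tokens.get(0);
            if (!input.startsWith(prefix)) {
                return false;
            }
            position = prefix.length();
            first = 1;
        }

        // A pattern that does not end with '%' anchors its last token at the end.
        if (!pattern.endsWith("%")) {
            String suffix = tokens.get(tokens.size() - 1);
            end = input.length() - suffix.length();
            if (end < position || !input.endsWith(suffix)) {
                return false;
            }
            last = tokens.size() - 1;
        }

        // Remaining tokens: take the leftmost occurrence of each, in order.
        // indexOf stands in here for the FJS search named in the description.
        for (int i = first; i < last; i++) {
            String token = tokens.get(i);
            int index = input.indexOf(token, position);
            if (index < 0 || index + token.length() > end) {
                return false;
            }
            position = index + token.length();
        }
        return true;
    }
}
```

Taking the leftmost occurrence of each token is sufficient because % can absorb any remaining characters, so no backtracking is needed. For example, `PercentLikeSketch.matches("hello world", "he%wor%d")` returns true, while `PercentLikeSketch.matches("a", "a%a")` returns false because the anchored prefix and suffix would have to overlap.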
@phd3, addressed comments in a fixup commit.
phd3
approved these changes
May 1, 2023
Includes commits from #15999. Only the last two commits are new.
Release notes
(x) Release notes are required, with the following suggested text: