Improve table search speed through lookups #112
Conversation
With the help of #113 this gives:
Full Report: https://gist.github.com/indutny/9f5f623c52d6e935225edfe09857ef99

Rebased over the latest master, but it looks like nothing changed in this PR.
Force-pushed from 0b2442c to a2f1b04.
src/tables.rs (outdated)

```diff
@@ -387,9 +387,81 @@ pub mod grapheme {
     }

     pub fn grapheme_category(c: char) -> (u32, u32, GraphemeCat) {
-        bsearch_range_value_table(c, grapheme_cat_table)
+        let idx = c as usize / 0x80;
```
Please add/generate comments in the code that explain what's going on here, and also on the const.
Sounds good! Force pushed with comments!
Force-pushed from f52636c to bb6b82a.
Let me run benchmarks on this first, it looks like I might have regressed with my latest refactor.
Prior to this change table search would have to do a binary search over about 1000 entries which resulted in around 10 memory loads on average. In this commit we reduce the search space by doing a pre-lookup in a generated table to get a smaller (often zero-length) slice of the full sorted range list. On average this gives us just one entry of the range list to perform binary search on, which reduces the average number of memory loads to 2.
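The two-level scheme described above can be sketched in a few lines of Rust. This is a simplified illustration with toy data, not the crate's actual generated tables: `RANGES`, `LOOKUP`, and `lookup_value` are hypothetical names, and the real generator emits the pre-lookup table at build time from the Unicode data files.

```rust
/// Sorted, non-overlapping (lo, hi, value) ranges with inclusive bounds.
/// Stands in for the real generated range table (~1000 entries).
const RANGES: &[(u32, u32, u8)] = &[
    (0x00, 0x40, 1),
    (0x41, 0x5A, 2),
    (0x5B, 0x7F, 1),
    (0x80, 0xFF, 3),
];

/// Pre-lookup table: for each 0x80-wide block of code points, the
/// (start, end) indices of the slice of RANGES that can contain a
/// character in that block. Written by hand here for the toy data.
const LOOKUP: &[(usize, usize)] = &[
    (0, 3), // block 0: 0x00..=0x7F falls within RANGES[0..3]
    (3, 4), // block 1: 0x80..=0xFF falls within RANGES[3..4]
];

fn lookup_value(c: char) -> u8 {
    // One load: find the (usually tiny) slice for this block.
    let block = (c as usize / 0x80).min(LOOKUP.len() - 1);
    let (start, end) = LOOKUP[block];
    let slice = &RANGES[start..end];
    // Binary search only within that slice instead of the full table.
    match slice.binary_search_by(|&(lo, hi, _)| {
        if hi < c as u32 {
            std::cmp::Ordering::Less
        } else if lo > c as u32 {
            std::cmp::Ordering::Greater
        } else {
            std::cmp::Ordering::Equal
        }
    }) {
        Ok(i) => slice[i].2,
        Err(_) => 0, // default category for unmapped code points
    }
}

fn main() {
    assert_eq!(lookup_value('A'), 2); // 0x41 is in (0x41, 0x5A, 2)
    assert_eq!(lookup_value('a'), 1); // 0x61 is in (0x5B, 0x7F, 1)
    assert_eq!(lookup_value('é'), 3); // 0xE9 is in (0x80, 0xFF, 3)
}
```

Because most text clusters within a few blocks, the pre-lookup usually narrows the search to zero or one range, which is where the drop from ~10 memory loads to ~2 comes from.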
Force-pushed from bb6b82a to b4c9ce1.
Alright, let me know when I should look at this!

@Manishearth Should be good now, sorry for the back and forth! I made some simplifications, and the benchmarks look even better than before: https://gist.github.com/indutny/c8d29a00680bfbaf19d22d02a7175c0d (with up to a 51% improvement on some word-bounds benches).

Thanks!

Wow! Thank you for merging it! Not to impose, but I was wondering when you planned to release a new version of the library?

I'd happily merge a PR bumping the library to 1.10.1.

You got it!

Wait, I have one more optimization for you. :-) (EDIT: I'm @indutny's coworker; we've been loosely collaborating and independently looking at optimization opportunities in a few of the unicode-rs crates.)

Ah, it looks like @indutny's optimization has made mine irrelevant—some texts get faster, others get slower. So it's probably not worth it.