Tweak `Span` encoding. #58458
(rust_highfive has picked a reviewer for you, use r? to override) |
Local measurements indicate that this is a 3-5% instruction win for @bors try |
Tweak `Span` encoding. Failing to fit `base` is more common than failing to fit `len`.
cc #44646 |
☀️ Test successful - checks-travis |
@rust-timer build 46701e6 |
Success: Queued 46701e6 with parent c67d474, comparison URL. |
Finished benchmarking try commit 46701e6 |
The good results for
Other changes were minor and hard to distinguish from noise. |
This is certainly a tradeoff. |
To put things into perspective: a 24 bit base is enough to cover about 16MB of code (2^24 bytes). A 7 bit len is 127 bytes of code: this is a small function of a modest ... A 6 bit len is 63 bytes of code: this is a non-block expression or a pattern; items and larger expressions probably don't fit. |
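The arithmetic behind those figures can be checked with a small sketch. The helper names `max_value` and `fits_inline` are hypothetical and are not taken from rustc; the only assumption is that a field of `n` bits holds values from 0 to 2^n - 1.

```rust
/// Largest value representable in `bits` bits (`bits` must be below 64).
fn max_value(bits: u32) -> u64 {
    (1u64 << bits) - 1
}

/// Hypothetical check: do `base` and `len` both fit into inline fields
/// of the given widths? If not, the span would need the slow path.
fn fits_inline(base: u64, len: u64, base_bits: u32, len_bits: u32) -> bool {
    base <= max_value(base_bits) && len <= max_value(len_bits)
}
```

For example, `max_value(24)` is 16,777,215, `max_value(7)` is 127 and `max_value(6)` is 63, so a 100-byte span at offset 1000 fits with a 7 bit len but not with a 6 bit len.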
I'd be really interested to look at distribution of bases/lengths/ctxts during both span decoding and encoding since they should be quite different. |
@nnethercote (The most convenient way to do it is |
I did that (with And you are right that this trade-off favours large crates. I view this as a good thing... compile times of small crates are already low :) Meanwhile large crates, which are slow, pay an extra price for their size; this change mitigates that a little. It's a shame that |
Yes, please. |
Ping from triage @nnethercote :) |
Ping from triage @nnethercote: unfortunately we haven't heard from you on this in a while, so I'm closing the PR to keep things tidy. Don't worry though: if you have time again in the future, please reopen this PR and we'll be happy to review it again! |
Fast paths drop from 96.9% to 78.2%. |
Failing to fit `base` is more common than failing to fit `len`.
b67352a to ff94fea
@bors try |
⌛ Trying commit ff94fea with merge b56778a3f620ca1c5270ab9d9973da37a07b9f62... |
☀️ Try build successful - checks-travis |
@rust-timer build b56778a3f620ca1c5270ab9d9973da37a07b9f62 |
Success: Queued b56778a3f620ca1c5270ab9d9973da37a07b9f62 with parent 428943c, comparison URL. |
Here are some
Here are the same results for
Most lengths are short, with a hump at 3 and 4. Lengths drop off quite a bit going from 6 bits to 7, and a lot more going from 7 bits to 8. Bases are more evenly spread out between zero and the maximum base. Because the above tables show the number of bits, their entries are biased towards the larger sizes, since each additional bit doubles the span covered.

So, the effects of too few bits are quite different for length vs. base. For length, all programs will be affected roughly equally. E.g. dropping from 7 to 6 bits makes things a bit worse, dropping from 6 to 5 bits would be a lot worse, etc. For base, it depends on the crate size; any program that is big enough to greatly exceed the base maximum is going to face a performance cliff.

So this PR makes things slightly worse for lengths for all programs, but makes things a lot better for bases for large crates. |
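A minimal sketch of how such a per-bit-count distribution could be gathered, assuming base and length values are available as plain `u32`s. The function names `bits_needed` and `bit_histogram` are hypothetical and this is not the instrumentation used for the measurements in this thread.

```rust
/// Number of bits needed to represent `value` (0 needs 0 bits, 127 needs 7).
fn bits_needed(value: u32) -> u32 {
    32 - value.leading_zeros()
}

/// Bucket values by the number of bits they need. Bucket `n` covers the
/// range 2^(n-1) ..= 2^n - 1, so each bucket is twice as wide as the one
/// before it, which is why the larger buckets look over-represented.
fn bit_histogram(values: &[u32]) -> [usize; 33] {
    let mut hist = [0usize; 33];
    for &v in values {
        hist[bits_needed(v) as usize] += 1;
    }
    hist
}
```

Running `bit_histogram` separately over the lengths and the bases of a crate's spans would give one row per bit count, which is the shape of data the comment above reasons about.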
I thought about using another tag bit and then having two compression regimes, perhaps 23base/7len and 26base/4len... but it doesn't seem worth it. |
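For illustration only, here is a sketch of what a two-regime scheme along those lines might look like. The bit layout is an assumption made up for this example (bit 0 as the inline/interned tag, bit 1 as the regime selector, `ctxt` assumed to be zero), and `encode`/`decode` are hypothetical names, not rustc functions.

```rust
/// Try to pack `base` and `len` into one 32-bit word.
/// Assumed layout: bit 0 = 0 means "inline", bit 1 selects the regime.
/// Returns None when neither regime fits (the span would be interned).
fn encode(base: u32, len: u32) -> Option<u32> {
    // Regime A: 23-bit base in bits 9..32, 7-bit len in bits 2..9.
    if base < (1 << 23) && len < (1 << 7) {
        return Some((base << 9) | (len << 2));
    }
    // Regime B: 26-bit base in bits 6..32, 4-bit len in bits 2..6.
    if base < (1 << 26) && len < (1 << 4) {
        return Some((base << 6) | (len << 2) | 0b10);
    }
    None
}

/// Inverse of `encode` for inline words: returns `(base, len)`.
fn decode(word: u32) -> (u32, u32) {
    if word & 0b10 == 0 {
        (word >> 9, (word >> 2) & 0x7f)
    } else {
        (word >> 6, (word >> 2) & 0xf)
    }
}
```

The extra branch in both directions is the kind of added complexity that has to be weighed against the higher inline hit rate.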
20.6% require 26 bits... |
Finished benchmarking try commit b56778a3f620ca1c5270ab9d9973da37a07b9f62 |
We tried that in the original PR (with a perf run), it was slightly slower, apparently due to more complex encoding/decoding. @bors r+ |
📌 Commit ff94fea has been approved by |
Future directions, mostly to satisfy curiosity I guess, it's hard to expect significant perf gains:
|
Actually, a good layout for 64-bit spans (if the span size is bumped) would probably be simply struct Span {
base: u32,
len: u16,
ctxt_and_tag: u16,
} which is both machine and compiler friendly, and covers the statistics found in this thread and corner cases like #36799 (comment). |
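A quick sketch confirming that this layout does fit in 64 bits. The `#[repr(C)]` attribute, the derives and the `span_layout` helper are assumptions added for the example; the field names and types are the ones proposed above.

```rust
/// The proposed 64-bit layout. `#[repr(C)]` is added here only to pin
/// the field order for the size check.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Span {
    base: u32,
    len: u16,
    ctxt_and_tag: u16,
}

/// Hypothetical helper returning the size and alignment of `Span` in bytes.
fn span_layout() -> (usize, usize) {
    (std::mem::size_of::<Span>(), std::mem::align_of::<Span>())
}
```

With 4 + 2 + 2 bytes and no padding, `span_layout()` reports a size of 8 and an alignment of 4, and `len` can hold any length up to 65,535 bytes.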
Tweak `Span` encoding. Failing to fit `base` is more common than failing to fit `len`.
☀️ Test successful - checks-travis, status-appveyor |