-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: recognize continuous runs in switch cases #15780
Comments
The compiler's lexer code (and any typical lexer/scanner for that matter) contains a largish switch which might benefit from optimized compilation of switches. At the moment, we go out of our way (manually) to handle common cases separately. This directly affects lexing performance. |
Here's an amazingly readable summary of switch lowering optimizations: https://www.nextmovesoftware.com/technology/SwitchOptimization.pdf. Many of them require jump tables, which would require object file and linker changes, and which might no longer be a desirable strategy anyway. I plan to investigate some of the other strategies for Go 1.8. |
I think if we're going to do jump tables, we should aim for relative PC
addresses rather than absolute PC in the table. Otherwise every entry
requires a relocation in go.o for external linking and it will further
complicate PIE (-buildmode shared) support.
|
Sorry if I was unclear. I am going to investigate the non-jump-table options. |
Also, since nobody has mentioned it yet, #5496 already exists for compiling switches into jump tables. |
Yeah, I think jump tables are fine for very specific cases (very dense
tables with a lot of different targets).
Some C compilers also uses bitmask for dense switches, and it's especially
useful on 64-bit machines without too many branch targets.
|
I poked through a bunch of switch statements. Based on what I saw, the two near term changes I would like to make are using ranges and bitmasks for dense integers that share a branch target. (For example, It's not obvious to me how this should interact with binary search, assuming the goal is to minimize the number of branches. If just ranges were supported, it'd be easy: Assign a weight of one to every range (including singleton ranges) and then pick a pivot value that balances those weights. With bitmasks, however, things are more complicated. Bitmask-able sets of values interact with each other ( I'd love suggestions or references. I'd rather find something simple and cheap that limits the worst case rather than something optimal. |
This is what GCC does: https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/tree-switch-conversion.c?revision=232055&view=markup Might be interesting to read https://www.nextmovesoftware.com/technology/SwitchOptimization.pdf |
Thanks, Ian. The GCC restrictions on bitmasks suggests that I'd do well enough ignoring them for the moment, so that's what I'll do, which makes the path forward very clear. And that paper is actually where I started on this--it's very useful and well-written. Go also has non-integer switches, which themselves are interesting and appear to be less-well-covered territory in the literature--e.g. #15164 and the general question of whether we should use tries of some kind for switches over strings. Yet more reason to keep the integer handling relatively simple. For my own future reference, here are couple of other notes:
|
Hello @josharian, I believe you implemented contiguous range coalescing/recognition in CL https://go-review.googlesource.com/c/go/+/26770 if I am not mistaken i.e. the non-jump table solution and I just encountered this issue while combing through compiler issues such as #5496 Should we be closing this issue, or is there something else that we can do here? |
There are a bunch of further tweaks we might want to do, such as "Consider joining ranges when there is a small gap, and then testing for the gap". But having a grab-bag issue such as this remain open for stuff like this isn't valuable, so yes, I'll close this. We can always open new issues as desired. |
We should compile small to the same code as small2. Instead, we handle each of the cases independently:
Worse, we end up paying a significant cost in the SSA backend juggling all the branches and blocks generated by the frontend.
This sounds theoretical, but I keep encountering large switch statements in functions with bad compile times, and I have seen a bunch of contiguous runs. Here's an excerpted example from cmd/vendor/golang.org/x/arch/x86/x86asm:
Those constants all happen to be next to each other.
The text was updated successfully, but these errors were encountered: