block/br return value optimizations #667
Comments
According to previous measurements here, with that distribution, reducing …
IIUC, the 4% figure doesn't include size reductions from comma/ternary. Alon: could you perhaps measure the size reduction from disabling the optimization that produces those?
@sunfishcode: On the one hand, set_local is a fairly small node, so you might expect even less than that 0.4%. But the change in binary size is actually slightly more than 0.4% (I see 0.5%), since the 4% set_local reduction also enables other optimizations to do more work, as mentioned before. I agree that any method that reduces set_local sizes would reduce this modest size improvement. However, even if the code size improvement were 0%, 4% fewer set_locals still means 4% fewer set_locals to bother with when parsing and compiling, which I think is significant.

@lukewagner: Sadly, there isn't a good way to get rid of asm.js commas and ternaries; they mostly aren't the result of optimizations. By and large, they are directly created in the asm.js backend or in asm2wasm. So we would need to lower them out to investigate what you ask, and I can't think of a simple way to do that. But, as a partial answer, the reason the asm.js backend and asm2wasm directly create commas and ternaries is that they are shorter than the alternatives. It's hard to measure the precise benefit there, but it's definitely useful: not just for code size, it also makes compiling to wasm easier, and it's more expressive (I think this is a point @rossberg-chromium has made).
@kripken Currently …
Yes. (But sorry, I don't get the context?)
The context was the mention of overall binary size. If …

Also, concerning decode time, it would be good to measure this.
Do you mean, shrink from 2 to 1 using an opcode table of some sort? It's hard to speculate how that would affect things, as it would be a big change across the board to everything. It could amplify or shrink the effects here.
@kripken Expressions demand some infrastructure, so there is a runtime complexity. Returning values from blocks pushes the 'expression pattern' and adds complexity to do so. As you note, often multiple values are passed around, and the 'expression pattern' can be pushed a little further with multiple-value support. Adding …

Let me try to explain how I see an expressionless encoding as compelling. Consider if the current wasm expressions implicitly used local variables to store expression temporary values rather than a separate stack. The encoding efficiency has not changed, but now there is the option to re-use these values. Next, examine each operator, and note that it is now reading and writing to local variables, but with some implicit …

I believe an expressionless encoding would have many advantages in other areas of wasm, such as associating debug and meta info with code positions. Whatever encoding wasm chooses to use, for the webll I will first canonicalize it to an expressionless encoding, and then rebuild expressions for presentation; the meta information will reference positions in the canonical expressionless encoding.
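A hypothetical illustration of the contrast being described (not from the original comment; the flat syntax and temporary names are invented for this sketch):

```wat
;; expression form: temporaries live on an implicit stack
(set_local $x (i32.add (get_local $a) (i32.const 1)))

;; expressionless form: each operator reads and writes named locals,
;; so intermediate values ($t0, $t1, $t2) become re-usable
$t0 = get_local $a
$t1 = i32.const 1
$t2 = i32.add $t0, $t1
set_local $x, $t2
```

In the flat form, nothing is left on an operand stack; every value has a named slot that later code could in principle read again.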
@JSStats: Hard for me to evaluate that approach without a simple concrete example showing how it would work, and how it differs from the current approach. Perhaps in an issue dedicated just to that? |
This was mostly presenting data and discussion, closing. |
There was some discussion about block and br return values, and in particular how much benefit we can get from them, as both of our current compiler options (asm2wasm and wasm-backend) don't use them or barely use them (asm2wasm uses block return values in rare cases, for asm.js sequences/commas, but never uses br values). I investigated this topic in the binaryen optimizer (block-returns branch; note the code is not cleaned up yet).

My approach is to process control flow and see where it joins up, at the end of an if-else or a block, and then see if there is a set_local that flows through all those arriving branches (this would occur e.g. if an optimizing compiler emitted a phi node there). If so, we can use block and br return values to pass the result instead of set_locals, and put a single set_local on the outside of the block or if. In other words, it unifies the set_locals, like this:
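(The example that followed here appears to have been lost in extraction; the following is a reconstruction consistent with the description above — the variable names and constants are invented, and the syntax is the pre-MVP s-expression form used elsewhere in this thread:)

```wat
;; before: a set_local on each arriving branch
(if (get_local $cond)
  (set_local $x (i32.const 1))
  (set_local $x (i32.const 2)))

;; after: the if returns a value, and a single set_local
;; is placed on the outside
(set_local $x
  (if i32 (get_local $cond)
    (i32.const 1)
    (i32.const 2)))
```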
This removes set_locals, and can then enable further optimizations: for example, the new set_local is now on the outside, where it might be sunk into a later get_local, and so forth.
On BananaBread, this removes around 4% of set_locals. (This was better than I expected, so I ran it through a bunch of tests to verify it's working properly.)
Note that this is an underestimate of the opportunity here, as …
Overall, then, it seems like block/br return value optimizations can be pretty useful.
I also did some measurements for the topic of multiple return values. I can't optimize them in the current AST, so I don't have full and tested data. But I counted how many locations have a set_local that can be optimized, and how many there are at each - when there is more than one, multiple return values might be employed. Looks like 15.4% of potential locations have at least one set_local that can be optimized, and they have an average of 1.12 per location. In other words, it's common to have a single optimizable set_local, and while sometimes there is more than one, that is rare.
All that has the caveats 1-3 from before, and in addition that as mentioned it's just data and not a full and tested optimization pass, so it would be good to see more investigation here. But this data suggests that a single return value from ifs, blocks, and brs is enough for almost all the benefit here.
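The multiple-value case described above might look something like the following sketch. Note that the "after" form uses hypothetical syntax — neither multi-value if results nor a multi-value set were part of the design at the time, so `set_local2` and `values` are invented names for illustration only:

```wat
;; before: two set_locals flow out of both arms of the if
(if (get_local $cond)
  (block
    (set_local $x (i32.const 1))
    (set_local $y (i32.const 2)))
  (block
    (set_local $x (i32.const 3))
    (set_local $y (i32.const 4))))

;; after (hypothetical multi-value syntax): both results
;; are passed out of the if and assigned once, outside
(set_local2 $x $y
  (if (i32 i32) (get_local $cond)
    (values (i32.const 1) (i32.const 2))
    (values (i32.const 3) (i32.const 4))))
```

Per the measurements above, locations with exactly one optimizable set_local dominate (an average of 1.12 per location), which is why single-value block/br results capture almost all of the benefit.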