Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid emitting a block in the binary format when it has no name #4912

Merged
merged 8 commits into from
Aug 18, 2022
Merged

Conversation

kripken
Copy link
Member

@kripken kripken commented Aug 17, 2022

We already did this if the block was a child of a control flow structure, which is
the common case (see the new added comment around that code, which clarifies
why). This does the same for all other blocks. This is simple to do and a minor
optimization, but the main benefit from this is just to make our handling of blocks
uniform: after this, we never emit a block with no name. This will make 1a non-
nullable locals easier to handle (since they will be able to assume that property;
and not emitting such blocks avoids some work to handle non-nullable locals
in them).

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to keep visitPossibleBlockContents and visitBlock separate? The only difference here is sometimes we want to be more aggressive (use BranchSeeker) and sometimes we don't. Can we have a single function with a parameter? Probably we can't modify the signature of PostWalker::visitBlock, but maybe we can add a seekBranch parameter to visitPossibleBlockContents and make visitBlock just call it with the parameter false?

src/wasm-stack.h Outdated
Comment on lines 211 to 214
// uses, it is equivalent to not having one). This is potentially quadratic,
// but it is extremely rare to have recursion on this function, since it is
// limited by the number of non-block control flow structures (the places that
// call here).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is potentially quadratic,

Isn't this always quadratic, because we traverse all children?

it is extremely rare to have recursion on this function, since it is limited by the number of non-block control flow structures (the places that call here).

How do we have a recursion here? Can BranchSeeker::has call this BinaryenIRWriter::VisitPossibleBlockContents back or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good points, my text was not accurate. Yes, it's always quadratic, just on smaller numbers. By "recursion" I really meant larger numbers (where quadratic time gets bad). I improved the comment now.

@kripken
Copy link
Member Author

kripken commented Aug 17, 2022

Can we have a single function with a parameter?

Interesting, maybe... but thinking about it, I'm not sure it's worth it. visitBlock and visitPossibleBlockContents already had some code duplication, because both iterate on the children, but in slightly different ways. And while we could share more code now after this PR, it's really just one line - to check if the block has a name. Adding a parameter seekBranch whether to scan the contents would already make it about the same size, I think, and not necessarily more readable.

Comment on lines +211 to +214
// uses, it is equivalent to not having one). Scanning the children of the
// block means that this takes quadratic time, but it will be N^2 for a fairly
// small N since the number of nested non-block control flow structures tends
// to be very reasonable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure... Doesn't BranchSeeker::has mostly only check branches, or more precisely, name uses defined by DELEGATE_FIELD_SCOPE_NAME_USE? (Currently br, br_table, br_on, rethrow, and try(due to delegate) have this field)

but it will be N^2 for a fairly small N since the number of nested non-block control flow structures tends to be very reasonable.

I'm not sure what do you mean by "non-block control flow structures". Do you mean branches here?

Also, it is N^2 because we need to check every child, and that N includes all children. We don't know whether it is a branch or not before checking it. So I'm also not sure what you mean by "a fairly small N".

I'm not suggesting the quadratic behavior is bad; it is preexisting anyway and it is for a reason and as you said it doesn't cause unreasonable slowdown. I'm just not very good at understanding the comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-block control flow structures are: If, Try, Loop. In each of their arms they call visitPossibleBlockContents (to try to avoid emitting a block there).

By small N I mean that there are few nested non-block control flow structures. This would be bad:

(if
  ..
  (if
    ..
    (if
      ..N such ifs..
      ..
        ..innermost child..

Those nested ifs lead to the innermost child being scanned N times where N is the number of those ifs (since each if scans all children). So the total time is O(N*M) if M is the total number of children of the first if. But, N is small in the real world, such if stacks are very rare (unlike block stacks).

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, maybe... but thinking about it, I'm not sure it's worth it. visitBlock and visitPossibleBlockContents already had some code duplication, because both iterate on the children, but in slightly different ways. And while we could share more code now after this PR, it's really just one line - to check if the block has a name. Adding a parameter seekBranch whether to scan the contents would already make it about the same size, I think, and not necessarily more readable.

Hmm, yeah, I thought they are similar but as you said visitBlock has additional routines handling deeply nested blocks... Nevermind then.

Comment on lines +212 to +214
// block means that this takes quadratic time, but it will be N^2 for a fairly
// small N since the number of nested non-block control flow structures tends
// to be very reasonable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could still make this cheaper and also improve the later case by running a pass to remove unused names in linear time, then assuming we've already taken care of it here. We already have a pass that does this, don't we?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, RemoveUnusedNames is run when optimizing. So this only matters for the unoptimized case. But it's still nice to handle that since that includes a simple roundtrip with wasm-opt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since that includes a simple roundtrip with wasm-opt

I don't understand what you mean here. I see that this quadratic checking will only produce useful results if we haven't already optimized, but it would still be cheaper and produce better results to run the RemoveUnusedNames pass unconditionally here no matter what other optimizations we are doing.

Copy link
Member Author

@kripken kripken Aug 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, running the pass might be fine. We'd need to do some more work to be careful to avoid noticeable side effects, as right now binary writing does not modify the Module data, but running that pass would. But I think it could be done probably.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at RemoveUnusedBrs, it looks like it does more optimizations than we would actually want for this use case. What Module data are you thinking about, though? That pass is function parallel, so I don't think it should change any Module-level state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant any changes to the Module or its contents. Right now, writing a module does not modify the input in any way, but this would make changes to blocks etc. Still, it might be ok...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Yeah, I think that's ok since there shouldn't really be anything happening after writing. This PR lgtm if you'd rather look at that as a follow-up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's leave that as a separate topic. It's the existing behavior before this PR and I'd rather not change it here so each PR is focused.

@kripken kripken merged commit 613fadc into main Aug 18, 2022
@kripken kripken deleted the noblock branch August 18, 2022 16:10
kripken added a commit that referenced this pull request Aug 22, 2022
Previously the wat parser would turn this input:

(block
  (nop)
)

into something like this:

(block $block17
  (nop)
)

It just added a name all the time, in case the block is referred to by an index
later even though it doesn't have a name.

This PR makes us rountrip more precisely by not adding such names: if there
was no name before, and there is no break by index, then do not add a name.

In addition, this will be useful for non-nullable locals since whether a block has
a name or not matters there. Like #4912, this makes us more regular in our
usage of block names.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants