Agenda for the July 7th video call of WebAssembly's Community Group

  • Where: zoom.us
  • When: July 7th, 4pm-5pm UTC (July 7th, 9am-10am Pacific Daylight Time)
  • Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

Agenda items

  1. Opening, welcome and roll call
    1. Opening of the meeting
    2. Introduction of attendees
  2. Find volunteers for note taking (acting chair to volunteer)
  3. Adoption of the agenda
  4. Proposals and discussions
    1. Review of action items from prior meeting.
    2. Vote on explicit scheduling requirements for online votes (Thomas Lively) [5 minutes]
    3. Discussion of unmodeled side effects and intrinsic imports (Thomas Lively) [15 minutes]
    4. Discussion: What are the requirements for toolchain support?
      1. If a toolchain doesn't use/implement all instructions, does it support the proposal?
    5. Discussion of identity of references in wasm vs JS (Ross Tate) [15-20 minutes]
      • Prior offline discussion in linked issue encouraged as preparation
  5. Closure

Agenda items for future meetings

None

Schedule constraints

None

Meeting Notes

Opening, welcome and roll call

Opening of the meeting

Introduction of attendees

  • Alex Syrotenko
  • Alon Zakai
  • Andrew Brown
  • Arun Purushan
  • Ben Smith
  • Conrad Watt
  • Dan Gohman
  • Daniel Hillerström
  • David Piepgrass
  • Derek Schuff
  • Emanuel Ziegler
  • Flaki
  • Francis McCabe
  • Gergely
  • Heejin Ahn
  • Ioanna Dimitriou
  • Jacob Mischka
  • Jakob Kummerow
  • Jay Phelps
  • JS Sugarbroad
  • Keith Miller
  • Luke Imhoff
  • Luke Wagner
  • Manos Koukoutos
  • mkawalec
  • Nick Fitzgerald
  • Paolo Severini
  • Paul Dworzanski
  • Richard Winterton
  • Rick
  • Ross Tate
  • Sabine
  • Sam Clegg
  • Sergey Rubanov
  • Svyatoslav Kuzmich
  • TatWai Chong
  • Thibault Charbonnier
  • Thomas Lively
  • Yury Delendik
  • Zalim
  • Zhi An Ng

Find volunteers for note taking (acting chair to volunteer)

Adoption of the agenda

Thomas Lively seconds

Proposals and discussions

Review of action items from prior meeting.

Vote on explicit scheduling requirements for online votes (Thomas Lively) [5 minutes]

Poll: Should we accept the PR above?

DS: is 24 hours enough time?

TL: good question, this has been our policy for a while

DS: are we expecting all interested people to check the day before meeting? You want people to have enough time to read and form opinion, in that case is 24 hour enough?

TL: enough time to note that there is a vote, but not enough to read up

JS: It's at least 24 hours, people can be suggested to put agenda items earlier. And the chair can push things back, if necessary. We have a lot of processes that are potentially available. I'm ambivalent here. 24 hours is a minimum, I wouldn't want to make us wait for longer for non-contentious items.

DS: if people are happy with this, I will be okay with going ahead and saying yes

LI: are most people using GitHub's watch feature, or are they checking issues manually?

DS: I use watch as well, but when you watch so many repos it's easy to miss

JS: someone could have added an agenda item two weeks ago and people won't remember it, so they should refresh

BS: due to timebox, we have to cut this short, take the vote?

JS: consensus vote?

BS: fine to have normal vote here

SF: 4 F: 17 N: 6 A: 0 SA: 0

Discussion of unmodeled side effects and intrinsic imports (Thomas Lively) [15 minutes]

Slides

RT: one item to add to the topic: the tracing application isn't well served by using a call, you want an instruction to be put right at that spot, sort of like an inline call

TL: If you had compile-time imports in some manner then you could inline them.
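
(An illustrative sketch, not presented in the meeting: today an intrinsic such as a tracing hook can only be expressed as an ordinary function import, which a host without tracing support can satisfy with a no-op. The module name, import names, and `markerId` parameter below are hypothetical.)

```ts
// Hypothetical sketch: supplying a tracing intrinsic as a plain import.
// A host that supports tracing passes a real hook; any other host can
// pass a no-op, and the module's core semantics are unchanged.

const tracingImports = {
  // "instrumentation" and "trace" are hypothetical import names.
  instrumentation: {
    trace: (markerId: number) => console.log(`trace marker ${markerId}`),
  },
};

const noopImports = {
  instrumentation: {
    trace: (_markerId: number) => {},
  },
};

async function instantiate(bytes: BufferSource, withTracing: boolean) {
  const { instance } = await WebAssembly.instantiate(
    bytes,
    withTracing ? tracingImports : noopImports
  );
  return instance;
}
```

A compile-time import along the lines discussed above would additionally let an engine inline the hook (or drop the no-op) instead of emitting a call.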

JS: wondering what unmodelled side effect await has; from the perspective of the abstract Wasm machine, await doesn't do anything, it stops time and waits for it to come back. It's supposed to give a guarantee about the embedder world, but nothing about Wasm itself.

TL: Await is interesting because the host code can make further calls back into the wasm VM while it is waiting.

JS: we could have a version of await that doesn’t have this problem, similar to atomics where you pause the world while waiting.

TL: Even in that more restricted await case, where you can't re-enter, it's more like a no-op. Except that you can't re-order with respect to other instructions.

RT: we want to think about program equivalence: a nop is something that can be removed entirely and the program is still the same.

JS: the ITT thing might actually be, from an abstract perspective, a nop. A valid implementation can ignore the ITT part.

RT: not my impression of that. Having no effect on module internal state is not the same as a nop.

JS: people are asking for an instruction that doesn't do anything on an arbitrary implementation, but it has to be treated differently from a nop in an optimizer

TL: the discussion now is along the lines of taking it case by case: for each proposal, see how to fit it into the spec, and not make a broad claim that these are out of scope and will be solved by a new mechanism that we do not yet have. Ideally, the CG figures out the best path forward for all of them as a group, so we don't need to repeat the discussion and come up with divergent designs.

BS: The danger I'm seeing here: the await proposal has been dropped, the debugger is a “fake” proposal, and ITT is the real one. I'm concerned that any plan we come up with here will be anchored too much on non-real examples. I'm more tempted to have another real example (a phase 1 proposal) before we come up with a plan as a CG

RT: a high-level item that would be useful to gauge: what is the interest in having compile-time intrinsics? They would also solve problems related to composing modules.

DG: Compile time intrinsics would solve some problems… [missed the rest here] (they could be instructions instead)

RT: you still have to have the intrinsics standardized, and intrinsics can be implemented while these are compile time

TL: still an option, although not a fan

DG: The motivation for making them non-instructions is not clear to me.

TL: on the await discussion, I laid out the reasoning: if the instruction, semantically, has unmodelled side effects, the best name for that instruction from the core wasm POV is “undefined”. I don't think core Wasm benefits from having an “undefined” instruction, other than a host call import, which already has undefined semantics.

DG: Is there an assumption that core spec should not have non-normative wording?

TL: the core spec document and the core spec's mathematical model described in that document have so far been the same. It doesn't include anything except the core spec.

DG: exception to that: alignment hints

TL: The alignment hint… they don't have any semantics, that's true.

RT: but they do have semantic effects, it’s just not visible anywhere inside Wasm

DG: for each of these features, there is a normative and a non-normative component. The former is for all VMs to recognize these instructions; the non-normative part would say what the thing is for. Can we put non-normative text in the core spec? Lots of specs have done this.

BS: we're timeboxed, sorry to cut this off

TL: there's a discussion on GitHub, please chime in.

Discussion: What are the requirements for toolchain support?

If a toolchain doesn't use/implement all instructions, does it support the proposal?

BS: who brought this up?

TL: I did

JS: basically paired with the previous discussion: if we do add intrinsics, what will the toolchain changes look like? Also related to last meeting's discussion on bulk memory

TL: this is not about unmodelled side effects; it's the general question of what happens if we have a proposal and the toolchain only uses/implements part of it. How did we end up in that place? Should we standardize the current state of the proposal including unused instructions? For bulk memory, we decided it was low risk, so we included everything. In the future we want to be more careful and make sure everything in a proposal will be used by the toolchain. Hopefully we don't get into that situation again.

HA: implementing it and using it are different. E.g. for reference types we can quickly implement the rest. The bigger problem is we didn't have a usage for it

TL: “we” being Emscripten.

HA: there were some instructions that even the Rust toolchain didn't have a usage for.

RT: the meta problem is, how we figure out how things are being used before they advance this far

NF: my understanding is that the instruction in question is table.fill, and the reason we added it was for symmetry with memory.fill. Nice in principle, but no one has a use case; we can delay these nice symmetry things until someone says they have a use case

BS: that's certainly been the case with instructions in the SIMD proposal; they have been removed due to lack of a use case or being too slow

AK: the downside is a long lag time between something being useful and it being implemented and available in all engines

JS: that rule is there to make sure we have shaken the tree and gotten enough implementation. We can enhance the requirements to require one producer to implement the instruction, so that there is enough infrastructure in place for someone to use the instruction if they want to. This is not that different from TC39 (?) requiring two major implementers to implement something before approving it for the standard.

DS: We have such a requirement already, so the question is: does the toolchain requirement mean that they have to use every instruction? If there is a subset of the proposal not being used, does that count?

JS: good point, these things are brought to the CG for a vote even though the toolchain has incomplete support; we want to encourage champions not to do that

AC: it's not so much about incomplete support; for the Rust toolchain it doesn't make sense, everything else makes sense. It's complete support, but the toolchain doesn't need everything in the proposal

TL: And part of the issue is that the toolchain requirement doesn't come up in the proposal process until too late.

RT: the purpose is to shake the tree and see that things are correct. If you don't have a producer producing that code, it's hard to evaluate. We need some producer involved to have a thorough evaluation of these things.

BS: what we want here is someone (not in this meeting) to think about how to change the process to have toolchain requirements earlier. We are sort of doing this already; we need to codify it. For the things currently proposed, is there a toolchain doing it?

TL: discussion in proposal repo sounds good

LI: does LLVM need to implement something before the toolchain can use it?

TL: No, for example, the wasm-bindgen stuff is all post-link, post-LLVM transformations. So when they add reference types stuff to their binaries, they don't have to use LLVM.

SC: also true for Emscripten; it is able to use features that LLVM doesn't support, through Binaryen

LI: Is the requirement that -- it's not that the toolchain has it, but that it can be used.

TL: maybe not every component of the toolchain has it, but some part of it has it

AC: depends on how you define toolchain. If it means LLVM then no

BS: the toolchain is supposed to be a chain; wasm-bindgen is part of the chain, though we don't normally think of it that way

AK (chat): I think it's likely this is going to vary a lot between proposals, I don't know if we're going to get much better than case-by-case examination. But agree that maybe having something for phase 3 about "evaluating how this will be used by toolchains" (don't know how to pin that down)

BS: probably need wordsmithing, then bring it back to discussion

TL: sounds good, I can start a discussion on the proposal repo.

Discussion of identity of references in wasm vs JS (Ross Tate) [15-20 minutes]

Prior offline discussion in linked issue encouraged as preparation

Ross Tate presenting Reference Identity in wasm vs. JS

JS: I don't understand how requiring this in JS prevents allocation. It seems like an embedder can do it if they want to.

RT: this is the advantage of it

JS: doing this achieves that, but it doesn't justify why it is required

RT: When I asked this, the reason I got was that it prevents an allocation.

TL: on the web, the JS engines all need to do it or all not do it, otherwise there is an observable difference

RT: Right, so we have to agree on some semantics. So the alternative semantics is that you always allocate whenever you ask for a new funcref in JS.

KM (chat): How does this work with tables? Is it a new identity every time you pull it out in JS?

RT: Keith asks about tables… it's related, but I'll get to that in a second.
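
(An illustrative aside, not stated in this form during the meeting: the observable question can be phrased as whether pulling the same slot out of a table twice on the JS side yields the same object.)

```ts
// Illustrative sketch of the identity question: does getting the same
// table slot twice from JS yield the same object?
function sameIdentity(table: WebAssembly.Table, index: number): boolean {
  // Under identity-preserving semantics this is always true; under
  // "allocate a fresh funcref wrapper on every access" semantics it
  // could be false, and JS code comparing functions would observe that.
  return table.get(index) === table.get(index);
}
```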

JS: I now have a corpus of 3000 modules, if someone wants to send me some code that can go through that please talk to me.

KM (chat): The one downside of this is you’d have the nested constructor problem that JS has if you ever have a wasm constructor. Each time you get a fresh function you’d have a distinct prototype so all your inline caches would blow up

RT: when you get a funcref and turn it into JS, you have a funcref prototype that everyone will be using; does that solve your problem?

KM (chat): It's the allocated objects from the constructor that are the problem. You'd have a fresh funcRef.prototype object

SC: Did you talk about tables? Emscripten does this, it checks tables to see whether the table values don't change over time.

RT: you'll get a new funcref every time; Emscripten is doing this?

AZ: For dynamic linking, yes, we do this all the time.

SC: posted a bug where someone found that the table is changed underneath (emscripten-core/emscripten#11458)
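
(Another illustrative sketch, using hypothetical names: "changed underneath" refers to a table slot being overwritten, for example when dynamic linking loads a side module, so a reference pulled out earlier can disagree with the table's current contents.)

```ts
// Hypothetical sketch: a table slot may be overwritten, e.g. when a side
// module is loaded. `replacement` is assumed to be an exported wasm
// function from that module.
function replaceSlot(
  table: WebAssembly.Table,
  index: number,
  replacement: Function
): boolean {
  const cached = table.get(index);
  table.set(index, replacement);
  // False once the slot has been replaced: callers that compare against
  // the previously cached reference no longer match the live table entry.
  return cached === table.get(index);
}
```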

AK: It might exist in sites that are already deployed, right?

AZ: we depend on this for dynamic linking. We can change this, but it will require a different model for dynamic linking, which we are thinking about anyway.

AK: changing for future users, not for existing users.

TL: Right, we wouldn't want to break existing users that are already using dynamic linking.

RT: currently funcrefs come up very infrequently, but the new reference types proposal makes funcref a more first-class type, so dependencies are more likely to build up in the coming months than now. Right now it's just table.get; understanding this Emscripten behavior will be useful

AZ: things like combining equivalent functions and getting rid of functions in the middle, we do those optimizations at the toolchain level; it is true that the VM can do them, but in Wasm we depend on the toolchain to do this

JS: wondering, thinking in terms of implementing equality while still wanting these optimizations: I'd want some runtime identity in the function prologue; if you want to reuse code, create a thunk that has the same identity but different code. Does that work?

RT: It means that function references have more content through them...

JS: or the thing they point to

RT: It would be a code pointer with a header, with identity information.

JS: now I'm paying for a load every time

RT: Alon's feedback is useful. The toolchain doesn't respect program equivalence, at least on the JS side. The toolchain can transform a wasm module such that the JS side is not equivalent

AZ: If we know in the toolchain that it's OK to do, then we do it. In practice in emscripten and binaryen we already do this.

HA: a related question: we have a lot of optimization passes that can remove a function, like merging or function inlining. Even without those optimizations, every single class of toolchain can optimize a function; the semantics are preserved but it's not the same function, the instructions in the function are different. In that respect, every single optimization will fail to preserve the identity of the JS function.

RT: the toolchain does not preserve this JS identity

AZ: in a typical Wasm program it is not observable; if you don't use dynamic linking or different exports, we can merge duplicates.

RT: How are you telling that it's not observable?

AZ: if you have two functions inside Wasm not in the table we can merge them.

RT: You're doing an analysis that they're never exported.

AZ: that changes behavior, but we know the runtime won't care about it

RT: In JS you can change semantics

TL: we control the JS too
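
(An illustrative sketch, not shown in the meeting, of why merging is observable: if a toolchain deduplicates two identical exported functions into one, an identity comparison on the JS side changes its answer.)

```ts
// Illustrative sketch: are two exports the same object from JS?
// `f` and `g` are hypothetical export names for two functions with
// identical bodies.
function exportsShareIdentity(instance: WebAssembly.Instance): boolean {
  // Before a merge-duplicates optimization: false (two distinct functions).
  // After the merge: true (both exports resolve to the same function).
  return instance.exports.f === instance.exports.g;
}
```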

HA: even without merging functions, optimization passes can change function bodies.

RT: the internals of a function can change all you want, but the identity of the function is what you leak

HA: for a function in the table, we can still optimize the function body, and the same slot in the table has the optimized function; most toolchain optimizations don't preserve identity

RT: so the issue is resolved because the toolchain does not respect identity

TL: we do change the observable behavior from JS; it's a small thing, and we also control the JS, so it hasn't been a problem so far

HA: this is a different thing from what TL said, it's about unobservable semantics. For example, two functions do the same thing, one optimized and the other not. In that case, are they equal? I guess not.

TL: no. Two different functions merging into one is observable from JS.

HA: not talking about merging.

RT: just talking about changing the function body? That’s totally fine.

RT: there is interest, I should work with JP on research

AK: I think we know there is a problem; we should work with the Emscripten folks

BS: we are trying timeboxing; follow up with me or on the GitHub issue (#592) to see if this is valuable and whether we should continue or not

Closure