STANDALONE_WASM option #9461

Merged
merged 25 commits into from
Sep 20, 2019

kripken
Member

@kripken kripken commented Sep 19, 2019

This adds an option to build "standalone" wasm files, that is, files that can be run without JavaScript. The idea is that they can be run in either

  • A wasm server runtime like wasmer or wasmtime, where the supported APIs are wasi.
  • A custom wasm embedding that uses wasm as plugins or dynamic libraries, where the supported APIs may include application-specific imports and exports, some subset of wasi, and other stuff (but probably not JS-specific things).

This removes the old EMITTING_JS flag, which was only used to check whether to minify wasm imports and exports. The new STANDALONE_WASM flag also prevents such minification (since we can't assume the JS will be the only thing to run the wasm), but is more general, in that we may still emit JS with that flag, but when we do the JS is only a convenient way to run the code on the Web or in Node.js, as the wasm can also be run standalone.

Note that SIDE_MODULE is an interesting case here: with wasm, a side module is just a wasm file (no JS), so we used to set EMITTING_JS to 0. However, we can't just set STANDALONE_WASM in that case, since the side module may expect to be linked with a main module that is not standalone, that is, that depends on JS (i.e. the side module may call things in the main module which are from JS).

Aside from side modules, though, if the user says -o X.wasm (emit wasm, no JS file) then we do set STANDALONE_WASM, since that is the likely intention (otherwise, without this flag running such a wasm file would be incredibly difficult since it wasn't designed for it!).
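
The defaulting rule described above can be sketched as a small decision function. This is a minimal illustration, not the code in emcc.py: the function name and its parameters are hypothetical, and it only encodes the three cases mentioned in this description (explicit flag, side module, `-o X.wasm`).

```python
def implies_standalone_wasm(output_name, side_module=False, standalone_flag=False):
    """Return True if a build should be treated as STANDALONE_WASM.

    Hypothetical helper; the names are illustrative, not Emscripten's.
    """
    # An explicit flag always wins, even when JS is also emitted.
    if standalone_flag:
        return True
    # A side module may call into a main module that depends on JS,
    # so a .wasm output alone does not imply standalone mode.
    if side_module:
        return False
    # "-o X.wasm" (wasm only, no JS file) most likely means standalone.
    return output_name.endswith(".wasm")
```

For example, `implies_standalone_wasm("X.wasm")` is true, while the same output name with `side_module=True` is not.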

The main reason for needing a new flag here is that while we can use many wasi APIs by default, like fd_write, there are some changes that are bad for JS. The main one is Memory handling: it's better to create the Memory early in JS, both to avoid fragmentation issues on 32-bit, and to allow using the Memory by JS while the wasm is still loading (e.g. to set up files). For standalone wasm, though, we can't do that since there is no JS to create it for us, and indeed wasi runtimes expect the memory to be created by the wasm itself and not imported. So STANDALONE_WASM basically means, "make an effort to make the wasm as standalone as possible, even if that wouldn't be good for JS." Without this flag we do still try to do that, but not when it compromises JS size.
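
To make the imported-versus-created Memory distinction concrete, here is a rough sketch of how one could inspect a wasm binary and report whether its memory is imported, defined in the module, or exported. It is not part of this PR: the function names and the two tiny hand-built modules are assumptions for illustration, and the parser only handles the MVP import kinds (function, table, memory, global).

```python
def read_u32(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_name(buf, pos):
    """Read a length-prefixed UTF-8 name; return (name, next_pos)."""
    n, pos = read_u32(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n


def skip_limits(buf, pos):
    """Skip a limits entry: flags byte, min, and max if bit 0 is set."""
    flags = buf[pos]
    _, pos = read_u32(buf, pos + 1)
    if flags & 1:
        _, pos = read_u32(buf, pos)
    return pos


def memory_summary(wasm):
    """Report whether a module imports, defines, and/or exports a memory."""
    if wasm[:8] != b"\x00asm\x01\x00\x00\x00":
        raise ValueError("not a wasm binary (version 1)")
    info = {"imported": False, "defined": False, "exported": False}
    pos = 8
    while pos < len(wasm):
        sec_id = wasm[pos]
        size, body = read_u32(wasm, pos + 1)
        if sec_id == 2:  # import section
            count, p = read_u32(wasm, body)
            for _ in range(count):
                _, p = read_name(wasm, p)  # module name
                _, p = read_name(wasm, p)  # field name
                kind = wasm[p]
                p += 1
                if kind == 0:    # function: type index
                    _, p = read_u32(wasm, p)
                elif kind == 1:  # table: element type byte + limits
                    p = skip_limits(wasm, p + 1)
                elif kind == 2:  # memory: limits
                    info["imported"] = True
                    p = skip_limits(wasm, p)
                elif kind == 3:  # global: value type + mutability
                    p += 2
                else:
                    raise ValueError("unsupported import kind %d" % kind)
        elif sec_id == 5:  # memory section
            count, _ = read_u32(wasm, body)
            info["defined"] = count > 0
        elif sec_id == 7:  # export section
            count, p = read_u32(wasm, body)
            for _ in range(count):
                _, p = read_name(wasm, p)
                kind = wasm[p]
                _, p = read_u32(wasm, p + 1)  # exported index
                if kind == 2:
                    info["exported"] = True
        pos = body + size  # jump to the next section
    return info


HEADER = b"\x00asm\x01\x00\x00\x00"
# Hand-built module that defines one memory and exports it as "memory".
STANDALONE_STYLE = (HEADER + bytes([5, 3, 1, 0, 1])
                    + bytes([7, 10, 1, 6]) + b"memory" + bytes([2, 0]))
# Hand-built module that imports its memory as "env"."memory".
JS_STYLE = (HEADER + bytes([2, 15, 1, 3]) + b"env"
            + bytes([6]) + b"memory" + bytes([2, 0, 1]))
```

Calling `memory_summary(STANDALONE_STYLE)` reports a defined and exported memory, while `memory_summary(JS_STYLE)` reports only an imported one, which mirrors the two layouts discussed above.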

This adds libstandalone_wasm which contains things in C to avoid using JS, like routing exit to wasi's proc_exit. There may be better ways to do some of those things, which I intend to look into as followups, but I think this is a good first step to get the flag landed in a working and tested state, in as small a PR as possible.

This adds testing in the form of running standalone wasm files in wasmer and wasmtime, on Linux. I have some ideas about how to generalize that in a nicer way, but want to leave that for followups.

This updates EMSCRIPTEN_METADATA - we need a new field to know whether a wasm is standalone or not, as the ABI is different.
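
As a rough illustration of what adding a field means at the byte level: the metadata values are written as LEB128 integers, so a new trailing field is just one more encoded integer appended to the list. The sketch below is a standalone unsigned LEB128 encoder, not Emscripten's `WebAssembly.lebify`; the numeric values are made up, and only the tail of the field list (the fields visible in this PR's diff) is modelled.

```python
def lebify(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_fields(values):
    """Concatenate the LEB128 encodings of a sequence of integers."""
    return b"".join(lebify(int(v)) for v in values)


# Illustrative values only: global_base, dynamic_base, dynamictop_ptr,
# tempdouble_ptr, then the new standalone flag appended at the end.
tail_before = encode_fields([1024, 5251008, 8000, 8016])
tail_after = encode_fields([1024, 5251008, 8000, 8016, True])
```

Here `tail_after` is exactly `tail_before` plus a single `0x01` byte, so the earlier fields keep their positions.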

After this lands I'll update the wasm standalone docs.

@kripken kripken requested review from sbc100 and dschuff September 19, 2019 16:18
@kripken
Member Author

kripken commented Sep 19, 2019

I realized this requires EMSCRIPTEN_METADATA changes - we need a new field to know whether a wasm is standalone or not, as the ABI is different. Added now.

@syrusakbary

Love this! Thanks for the effort there

Collaborator

@sbc100 sbc100 left a comment

This is a great step forward!

proc_exit: function(code) {
return _exit(code);
},
});
Collaborator

I like this. I really hope we can move fd_read in there and link it by default in the future.

@@ -152,6 +152,10 @@ var LibraryManager = {
libraries.push('library_glemu.js');
}

if (STANDALONE_WASM) {
libraries.push('library_wasi.js');
Collaborator

Why make this conditional? Aren't all the library functions only included on demand anyway?

Member Author

Yes, it's all on demand, but we also include some libraries only based on their flags, like the GL stuff right above this. I think it's nice as it reflects the fact that nothing should be used from those libraries without the flag; the error is clearer that way.

#if STANDALONE_WASM
// In pure wasm mode the memory is created in the wasm (not imported), and
// then exported.
// TODO: do not create a Memory earlier in JS
Collaborator

Seems like a TODO we certainly want to fix.

@@ -8296,16 +8296,21 @@ def run(args, expected):
run(['-s', 'TOTAL_MEMORY=32MB', '-s', 'ALLOW_MEMORY_GROWTH=1', '-s', 'BINARYEN=1'], (2 * 1024 * 1024 * 1024 - 65536) // 16384)
run(['-s', 'TOTAL_MEMORY=32MB', '-s', 'ALLOW_MEMORY_GROWTH=1', '-s', 'BINARYEN=1', '-s', 'WASM_MEM_MAX=128MB'], 2048 * 4)

def test_wasm_targets(self):
Collaborator

Is it OK that we seem to have lost test coverage for fastcomp here?

Member Author

Ah, good point, I missed something here. I thought it was fine to remove this, but actually it hid a possible regression: building in fastcomp without SIDE_MODULE but with -o X.wasm.

That never worked very well anyhow, it was a half-hearted attempt at standalone wasm files. So I am leaning towards showing a clear error in that case that the user should use the upstream wasm backend for standalone wasm?

Member Author

Looks like there is a reasonable fix actually, pushed now.

@@ -3090,7 +3106,8 @@ def add_emscripten_metadata(js_file, wasm_file):
 WebAssembly.lebify(global_base) +
 WebAssembly.lebify(dynamic_base) +
 WebAssembly.lebify(dynamictop_ptr) +
-WebAssembly.lebify(tempdouble_ptr)
+WebAssembly.lebify(tempdouble_ptr) +
+WebAssembly.lebify(int(Settings.STANDALONE_WASM))
Collaborator

I wonder if we should just not add this metadata for STANDALONE_WASM, since the metadata was mostly there to support loading emscripten-like wasm binaries outside of emscripten? Perhaps we can do that as a followup after consulting with the users of the metadata? I'm not sure exactly who those users are.

Member Author

I think we can check with people if it's unnecessary, but we should either land this with the change (which is the safe thing) or block this until we decide whether to add this change or not. I lean towards the former since the latter may take a while.

In general I think this may still be useful for the same people for the same reasons as before - these wasms are as standalone as we can make them, but still may contain e.g. WebGL calls if using OpenGL, etc. So they are not pure wasi in general. Which means it's useful to have metadata about the emscripten ABI, and this flag does affect the ABI.

Collaborator

Sure, we can land this now. I was just suggesting this as a followup.

Member Author

Hmm, I remembered that we have promised to only append to the metadata list. So if we land this we probably don't want to remove the field later.

However, as I said earlier, I think we do want this field, so I'm not worried. Are you still concerned?
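
The append-only promise is what lets older and newer readers coexist: a reader can treat a missing trailing field as a default and ignore trailing fields it does not know. The sketch below shows that idea under stated assumptions. It is not code from Emscripten or from any metadata consumer; the field names mirror only the tail of the list shown in the diff, and defaulting a missing standalone flag to 0 is an assumption.

```python
# Tail of the field list as it appears in the diff above; the real
# section has more fields before these, which are not modelled here.
FIELD_NAMES = ("global_base", "dynamic_base", "dynamictop_ptr",
               "tempdouble_ptr", "standalone_wasm")


def read_fields(data):
    """Decode a byte string of back-to-back unsigned LEB128 integers."""
    fields = []
    pos = 0
    while pos < len(data):
        value = shift = 0
        while True:
            byte = data[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        fields.append(value)
    return fields


def parse_tail(data):
    """Map decoded values onto FIELD_NAMES, tolerating length mismatches."""
    values = read_fields(data)
    # Older producers emit fewer fields: pad the missing tail with 0
    # (assumed default, i.e. "not standalone").
    values += [0] * (len(FIELD_NAMES) - len(values))
    # Newer producers may emit more fields: zip() drops the unknown extras.
    return dict(zip(FIELD_NAMES, values))
```

With this shape, metadata written before the new field existed still parses, with `standalone_wasm` reading as 0. Removing or reordering a field later would break that, which is why the list is append-only.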

Member Author

Re-reading your posts, were you saying that we should not emit EMSCRIPTEN_METADATA at all for STANDALONE_WASM? (I initially understood you to mean this new field specifically.)

Collaborator

I'm not talking about the field, but the entire metadata section. It seems to me that the point of this section is so that runtimes can extract the information they need from the "emscripten-flavor" wasm file and run it without JS.

My idea is that for PURE_WASM we may not need the section at all in the future. In which case, this field would be zero by definition, or we could bump the major version, which allows us to change it however we like.

My real long-term hope is that the users of this metadata section are in fact the exact same users who want to use PURE_WASM instead, and that PURE_WASM doesn't need to self-describe in this way, so we may be able to simply remove this completely?

Member Author

I see now, thanks.

Yeah, maybe we eventually won't need this metadata at all, if we only use standard APIs. That seems like the very far future though, as we'll need nonstandard APIs to do many things for quite some time, if not forever (e.g. WebGL).

@cggallant

Is it somehow possible to have a STANDALONE_WASM module export its table similar to how it exports its memory?

@kripken
Member Author

kripken commented Nov 27, 2019

@cggallant Not currently, but we should probably add an option for that, good point! I opened #9907 for discussion.

@sbc100
Collaborator

sbc100 commented Nov 27, 2019

I think you can just add -Wl,--export-table to the link-time flags. However, we should probably switch from a whitelist to a blacklist when filtering linker flags. I think today we block all unknown-to-emscripten linker flags.

@cggallant

Oh very nice! That flag worked and exported the table as __indirect_function_table. It's limited to a maximum of 2 elements. Is there a flag to increase that?

@sbc100
Collaborator

sbc100 commented Nov 27, 2019 via email

@cggallant

You had mentioned a white list? The --growable-table flag emits a warning message and the generated module is still fixed at 2 elements:
shared:WARNING: ignoring unsupported linker flag: --growable-table

I ran the --help flag on the wasm-ld.exe in my emsdk-master\upstream\bin folder and it gave me the same list that you posted above.

@cggallant

I was able to find the white list in the emcc.py file's SUPPORTED_LLD_LINKER_FLAGS

When I added --growable-table to the list and re-ran the command line, the generated module no longer has an upper limit for the table size.

With this, I was able to get my code working. Thank you very much for all your help!
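
For readers following the whitelist discussion, the behaviour seen above (one flag passed through, another dropped with a warning) can be sketched roughly as follows. This is an assumption-laden illustration, not the code in emcc.py: the function name is hypothetical and the set passed in stands in for a list like `SUPPORTED_LLD_LINKER_FLAGS`.

```python
def filter_linker_flags(args, supported):
    """Split -Wl, arguments into accepted and ignored linker flags.

    Hypothetical helper: `supported` plays the role of a whitelist of
    flag names that are forwarded to the linker.
    """
    accepted, ignored = [], []
    for arg in args:
        if not arg.startswith("-Wl,"):
            continue  # only linker pass-through arguments are considered
        for flag in arg[len("-Wl,"):].split(","):
            # Compare on the flag name so "--flag=value" matches "--flag".
            name = flag.split("=", 1)[0]
            if name in supported:
                accepted.append(flag)
            else:
                ignored.append(flag)
    return accepted, ignored
```

With only `--export-table` in the supported set, `-Wl,--export-table,--growable-table` yields `--growable-table` in the ignored list. Adding `--growable-table` to the set moves it to the accepted list, which corresponds to the local change described in the comment above.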

belraquib pushed a commit to belraquib/emscripten that referenced this pull request Dec 23, 2020
4 participants