STANDALONE_WASM option (emscripten-core#9461)
This adds an option to build "standalone" wasm files, that is, files that can be run without JavaScript. The idea is that they can be run in either:

* A wasm server runtime like wasmer or wasmtime, where the supported APIs are wasi.
* A custom wasm embedding that uses wasm as plugins or dynamic libraries, where the supported APIs may include application-specific imports and exports, some subset of wasi, and other things (but probably not JS-specific ones).

This removes the old EMITTING_JS flag, which was only used to check whether to minify wasm imports and exports. The new STANDALONE_WASM flag also prevents such minification (since we can't assume the JS will be the only thing to run the wasm), but it is more general: we may still emit JS with that flag, but when we do, the JS is only a convenient way to run the code on the Web or in Node.js, as the wasm can also be run standalone.
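
As a rough illustration of the minification rule, the Python sketch below mirrors the expectations encoded in this commit's updated test (`test_wasm_target_and_STANDALONE_WASM`). It is not emcc's implementation: the helper name `may_minify_imports_and_exports` and the `MINIFYING_OPT_LEVELS` set are hypothetical, and the set of optimization levels is an assumption read off the test table.

```python
# Hypothetical helper, not part of emcc. It restates what the updated test
# expects, not how emcc decides internally.
MINIFYING_OPT_LEVELS = {'-O3', '-Os'}  # assumption: taken from the test table


def may_minify_imports_and_exports(opts, target):
    """Return True if wasm import/export names may be minified."""
    # An explicit STANDALONE_WASM setting, or a bare .wasm target, means
    # something other than our JS may run the wasm, so names must stay stable.
    # (The side-module exception is ignored in this sketch.)
    standalone = 'STANDALONE_WASM' in opts or target.endswith('.wasm')
    if standalone:
        return False
    # Otherwise minification is only expected at the higher optimization levels.
    return any(opt in MINIFYING_OPT_LEVELS for opt in opts)
```

For example, `['-O3']` with `out.js` may be minified, while `['-O3', '-s', 'STANDALONE_WASM']` with the same target may not.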

Note that SIDE_MODULE is an interesting case here: with wasm, a side module is just a wasm file (no JS), so we used to set EMITTING_JS to 0. However, we can't just set STANDALONE_WASM in that case, since the side module may expect to be linked with a main module that is not standalone, that is, that depends on JS (i.e. the side module may call things in the main module which are from JS).

Aside from side modules, though, if the user says -o X.wasm (emit wasm, no JS file) then we do set STANDALONE_WASM, since that is the likely intention (otherwise, without this flag running such a wasm file would be incredibly difficult since it wasn't designed for it!).
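
The target rule in the two paragraphs above can be sketched in Python as follows. This is a simplified restatement of the emcc.py hunk in this commit, not the real code: `implies_standalone` and its parameters are hypothetical names, and the `WASM_ENDINGS` tuple here is only a stand-in for emcc's own constant.

```python
# Stand-in for emcc's WASM_ENDINGS constant (assumption: only '.wasm' is listed here).
WASM_ENDINGS = ('.wasm',)


def implies_standalone(target, wasm_backend=True, side_module=False):
    """Return True if a given output target turns STANDALONE_WASM on implicitly."""
    if not target.endswith(WASM_ENDINGS):
        # JS or HTML output: standalone mode must be requested explicitly.
        return False
    # A side module may link against a main module that depends on JS, and
    # fastcomp does not support standalone mode, so both are excluded.
    return wasm_backend and not side_module
```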

The main reason for needing a new flag here is that while we can use many wasi APIs by default, like fd_write, there are some changes that are bad for JS. The main one is Memory handling: it's better to create the Memory early in JS, both to avoid fragmentation issues on 32-bit, and to allow using the Memory by JS while the wasm is still loading (e.g. to set up files). For standalone wasm, though, we can't do that since there is no JS to create it for us, and indeed wasi runtimes expect the memory to be created and not imported. So STANDALONE_WASM basically means, "make an effort to make the wasm as standalone as possible, even if that wouldn't be good for JS." Without this flag we do still try to do that, but not when it compromises JS size.

This adds libstandalone_wasm which contains things in C to avoid using JS, like routing exit to wasi's proc_exit. There may be better ways to do some of those things, which I intend to look into as followups, but I think this is a good first step to get the flag landed in a working and tested state, in as small a PR as possible.
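
Two of the C helpers in that library, `emscripten_memcpy_big` and `emscripten_resize_heap`, come down to simple arithmetic: copying a large region in fixed-size chunks, and rounding a byte size up to whole wasm pages. The Python sketch below only illustrates that arithmetic. The function names `chunked_copy` and `pages_needed` are hypothetical; the two constants are the ones used in the C source of this commit.

```python
WASM_PAGE_SIZE = 65536  # bytes per wasm page, as in standalone_wasm.c
CHUNK = 8192            # chunk size used by the C memcpy fallback


def pages_needed(size):
    """Round a size in bytes up to a whole number of wasm pages."""
    return (size + WASM_PAGE_SIZE - 1) // WASM_PAGE_SIZE


def chunked_copy(dest, src, n, chunk=CHUNK):
    """Copy the first n bytes of src into dest (a bytearray), one chunk at a time."""
    offset = 0
    while offset < n:
        # The last chunk may be shorter than the others.
        step = min(chunk, n - offset)
        dest[offset:offset + step] = src[offset:offset + step]
        offset += step
    return dest
```

A request for 65537 bytes therefore needs two pages, and a 20000-byte copy is performed as two full chunks followed by one shorter chunk.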

This adds testing in the form of running standalone wasm files in wasmer and wasmtime, on Linux. I have some ideas about how to generalize that in a nicer way, but want to leave that for followups.
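
A minimal sketch of that testing approach, assuming a runtime binary may or may not be installed at a given path: run the wasm file if the runtime exists, and skip otherwise. The helper name `run_in_wasm_vm` and its parameters are hypothetical; the paths and command lines in the trailing comments are the ones used by the test in this commit.

```python
import os
import subprocess


def run_in_wasm_vm(runtime_path, wasm_file, extra_args=()):
    """Run wasm_file in the given runtime and return its stdout.

    Returns None when the runtime binary is not installed, so callers can
    skip the check instead of failing.
    """
    runtime_path = os.path.expanduser(runtime_path)
    if not os.path.isfile(runtime_path):
        return None
    cmd = [runtime_path, *extra_args, wasm_file]
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


# Example calls, matching the command lines in the test:
#   run_in_wasm_vm('~/.wasmer/bin/wasmer', 'out.wasm', extra_args=['run'])
#   run_in_wasm_vm('~/wasmtime', 'out.wasm')
```

Returning `None` for a missing runtime corresponds to the test's behavior of printing a warning rather than failing when wasmer or wasmtime is absent.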

This updates EMSCRIPTEN_METADATA - we need a new field to know whether a wasm is standalone or not, as the ABI is different.
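
The emcc.py changes in this commit also reject a few settings that standalone mode does not support yet. The Python sketch below restates those checks over a plain dict. The function `check_standalone_settings` is hypothetical, and unlike emcc, which exits on the first failing check, it collects every message so the result is easy to inspect.

```python
def check_standalone_settings(settings):
    """Return error messages for settings incompatible with STANDALONE_WASM."""
    errors = []
    if not settings.get('STANDALONE_WASM'):
        # Nothing to validate when standalone mode is off.
        return errors
    if not settings.get('WASM_BACKEND'):
        errors.append('STANDALONE_WASM is only available in the upstream wasm backend path')
    unsupported = (
        ('USE_PTHREADS', 'pthreads'),
        ('SIMD', 'simd'),
        ('ALLOW_MEMORY_GROWTH', 'memory growth'),
    )
    for name, label in unsupported:
        if settings.get(name):
            errors.append('STANDALONE_WASM does not support %s yet' % label)
    return errors
```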
kripken authored and belraquib committed Dec 23, 2020
1 parent a36a42a commit 5d39e70
Showing 13 changed files with 232 additions and 26 deletions.
10 changes: 10 additions & 0 deletions .circleci/config.yml
@@ -420,6 +420,16 @@ jobs:
apt-get update -q
apt-get install -q -y python3 cmake
- checkout
- run:
name: get wasmer
command: |
curl https://get.wasmer.io -sSfL | sh
- run:
name: get wasmtime
command: |
wget https://github.com/CraneStation/wasmtime/releases/download/dev/wasmtime-dev-x86_64-linux.tar.xz
tar -xf wasmtime-dev-x86_64-linux.tar.xz
cp wasmtime-dev-x86_64-linux/wasmtime ~/
- build-upstream
test-upstream-wasm0:
executor: bionic
4 changes: 4 additions & 0 deletions ChangeLog.md
@@ -23,6 +23,10 @@ Current Trunk
- Module.abort is no longer exported by default. It can be exported in the normal
way using `EXTRA_EXPORTED_RUNTIME_METHODS`, and as with other such changes in
the past, forgetting to export it with show a clear error in `ASSERTIONS` mode.
- Remove `EMITTING_JS` flag, and replace it with `STANDALONE_WASM`. That flag indicates
that we want the wasm to be as standalone as possible. We may still emit JS in
that case, but the JS would just be a convenient way to run the wasm on the Web
or in Node.js.

v.1.38.44: 09/11/2019
---------------------
25 changes: 22 additions & 3 deletions emcc.py
@@ -1697,9 +1697,14 @@ def check_human_readable_list(items):
if use_source_map(options):
exit_with_error('wasm2js does not support source maps yet (debug in wasm for now)')

# wasm outputs are only possible with a side wasm
if target.endswith(WASM_ENDINGS):
shared.Settings.EMITTING_JS = 0
# if the output is just a wasm file, it will normally be a standalone one,
# as there is no JS. an exception are side modules, as we can't tell at
# compile time whether JS will be involved or not - the main module may
# have JS, and the side module is expected to link against that.
# we also do not support standalone mode in fastcomp.
if shared.Settings.WASM_BACKEND and not shared.Settings.SIDE_MODULE:
shared.Settings.STANDALONE_WASM = 1
js_target = misc_temp_files.get(suffix='.js').name

if shared.Settings.EVAL_CTORS:
@@ -1764,6 +1769,19 @@ def check_human_readable_list(items):
if shared.Settings.MINIMAL_RUNTIME and not shared.Settings.WASM:
options.separate_asm = True

if shared.Settings.STANDALONE_WASM:
if not shared.Settings.WASM_BACKEND:
exit_with_error('STANDALONE_WASM is only available in the upstream wasm backend path')
if shared.Settings.USE_PTHREADS:
exit_with_error('STANDALONE_WASM does not support pthreads yet')
if shared.Settings.SIMD:
exit_with_error('STANDALONE_WASM does not support simd yet')
if shared.Settings.ALLOW_MEMORY_GROWTH:
exit_with_error('STANDALONE_WASM does not support memory growth yet')
# the wasm must be runnable without the JS, so there cannot be anything that
# requires JS legalization
shared.Settings.LEGALIZE_JS_FFI = 0

if shared.Settings.WASM_BACKEND:
if shared.Settings.SIMD:
newargs.append('-msimd128')
@@ -3012,7 +3030,8 @@ def do_binaryen(target, asm_target, options, memfile, wasm_binary_target,
wasm_file=wasm_binary_target,
expensive_optimizations=will_metadce(options),
minify_whitespace=optimizer.minify_whitespace,
debug_info=intermediate_debug_info)
debug_info=intermediate_debug_info,
emitting_js=not target.endswith(WASM_ENDINGS))
save_intermediate_with_wasm('postclean', wasm_binary_target)

def run_closure_compiler(final):
2 changes: 2 additions & 0 deletions emscripten.py
@@ -2321,6 +2321,8 @@ def debug_copy(src, dst):
cmd.append('--global-base=%s' % shared.Settings.GLOBAL_BASE)
if shared.Settings.SAFE_STACK:
cmd.append('--check-stack-overflow')
if shared.Settings.STANDALONE_WASM:
cmd.append('--standalone-wasm')
shared.print_compiler_stage(cmd)
stdout = shared.check_call(cmd, stdout=subprocess.PIPE).stdout
if write_source_map:
2 changes: 1 addition & 1 deletion site/source/docs/tools_reference/emcc.rst
@@ -443,7 +443,7 @@ Options that are modified or new in *emcc* are listed below:
- <name> **.html** : HTML + separate JavaScript file (**<name>.js**; + separate **<name>.wasm** file if emitting WebAssembly).
- <name> **.bc** : LLVM bitcode.
- <name> **.o** : LLVM bitcode (same as .bc), unless in `WASM_OBJECT_FILES` mode, in which case it will contain a WebAssembly object.
- <name> **.wasm** : WebAssembly without JavaScript support code ("standalone wasm").
- <name> **.wasm** : WebAssembly without JavaScript support code ("standalone wasm"; this enables ``STANDALONE_WASM``).

.. note:: If ``--memory-init-file`` is used, a **.mem** file will be created in addition to the generated **.js** and/or **.html** file.

14 changes: 14 additions & 0 deletions src/library_wasi.js
@@ -0,0 +1,14 @@
/*
* Copyright 2019 The Emscripten Authors. All rights reserved.
* Emscripten is available under two separate licenses, the MIT license and the
* University of Illinois/NCSA Open Source License. Both these licenses can be
* found in the LICENSE file.
*/

mergeInto(LibraryManager.library, {
  proc_exit__deps: ['exit'],
  proc_exit: function(code) {
    return _exit(code);
  },
});

4 changes: 4 additions & 0 deletions src/modules.js
@@ -152,6 +152,10 @@ var LibraryManager = {
libraries.push('library_glemu.js');
}

if (STANDALONE_WASM) {
libraries.push('library_wasi.js');
}

libraries = libraries.concat(additionalLibraries);

if (BOOTSTRAPPING_STRUCT_INFO) libraries = ['library_bootstrap_structInfo.js', 'library_formatString.js'];
6 changes: 6 additions & 0 deletions src/preamble.js
@@ -989,6 +989,12 @@ function createWasm() {
exports = Asyncify.instrumentWasmExports(exports);
#endif
Module['asm'] = exports;
#if STANDALONE_WASM
// In pure wasm mode the memory is created in the wasm (not imported), and
// then exported.
// TODO: do not create a Memory earlier in JS
updateGlobalBufferAndViews(exports['memory'].buffer);
#endif
#if USE_PTHREADS
// Keep a reference to the compiled module so we can post it to the workers.
wasmModule = module;
29 changes: 26 additions & 3 deletions src/settings.js
@@ -1049,9 +1049,6 @@ var EMTERPRETIFY_SYNCLIST = [];
// whether js opts will be run, after the main compiler
var RUNNING_JS_OPTS = 0;

// whether we are emitting JS glue code
var EMITTING_JS = 1;

// whether we are in the generate struct_info bootstrap phase
var BOOTSTRAPPING_STRUCT_INFO = 0;

@@ -1074,6 +1071,31 @@ var USE_GLFW = 2;
// still make sense there, see that option for more details.
var WASM = 1;

// STANDALONE_WASM indicates that we want to emit a wasm file that can run without
// JavaScript. The file will use standard APIs such as wasi as much as possible
// to achieve that.
//
// This option does not guarantee that the wasm can be used by itself - if you
// use APIs with no non-JS alternative, we will still use those (e.g., OpenGL
// at the time of writing this). This gives you the option to see which APIs
// are missing, and if you are compiling for a custom wasi embedding, to add
// those to your embedding.
//
// We may still emit JS with this flag, but the JS should only be a convenient
// way to run the wasm on the Web or in Node.js, and you can run the wasm by
// itself without that JS (again, unless you use APIs for which there is no
// non-JS alternative) in a wasm runtime like wasmer or wasmtime.
//
// Note that even without this option we try to use wasi etc. syscalls as much
// as possible. What this option changes is that we do so even when it means
// a tradeoff with JS size. For example, when this option is set we do not
// import the Memory - importing it is useful for JS, so that JS can start to
// use it before the wasm is even loaded, but in wasi and other wasm-only
// environments the expectation is to create the memory in the wasm itself.
// Doing so prevents some possible JS optimizations, so we only do it behind
// this flag.
var STANDALONE_WASM = 0;

// Whether to use the WebAssembly backend that is in development in LLVM. You
// should not set this yourself, instead set EMCC_WASM_BACKEND=1 in the
// environment.
@@ -1638,4 +1660,5 @@ var LEGACY_SETTINGS = [
['PRECISE_I64_MATH', [1, 2], 'Starting from Emscripten 1.38.26, PRECISE_I64_MATH is always enabled (https://github.com/emscripten-core/emscripten/pull/7935)'],
['MEMFS_APPEND_TO_TYPED_ARRAYS', [1], 'Starting from Emscripten 1.38.26, MEMFS_APPEND_TO_TYPED_ARRAYS=0 is no longer supported. MEMFS no longer supports using JS arrays for file data (https://github.com/emscripten-core/emscripten/pull/7918)'],
['ERROR_ON_MISSING_LIBRARIES', [1], 'missing libraries are always an error now'],
['EMITTING_JS', [1], 'The new STANDALONE_WASM flag replaces this (replace EMITTING_JS=0 with STANDALONE_WASM=1)'],
];
67 changes: 67 additions & 0 deletions system/lib/standalone_wasm.c
@@ -0,0 +1,67 @@
/*
* Copyright 2019 The Emscripten Authors. All rights reserved.
* Emscripten is available under two separate licenses, the MIT license and the
* University of Illinois/NCSA Open Source License. Both these licenses can be
* found in the LICENSE file.
*/

#include <emscripten.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#include <wasi/wasi.h>

/*
* WASI support code. These are compiled with the program, and call out
* using wasi APIs, which can be provided either by a wasi VM or by our
* emitted JS.
*/

// libc

void exit(int status) {
  __wasi_proc_exit(status);
  __builtin_unreachable();
}

void abort() {
  exit(1);
}

// Musl lock internals. As we assume wasi is single-threaded for now, these
// are no-ops.

void __lock(void* ptr) {}
void __unlock(void* ptr) {}

// Emscripten additions

void *emscripten_memcpy_big(void *restrict dest, const void *restrict src, size_t n) {
  // This normally calls out into JS which can do a single fast operation,
  // but with wasi we can't do that. As this is called when n >= 8192, we
  // can just split into smaller calls.
  // TODO optimize, maybe build our memcpy with a wasi variant, maybe have
  // a SIMD variant, etc.
  const int CHUNK = 8192;
  unsigned char* d = (unsigned char*)dest;
  unsigned char* s = (unsigned char*)src;
  while (n > 0) {
    size_t curr_n = n;
    if (curr_n > CHUNK) curr_n = CHUNK;
    memcpy(d, s, curr_n);
    // Advance by the amount actually copied, so the pointers never move
    // past the end of the buffers on the final, possibly shorter, chunk.
    d += curr_n;
    s += curr_n;
    n -= curr_n;
  }
  return dest;
}

static const int WASM_PAGE_SIZE = 65536;

// Note that this does not support memory growth in JS because we don't update the JS
// heaps. Wasm and wasi lack a good API for that.
int emscripten_resize_heap(size_t size) {
  size_t result = __builtin_wasm_memory_grow(0, (size + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE);
  return result != (size_t)-1;
}
46 changes: 36 additions & 10 deletions tests/test_other.py
@@ -30,7 +30,7 @@
raise Exception('do not run this file directly; do something like: tests/runner.py other')

from tools.shared import Building, PIPE, run_js, run_process, STDOUT, try_delete, listify
from tools.shared import EMCC, EMXX, EMAR, EMRANLIB, PYTHON, FILE_PACKAGER, WINDOWS, MACOS, LLVM_ROOT, EMCONFIG, EM_BUILD_VERBOSE
from tools.shared import EMCC, EMXX, EMAR, EMRANLIB, PYTHON, FILE_PACKAGER, WINDOWS, MACOS, LINUX, LLVM_ROOT, EMCONFIG, EM_BUILD_VERBOSE
from tools.shared import CLANG, CLANG_CC, CLANG_CPP, LLVM_AR
from tools.shared import NODE_JS, SPIDERMONKEY_ENGINE, JS_ENGINES, V8_ENGINE
from tools.shared import WebAssembly
@@ -8287,16 +8287,22 @@ def run(args, expected):
run(['-s', 'TOTAL_MEMORY=32MB', '-s', 'ALLOW_MEMORY_GROWTH=1', '-s', 'BINARYEN=1'], (2 * 1024 * 1024 * 1024 - 65536) // 16384)
run(['-s', 'TOTAL_MEMORY=32MB', '-s', 'ALLOW_MEMORY_GROWTH=1', '-s', 'BINARYEN=1', '-s', 'WASM_MEM_MAX=128MB'], 2048 * 4)

def test_wasm_targets(self):
def test_wasm_target_and_STANDALONE_WASM(self):
# STANDALONE_WASM means we never minify imports and exports.
for opts, potentially_expect_minified_exports_and_imports in (
([], False),
(['-O2'], False),
(['-O3'], True),
(['-Os'], True),
([], False),
(['-O2'], False),
(['-O3'], True),
(['-O3', '-s', 'STANDALONE_WASM'], False),
(['-Os'], True),
):
if 'STANDALONE_WASM' in opts and not self.is_wasm_backend():
continue
# targeting .wasm (without .js) means we enable STANDALONE_WASM automatically, and don't minify imports/exports
for target in ('out.js', 'out.wasm'):
expect_minified_exports_and_imports = potentially_expect_minified_exports_and_imports and target.endswith('.js')
print(opts, potentially_expect_minified_exports_and_imports, target, ' => ', expect_minified_exports_and_imports)
standalone = target.endswith('.wasm') or 'STANDALONE_WASM' in opts
print(opts, potentially_expect_minified_exports_and_imports, target, ' => ', expect_minified_exports_and_imports, standalone)

self.clear()
run_process([PYTHON, EMCC, path_from_root('tests', 'hello_world.cpp'), '-o', target] + opts)
@@ -8308,13 +8314,33 @@ def test_wasm_targets(self):
exports = [line.strip().split(' ')[1].replace('"', '') for line in wast_lines if "(export " in line]
imports = [line.strip().split(' ')[2].replace('"', '') for line in wast_lines if "(import " in line]
exports_and_imports = exports + imports
print(exports)
print(imports)
print(' exports', exports)
print(' imports', imports)
if expect_minified_exports_and_imports:
assert 'a' in exports_and_imports
else:
assert 'a' not in exports_and_imports
assert 'memory' in exports_and_imports, 'some things are not minified anyhow'
assert 'memory' in exports_and_imports or 'fd_write' in exports_and_imports, 'some things are not minified anyhow'
# verify the wasm runs with the JS
if target.endswith('.js'):
self.assertContained('hello, world!', run_js('out.js'))
# verify the wasm runs in a wasm VM, without the JS
# TODO: more platforms than linux
if LINUX and standalone and self.is_wasm_backend():
WASMER = os.path.expanduser(os.path.join('~', '.wasmer', 'bin', 'wasmer'))
if os.path.isfile(WASMER):
print(' running in wasmer')
out = run_process([WASMER, 'run', 'out.wasm'], stdout=PIPE).stdout
self.assertContained('hello, world!', out)
else:
print('[WARNING - no wasmer]')
WASMTIME = os.path.expanduser(os.path.join('~', 'wasmtime'))
if os.path.isfile(WASMTIME):
print(' running in wasmtime')
out = run_process([WASMTIME, 'out.wasm'], stdout=PIPE).stdout
self.assertContained('hello, world!', out)
else:
print('[WARNING - no wasmtime]')

def test_wasm_targets_side_module(self):
# side modules do allow a wasm target