Initial WebAssembly support #457

Merged
merged 14 commits into from
Apr 16, 2019
@ilammy (Collaborator) commented Apr 10, 2019

This patch set enables Themis to be compiled using standard Emscripten toolchain for WebAssembly targets:

emmake make test

The resulting library passes all unit tests (see x86_64 target on CircleCI). However, that's all for now: we can compile the library and it seems to work by itself. We're still far from full WebAssembly support, so it should be considered experimental.

For example, it would be very interesting to see if the integration tests pass because Wasm is a 32-bit target. Also, currently only embedded BoringSSL is supported as the cryptographic backend, and even that requires specific tweaks.


  • Detect and support Emscripten builds

The users are expected to build Themis for Wasm by using emmake helper tool that sets up all necessary environment variables. For example, CC variable will be set to Emscripten's cross-compiler.

Detect when we're building under emmake by inspecting the compiler type, provide IS_EMSCRIPTEN variable if that's the case. We're going to use this switch to customize a bunch of build steps.

First of all, we need to run CMake for embedded BoringSSL using the emconfigure helper that sets it up properly. Additionally, we need to disable assembly usage in BoringSSL's build as it detects the platform incorrectly and tries to use x86 assembly (which obviously will not work on Wasm targets).
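
A rough shell sketch of the detection and BoringSSL configuration steps described above. The real logic lives in the Makefile; the function names here are hypothetical, and using BoringSSL's `OPENSSL_NO_ASM` CMake option to turn off assembly is an assumption about how this could be done, not necessarily what the patch set does.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Succeeds when a compiler's --version banner mentions Emscripten.
banner_is_emscripten() {
    printf '%s\n' "$1" | grep -qi 'emscripten'
}

# Succeeds when the compiler given as $1 (default: $CC, then cc) is Emscripten's.
# emmake is expected to have set CC to the cross-compiler.
is_emscripten() {
    banner_is_emscripten "$("${1:-${CC:-cc}}" --version 2>&1)"
}

# Configure embedded BoringSSL out of tree.
# $1: absolute path to the BoringSSL sources, $2: build directory.
configure_boringssl() {
    mkdir -p "$2" || return 1
    if is_emscripten; then
        # emconfigure points CMake at the Emscripten toolchain;
        # OPENSSL_NO_ASM keeps BoringSSL from selecting x86 assembly.
        ( cd "$2" && emconfigure cmake -DOPENSSL_NO_ASM=1 "$1" )
    else
        ( cd "$2" && cmake "$1" )
    fi
}
```

Under `emmake`, `is_emscripten` should succeed without arguments because `CC` already points at the cross-compiler.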

  • Switch to CossackLabs fork of BoringSSL

Some settings required for Emscripten build are not configurable (yet) by setting CMake variables, unfortunately. They require direct changes in BoringSSL's CMake configuration. Switch to our fork of BoringSSL that contains the changes we need to get the Wasm target working.

Hopefully, we're going to get these changes merged into BoringSSL upstream and revert back to using the original source code.

  • Use BoringSSL as default engine for Emscripten

Currently we test Wasm builds only with (embedded) BoringSSL so it makes sense to set this engine as default when building with Emscripten.

  • Add Emscripten support to themis/portable_endian.h

Emscripten platform is similar to Linux and has <endian.h> available. Emscripten uses 32-bit integers because of the way numbers work in JavaScript. It is also little-endian-only at the moment due to technical limitations of the implementation.

  • Support Emscripten in Soter and Themis tests

We can build Soter and Themis tests and run them, but there are a couple of caveats here.

First of all, emcc output depends on file extension. File names "themis_test" and "soter_test" result in LLVM bitcode being packed in them, which is not what we want. We should use "*.js" extension to get something runnable. Introduce intermediate variables to rename the tests based on the target environment. Use themis_test.js for Emscripten and old themis_test for everything else.

Additionally, we use SINGLE_FILE flag to produce a single *.js file instead of a separate *.wasm file with compiled code and a *.js loader for it. This makes it easier to launch the tests as we don't have to change directories and don't generate extra files.

It also turned out that popen() function is not available in Emscripten runtime. Because of that we cannot use NIST STS as is. Disable it for now. Hopefully, we'll figure out how to use it later.

The next point is that we cannot run the resulting tests directly. We should use Node.js to interpret them rather than execute the *.js files ourselves (which is usually not possible on typical systems). Therefore, check whether the node binary is available and use it to run tests when requested, but only in the Emscripten environment. If Node.js is not available, print a warning and fail the build.
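
The naming and launching rules above can be sketched in shell roughly as follows. The function names and the "1"/"0" Emscripten flag are hypothetical; the actual implementation uses Makefile variables.

```shell
#!/bin/sh
# Sketch only: function names and the emscripten flag argument are hypothetical.

# Print the output file name for a test binary.
# $1: base name, $2: "1" when building with Emscripten.
# emcc picks the output format from the extension, so Wasm builds get ".js",
# e.g.: emcc ... -s SINGLE_FILE=1 -o "$(test_binary_name themis_test 1)"
test_binary_name() {
    if [ "$2" = "1" ]; then
        printf '%s.js\n' "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Run a test binary: through Node.js for Emscripten builds, directly otherwise.
# $1: path to the test binary, $2: "1" when building with Emscripten.
run_test() {
    if [ "$2" = "1" ]; then
        if ! command -v node >/dev/null 2>&1; then
            echo "warning: node not found, cannot run $1" >&2
            return 1
        fi
        node "$1"
    else
        "$1"
    fi
}
```

For example, `run_test "./$(test_binary_name themis_test 1)" 1` would launch `themis_test.js` with Node.js, or fail with a warning if `node` is missing.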

  • Fix broken Secure Cell tests

Remember that 0xDEADBEEF constant which got changed to 0x1337BEEF because 32-bit systems could not handle that value? Here we are again.

This value is used as a 'corrupted' message length value. Eventually it will be used to allocate appropriate amount of memory so that the test can fail successfully.

Real 32-bit Linux systems can allocate 0x1337BEEF bytes of memory just fine. It is possible due to the way memory allocation works on Linux with glibc. However, the Wasm runtime has much stricter memory limits and is not able to reserve the memory with lazy mapping the way Linux does it. Furthermore, instead of returning NULL from malloc() the runtime aborts by default.

Lower the value of this constant again. 16 KB should be big enough to be considered invalid, but small enough to be successfully allocated.

  • Run Emscripten tests on CircleCI

In order to compile Wasm on CircleCI we need to install Emscripten toolchain. Do that in accordance with the official docs. After that the build should just work out of the box. Take care to cache as much as possible so that we don't have to download LLVM and recompile the C standard library each build.

@ilammy added the infrastructure (Automated building and packaging) and W-WasmThemis 🌐 (Wrapper: WasmThemis, JavaScript API, npm packages) labels Apr 10, 2019
Fix incorrectly populated BASH_ENV value. We should not use relative
paths there as the file is going to be loaded from everywhere.
Also redirect the script output to /dev/null because it gets
annoying to see it with every Bash invocation (which happens quite
often during the build). Just use it to set the variables.
Do not use "-S" and "-B" options because the CMake we have in Docker
environment does not support them. Use the old approach of changing
directory to the build tree and running CMake from there.
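
A minimal sketch of the old-style invocation, for CMake versions that predate the "-S" and "-B" options (they were added in CMake 3.13). The function name and the CMAKE override variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: the function name and the CMAKE override are hypothetical.

# Configure an out-of-tree build without the -S/-B options.
# $1: source directory, $2: build directory, rest: extra CMake arguments.
cmake_out_of_tree() {
    # Resolve to an absolute path so it stays valid after the cd below.
    src_dir=$(cd "$1" && pwd) || return 1
    build_dir=$2
    shift 2
    mkdir -p "$build_dir" || return 1
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$build_dir" && "${CMAKE:-cmake}" "$@" "$src_dir" )
}
```

With a newer CMake the same step would be `cmake -S <source> -B <build>`.
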
Actually, don't activate Emscripten environment for all the build.
Do this only for the step where we run Wasm tests.

Activating Emscripten environment adds Clang and LLVM into PATH which
breaks Rust code compilation that relies on its own particular Clang
and LLVM present.
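
One way to keep the Emscripten environment confined to a single step is to activate it in a subshell, roughly like this. The function name and the emsdk location are hypothetical; the actual CI config may do it differently.

```shell
#!/bin/sh
# Sketch only: the function name and the emsdk location are hypothetical.

# Run a command with the Emscripten environment active, without leaking
# Emscripten's Clang and LLVM into the PATH of later build steps.
# $1: emsdk checkout containing emsdk_env.sh, rest: command to run.
with_emscripten() {
    emsdk_dir=$1
    shift
    # The subshell confines PATH changes; emsdk_env.sh output is discarded.
    ( . "$emsdk_dir/emsdk_env.sh" >/dev/null 2>&1 && "$@" )
}
```

Usage would look like `with_emscripten "$HOME/emsdk" emmake make test`, leaving the Rust steps with their own Clang and LLVM.
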
Make sure that submodule URL is updated before trying to pull changes.
This ensures that CI is fetching the correct submodule even if it
caches .git directory somewhere.
@ilammy (Collaborator, Author) commented Apr 11, 2019

Okay, it seems I have finally gotten the code right. But now CircleCI refuses to update BoringSSL submodule correctly. The URL seems to be updated to our fork, but the submodule commit itself is still the old one. It seems to be caused by CircleCI's cache mumbo-jumbo, still looking for a button to nuke all the caches...

@vixentael (Contributor) left a comment

I'm sooooo excited!

This will be huuuge! 🤟

@@ -23,7 +23,7 @@
 #endif

-#if defined(__linux__) || defined(__CYGWIN__)
+#if defined(__linux__) || defined(__CYGWIN__) || defined(__EMSCRIPTEN__)
@vixentael (Contributor):

at this point I realized that we actually have portable_endian header O_O

@@ -26,7 +26,7 @@
#define MAX_MESSAGE_SIZE 4096

/* Keep it under 2^31 to support 32-bit systems. */
#define CORRUPTED_LENGTH 0x1337BEEF
@vixentael (Contributor):

an era has gone :(

@ilammy (Collaborator, Author):

It just occurred to me that simply 0xBEEF, or 0xC0DE, or maybe even 0x31337 should be fine. Got any favorites?

Continue using hexspeak for constant, just use something "safe", teehee...
For some reason, CircleCI likes to "git add" the submodule after
checking out the source, thus preventing it from being updated
normally when "git submodule update" is run. Add a "git reset" call
before updating submodules. This seems to work, based on my attempts
at debugging the build via SSH.
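
Put together with the earlier submodule URL fix, the checkout step might look roughly like the sketch below. The function name and the GIT override variable are hypothetical, and the exact order of commands in the CI config may differ.

```shell
#!/bin/sh
# Sketch only: the function name and the GIT override are hypothetical.

refresh_submodules() {
    git_cmd=${GIT:-git}
    # Pick up submodule URL changes from .gitmodules (e.g. a switch to a fork).
    "$git_cmd" submodule sync &&
    # Unstage whatever CI may have "git add"-ed after checkout.
    "$git_cmd" reset HEAD &&
    # Check out the submodule commits recorded in the superproject.
    "$git_cmd" submodule update --init
}
```
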
@ilammy (Collaborator, Author) commented Apr 12, 2019

Adding a git reset HEAD tightened some screws in CircleCI's git operations and it managed to check out the correct working copy. I'm so relieved to finally see the build go green not only on my machine.

@vixentael (Contributor):

yeeeeah! 🎉

@ilammy mentioned this pull request Apr 12, 2019
@ilammy merged commit b759b00 into master Apr 16, 2019
@ilammy deleted the ilammy/initial-wasm branch April 16, 2019 13:41