Initial WebAssembly support #457
Users are expected to build Themis for Wasm with the "emmake" helper tool, which sets up all necessary environment variables. For example, the CC variable will be set to Emscripten's cross-compiler.

Detect when we're building under emmake by inspecting the compiler type, and provide the IS_EMSCRIPTEN variable if that's the case. We're going to use this switch to customize a bunch of build steps.

First of all, we need to run CMake for embedded BoringSSL using the "emconfigure" helper that sets it up properly. Additionally, we need to disable assembly usage in BoringSSL's build, as it detects the platform incorrectly and tries using x86 assembly (which will not work on Wasm targets, obviously).
Some settings required for Emscripten build are not configurable (yet) by setting CMake variables, unfortunately. They require direct changes in BoringSSL's CMake configuration. Switch to our fork of BoringSSL that contains the changes we need to get the Wasm target working. Hopefully, we're going to get these changes merged into BoringSSL upstream and revert back to using the original source code.
Currently we test Wasm builds only with (embedded) BoringSSL so it makes sense to set this engine as default when building with Emscripten.
Emscripten platform is similar to Linux and has <endian.h> available. Emscripten uses 32-bit integers because of the way numbers work in JavaScript. It is also little-endian-only at the moment due to technical limitations of the implementation.
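To make the platform check above concrete, here is a minimal C sketch. It assumes only the `__EMSCRIPTEN__` macro predefined by the Emscripten compiler; the names `HAVE_ENDIAN_H`, `host_is_little_endian` and `store_be32` are hypothetical and are not taken from Themis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Platforms that ship <endian.h>: Emscripten joins Linux and Cygwin here. */
#if defined(__linux__) || defined(__CYGWIN__) || defined(__EMSCRIPTEN__)
#include <endian.h>
#define HAVE_ENDIAN_H 1
#else
#define HAVE_ENDIAN_H 0
#endif

/* Hypothetical helper: returns 1 when the least significant byte is stored first. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte = 0;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Hypothetical helper: write a 32-bit value as big-endian bytes on any host. */
void store_be32(unsigned char out[4], uint32_t value)
{
    assert(out != NULL);
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)(value);
}
```

Shifting bytes explicitly, as `store_be32` does, gives the same result regardless of host byte order, so it is a reasonable fallback on platforms where `<endian.h>` is missing.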
We can build Soter and Themis tests and run them, but there are a couple of caveats here.

First of all, emcc output depends on the file extension. File names "themis_test" and "soter_test" result in LLVM bitcode being packed in them, which is not what we want. We should use the "*.js" extension to get something runnable. Introduce intermediate variables to rename the tests based on the target environment. Use "themis_test.js" for Emscripten and the old "themis_test" for everything else.

Additionally, we use the SINGLE_FILE flag to produce a single *.js file instead of a separate *.wasm file with compiled code and a *.js loader for it. This makes it easier to launch the tests as we don't have to change directories and don't generate extra files.

It also turned out that the popen() function is not available in the Emscripten runtime. Because of that we cannot use NIST STS as is. Disable it for now. Hopefully, we'll figure out how to use it later.

The next point is that we cannot run the resulting tests directly. We should use Node.js to interpret them rather than execute the *.js files (which is usually not possible on typical systems). Therefore, check if the "node" binary is available and use it to run tests when requested, but only in the Emscripten environment. If Node.js is not available then print a warning and fail the build.
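One way to keep popen()-dependent code out of Emscripten builds is a compile-time guard. The sketch below is an assumption about how such a guard could look, not the change made in this patch set; `HAVE_POPEN` and `run_external_check` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose popen()/pclose() in strict C modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* popen() is missing from the Emscripten runtime, so gate it at compile time. */
#if defined(__EMSCRIPTEN__)
#define HAVE_POPEN 0
#else
#define HAVE_POPEN 1
#endif

/*
 * Hypothetical helper: run an external command and copy its first output
 * line into `line`. Returns 0 on success, 1 when skipped because popen()
 * is unavailable, and -1 on failure.
 */
int run_external_check(const char *command, char *line, size_t line_size)
{
    assert(command != NULL && line != NULL && line_size > 0);
    line[0] = '\0';
#if HAVE_POPEN
    FILE *pipe = popen(command, "r");
    if (pipe == NULL) {
        return -1;
    }
    if (fgets(line, (int)line_size, pipe) == NULL) {
        line[0] = '\0'; /* command produced no output */
    }
    return pclose(pipe) == 0 ? 0 : -1;
#else
    (void)command;
    return 1; /* skipped: nothing to run on this target */
#endif
}
```

A caller can treat a return value of 1 as "skipped" rather than "failed", which mirrors the idea of disabling the NIST STS step on Wasm while keeping it on other targets.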
Remember that 0xDEADBEEF constant which got changed to 0x1337BEEF because 32-bit systems could not handle that value? Here we are again. This value is used as a 'corrupted' message length value. Eventually it will be used to allocate appropriate amount of memory so that the test can fail successfully. Real 32-bit Linux systems can allocate 0x1337BEEF bytes of memory just fine. It is possible due to the way memory allocation works on Linux with glibc. However, Wasm runtime has much more strict memory limits and is not able to reserve the memory with lazy mapping the way Linux does it. Furthermore, instead of returning NULL from malloc() the runtime aborts by default. Lower the value of this constant again. 16 KB should be big enough to be considered invalid, but small enough to be successfully allocated.
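The constraint on the constant can be sketched in C as follows. The value 0x4000 (16 KB) is an assumption taken from the size mentioned above, and the final constant may differ; `length_is_invalid` and `can_allocate` are hypothetical helpers, and the rule "invalid means larger than MAX_MESSAGE_SIZE" is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_MESSAGE_SIZE 4096

/* Assumed value: 16 KB, as suggested above; the actual constant may differ. */
#define CORRUPTED_LENGTH 0x4000

/* Hypothetical check: a declared length above the limit counts as invalid. */
int length_is_invalid(uint32_t declared_length)
{
    return declared_length > MAX_MESSAGE_SIZE;
}

/* Hypothetical probe: returns 1 if a buffer of the declared size can be reserved. */
int can_allocate(uint32_t declared_length)
{
    void *buffer = malloc((size_t)declared_length);

    if (buffer == NULL) {
        return 0;
    }
    free(buffer);
    return 1;
}
```

With these definitions, `length_is_invalid(CORRUPTED_LENGTH)` is true while `can_allocate(CORRUPTED_LENGTH)` is expected to succeed on any realistic target, which is the combination the test needs in order to fail on validation instead of aborting inside malloc().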
In order to compile Wasm on CircleCI we need to install Emscripten toolchain. Do that in accordance with the official docs. After that the build should just work out of the box. Take care to cache as much as possible so that we don't have to download LLVM and recompile the C standard library each build.
Fix incorrectly populated BASH_ENV value. We should not use relative paths there as the file is going to be loaded from everywhere.
Also redirect the script output to /dev/null because it gets annoying to see it with every Bash invocation (and Bash is invoked quite often during the build). Just use the script to set the variables.
Do not use "-S" and "-B" options because the CMake we have in Docker environment does not support them. Use the old approach of changing directory to the build tree and running CMake from there.
Actually, don't activate the Emscripten environment for the whole build. Do this only for the step where we run Wasm tests. Activating the Emscripten environment adds Clang and LLVM to PATH, which breaks Rust code compilation because it relies on its own particular versions of Clang and LLVM being present.
Make sure that submodule URL is updated before trying to pull changes. This ensures that CI is fetching the correct submodule even if it caches .git directory somewhere.
Okay, it seems I have finally gotten the code right. But now CircleCI refuses to update BoringSSL submodule correctly. The URL seems to be updated to our fork, but the submodule commit itself is still the old one. It seems to be caused by CircleCI's cache mumbo-jumbo, still looking for a button to nuke all the caches...
I'm sooooo excited!
This will be huuuge! 🤟
@@ -23,7 +23,7 @@
#endif

-#if defined(__linux__) || defined(__CYGWIN__)
+#if defined(__linux__) || defined(__CYGWIN__) || defined(__EMSCRIPTEN__)
at this point I realized that we actually have portable_endian header O_O
@@ -26,7 +26,7 @@
#define MAX_MESSAGE_SIZE 4096

/* Keep it under 2^31 to support 32-bit systems. */
#define CORRUPTED_LENGTH 0x1337BEEF
an era has gone :(
It just occurred to me that simply 0xBEEF, or 0xC0DE, or maybe even 0x31337 should be fine. Got any favorites?
Continue using hexspeak for constant, just use something "safe", teehee...
For some reason, CircleCI likes to "git add" the submodule after checking out the source, thus preventing it from being updated normally when "git submodule update" is run. Add a "git reset" call before updating submodules. This seems to work, based on my attempts at debugging the build via SSH.
Adding a |
yeeeeah! 🎉
This patch set enables Themis to be compiled using standard Emscripten toolchain for WebAssembly targets:
The resulting library works fine and passes all unit tests (see x86_64 target on CircleCI). That's it, though: we can compile the library and it seems to work by itself, but we're still far from full WebAssembly support, and it's still considered experimental.
For example, it would be very interesting to see if the integration tests pass because Wasm is a 32-bit target. Also, currently only embedded BoringSSL is supported for cryptographic backend, and even that requires specific tweaks.