Amit Patel edited this page Jun 6, 2013 · 99 revisions

# FAQ

General

  • Q. What does Emscripten do?

    A. Emscripten compiles LLVM bitcode into JavaScript, which then allows:

    • Compiling C/C++, and other code that can be translated into LLVM bitcode, directly into JavaScript.
    • Compiling the C/C++ runtimes of other languages into JavaScript, and then running code in those other languages in an indirect way. This works for languages like Python and Lua.
  • Q. Why are you doing this?

    A. The web is standards-based, cross-platform, runs everywhere from PCs to iPads, and has numerous independent compatible implementations. It's arguably the best platform to develop for, for those reasons. But it could be even more developer-friendly: while JavaScript (when used well!) is an excellent language, lots of people want to code in other languages. Compiling those languages to JavaScript lets them do so, and everyone is happy.

  • Q. What is the status of Emscripten?

    A. Emscripten is mature and has been used to port a very long list of real-world codebases to JavaScript, including large projects like CPython, Poppler and Bullet. You can see some demos here.

    However, there are some unavoidable limitations, since JavaScript is not native code; see CodeGuidelinesAndLimitations.

  • Q. How fast will the compiled code be?

    A. Emscripten can generate JavaScript in asm.js format, a subset of JavaScript designed so that JavaScript engines can execute it very quickly (compile with -s ASM_JS=1 to get asm.js-style output). In engines with asm.js optimizations, current results are about 2X slower than native code (clang -O2); in engines without such optimizations, results are around 4X slower on small benchmarks and 5X-10X slower on large codebases (see more details in this presentation). The bottom line is that right now, speed is generally 4X-5X slower than native, but JavaScript engine improvements will soon bring that closer to 2X.

    To run the emscripten benchmark suite yourself, do python tests/runner.py benchmark.

  • Q. How big will the compiled code be?

    A. The effective size of the code will be about the same as native code. That is, if you gzip your code, it will be about the same size as gzipped native code. For more, see this blog post.

  • Q. What is the compiler written in?

    A. JavaScript. Paralleling the language we are generating code for has various benefits. For example, if we determine some expression can be known at compile time, we can evaluate it immediately in the compiler; otherwise we can simply JSON.stringify() it for the generated code to solve at runtime. Also, (nice) JavaScript is cool.

  • Q. Isn't it better just to write JavaScript code? Why compile LLVM into JavaScript?

    A. By all means write new JavaScript code. Emscripten is just another option to have, and will hopefully be useful if you have a lot of C/C++ code that you don't want to rewrite from scratch. You can still write web applications normally, but Emscripten lets you integrate existing C/C++ code when useful.

  • Q. Where does Emscripten itself run?

    A. Emscripten is known to work on Windows, OS X and Linux. Note however that currently the automatic tests are run mainly on Linux, which means it is the most stable platform for Emscripten. Help with supporting other platforms would be very welcome.

    That's for the compiler, of course. The generated code is valid JavaScript, so it will run anywhere JavaScript can run. By default typed arrays are used (both for code compatibility and speed). However, typed arrays are not universally supported yet, so Emscripten also lets you generate code without typed arrays, which will run practically everywhere.

  • Q. What APIs/libraries does Emscripten support?

    A. libc and libc++ support is very good. SDL support is sufficient to run quite a lot of code. OpenGL support is in very good shape for OpenGL ES 2.0-type code, and even some other types; see OpenGL support.

  • Q. Is this really a compiler? Isn't it better described as a translator?

    A. Well, a compiler is usually defined as a program that transforms source code written in one programming language into another, which is what Emscripten does. A translator is a more specific term that is usually used for compilation between high-level languages, which isn't exactly applicable. On the other hand a decompiler is something that translates a low-level language to a higher-level one, so that might technically be a valid description, but it sounds odd since we aren't going back to the original language we compiled from (C/C++, most likely) but into something else (JavaScript).

  • Q. The name of the project sounds weird to me.

    A. I don't know why; it's a perfectly cromulent word!

Using Emscripten

  • Q. How do I compile code?

    A. See the Tutorial.

  • Q. I get lots of errors building the tests or even simple hello world type stuff?

    A. Some common problems are:

    • Using older versions of Node or JS engines. Use the versions mentioned in the Tutorial.
    • Using older versions of LLVM. The recommended LLVM version is the 3.0 release. Using LLVM trunk might or might not work.
    • Typos in the paths in ~/.emscripten.
    • Not having python2 in your system. For compatibility with systems that install python 2.x alongside 3.x (increasingly common), we look for python2. If you only have python 2.x installed, make python2 be a link to python. Or, instead you can invoke our python scripts directly, for example python emcc instead of ./emcc.

    You might also want to go through the Tutorial again, if it's been a while since you have (we update it when things change).

  • Q. Can I compile my project using Emscripten? Do I need a new build system?

    A. You can in most cases very easily use your project's current build system with Emscripten. See Building-Projects.

  • Q. My code compiles slowly.

    A. Emscripten makes some tradeoffs that make the generated code faster and smaller, at the cost of longer compilation times. For example, we build parts of the standard library along with your code which enables some additional optimizations, but takes a little longer to compile.

    Emscripten can run all the big compilation phases in parallel, and will do so automatically, so running on a machine with more cores can give you an almost linear speedup (doubling the number of cores can almost halve the build time, and so forth). To see details of how work is parallelized, compile with EMCC_DEBUG=1 in the environment. Note that compilation takes longer than normal in that debug mode, because we write a lot of intermediate steps to disk (by default to /tmp/emscripten_temp), but it's still useful to see which stages are slowing you down.

    For incremental builds on large codebases (for example, where you compile, then change a few lines and recompile), Emscripten has an option to use caching to greatly speed itself up. Run emcc with --jcache to enable the caching. It must be enabled on both the first, full build, and the later incremental build.

    Note that optimized builds can take noticeably longer to compile than unoptimized ones: -O1 compiles more slowly than -O0, and -O2 more slowly still (in return, though, they greatly improve the speed of the generated code). It might be useful to use -O0 (or not specify an optimization level) during quick development iterations and to do fully optimized builds less frequently.

    Currently builds with line-number debug info (where the source code was compiled with -g) are slow, see issue #216. Stripping the debug info leads to much faster compile times.

  • Q. When I compile code that should work, I get odd errors in Emscripten about various things. I get different errors (or it works) on another machine.

    A. Make sure you are using the Emscripten bundled system headers. Using emcc will do so by default, but if you compile into LLVM bitcode yourself, or you use your local system headers even with emcc, problems can happen.

  • Q. My code fails to compile, the error includes something about inline assembly (or {"text":"asm"}).

    A. Emscripten cannot compile inline assembly code, which is CPU specific, because Emscripten is not a CPU emulator.

    Many projects have build options that generate only platform-independent code, without inline assembly. That should be used for Emscripten. For example, the following might help (and are done automatically for you by emcc):

    #undef __i386__
    #undef __x86_64__
    

    This helps because, when no CPU-specific #define exists, many projects will not generate CPU-specific code. In general though, you will need to find where inline assembly is generated, and how to disable that.
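
    As an illustration of the pattern many projects follow, here is a minimal sketch of a function with an x86-only inline assembly path and a portable fallback. The function name byte_swap32 is hypothetical and not taken from any particular project. Once __i386__ and __x86_64__ are undefined, only the plain C branch is compiled.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit value. */
uint32_t byte_swap32(uint32_t x) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* CPU-specific path: only compiled when an x86 macro is defined,
       so it disappears once __i386__ and __x86_64__ are undefined. */
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    /* Portable path: plain C with no inline assembly. */
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
#endif
}
```

    Both branches return the same result, so callers do not need to know which one was compiled.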

  • Q. How do I run an event loop?

    A. To run a C function repeatedly, use emscripten_set_main_loop (see system/include/emscripten.h). The other functions in that file are also useful: they let you do things like add events that block the main loop. Documentation for all of those functions is in that header file.

    To respond to browser events and so forth, use the SDL API normally. See the SDL tests for examples (look for SDL in tests/runner.py).

    See also the next question.

  • Q. My HTML app hangs.

    A. Graphical C++ apps typically have a main loop that is an infinite loop, in which they handle events, do processing and rendering, and then call SDL_Delay. However, in JS there is no way for SDL_Delay to actually return control to the browser event loop. To do that, you must exit the current code. See Emscripten-Browser-Environment.
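
    One way to restructure such a loop is to move its body into a function and hand that function to emscripten_set_main_loop, so that control returns to the browser between frames. The sketch below assumes the compiler defines __EMSCRIPTEN__ and that the function takes a callback, a frame rate and a simulate_infinite_loop flag; check emscripten.h in your checkout for the exact signature. The names one_frame, start_main_loop and frames_rendered are hypothetical.

```c
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

static int frames_rendered = 0;

/* One iteration of what used to be the body of the infinite loop:
   handle events, update state, render. There is no SDL_Delay here. */
void one_frame(void) {
    frames_rendered++;
}

/* Hypothetical entry point: under Emscripten the browser calls one_frame
   repeatedly; elsewhere we fall back to a bounded native loop. */
void start_main_loop(int native_frames) {
#ifdef __EMSCRIPTEN__
    /* Assumed arguments: callback, frame rate, simulate_infinite_loop flag. */
    emscripten_set_main_loop(one_frame, 60, 0);
    (void)native_frames;
#else
    int i;
    for (i = 0; i < native_frames; i++) {
        one_frame();
    }
#endif
}
```

    In a native build, start_main_loop(3) simply runs three frames and returns, which keeps the same code testable outside the browser.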

  • Q. My SDL app doesn't work.

    A. See the SDL automatic tests for working examples: python tests/runner.py browser.

  • Q. How do I link against system libraries like SDL, boost, etc.?

    A. System libraries that are included with emscripten - libc, libc++ (C++ STL) and SDL - are automatically included when you compile (and just the necessary parts of them). You don't even need -lSDL, unlike other compilers (but -lSDL won't hurt either).

    Other libraries not included with emscripten, like boost, you would need to compile yourself and link with your program, just as if they were a module in your project. For example, see how BananaBread links in libz. (Note that in the specific case of boost, if you only need the boost headers, you don't need to compile anything.)

    Another option for libraries not included is to implement them as a JS library, like emscripten does for libc (minus malloc) and SDL (but not libc++ or malloc). See --js-library in emcc.

  • Q. How can my compiled program access files?

    A. Emscripten uses a virtual file system that may be preloaded with data or linked to URLs for lazy loading. See the Filesystem Guide for more details.

  • Q. I get an error trying to access __tm_struct_layout (or another C structure used in libc).

    A. You may need to compile the source code with emcc -g. -g tells the compiler to include debug info, which includes metadata about structures which is used to access those structures from Emscripten's JS libc implementation. (Adding -g is a workaround until we have a proper fix for this.)

  • Q. Functions in my C/C++ source code vanish when I compile to JavaScript..?

    A. By default Emscripten does dead code elimination to minimize code size. However, it might end up removing functions you want to call yourself but that are not called from the compiled code (so the LLVM optimizer thinks they are unneeded). You can run emcc with -s LINKABLE=1, which will disable link-time optimizations and dead code elimination, but this makes the code larger and less optimized than it could be. Instead, you should prevent specific functions from being eliminated by adding them to EXPORTED_FUNCTIONS (see src/settings.js). For example, run emcc with something like -s EXPORTED_FUNCTIONS="['_main', '_my_func']" in order to keep my_func (as well as main()) from being removed or renamed.

    It can be useful to compile with EMCC_DEBUG=1 (EMCC_DEBUG=1 emcc ..). Then the compilation steps are split up and saved in /tmp/emscripten_temp. You can then see at what stage the code vanishes (you will need to do llvm-dis on the bitcode stages to read them, or llvm-nm, etc.).

    One possible cause of vanishing code is an LLVM LTO bug. If that happens, you will see the code vanish in the LTO stage when using EMCC_DEBUG=1. You can turn LTO off by passing --llvm-lto 0 to emcc, or by setting LINKABLE to 1 as mentioned before.

    In summary, the general procedure for making sure a function is accessible to be called from normal JS later is (1) make a C function interface (to avoid C++ name mangling), (2) run emcc with -s EXPORTED_FUNCTIONS="['_main', '_yourCfunc']" to make sure it is kept alive during optimization.

    Note: In LLVM 3.2 dead code elimination is significantly more aggressive. All functions not kept alive through EXPORTED_FUNCTIONS will be potentially eliminated. Make sure to keep the things you need alive using one or both of those methods.
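
    The two steps in the summary can be sketched as follows. The function my_func and the file name in the build comment are hypothetical; the EXPORTED_FUNCTIONS flag is the one described in the answer. The extern "C" guard only matters when the file is compiled as C++.

```c
#ifdef __cplusplus
extern "C" {  /* Step 1: C linkage, so the symbol is not name-mangled. */
#endif

/* Hypothetical function we want to call from handwritten JS later.
   Nothing in the compiled code calls it, so it must be exported. */
int my_func(int a, int b) {
    return a + b;
}

#ifdef __cplusplus
}
#endif

/* Step 2 (build sketch; the file name is hypothetical):
     emcc my_code.c -s EXPORTED_FUNCTIONS="['_main', '_my_func']"
   Note the leading underscore on each exported name. */
```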

  • Q. The FS API is not available when I build with closure?

    A. Closure compiler will minify the FS API code. Code that uses the FS API must therefore be optimized by closure together with the FS API code, so that both end up with the same minified names. To do that, use emcc's --pre-js option, see emcc --help.

  • Q. My code breaks with -O2 and above, giving odd errors..?

    A. The likely problem is that Closure Compiler (which runs in -O2 and above by default) minifies variable names. Names like i,j,xa can be generated, and if other code has such variables in the global scope, bad things can happen.

    To check if this is the problem, compile with -O2 --closure 0. If that works, name minification might be the problem. If so, wrapping the generated code in a closure should fix it. (Alternatively, wrap your other code in a closure, or stop it from using small variable names in the global scope. You might be using such variables by mistake: forgetting a var when assigning to a variable puts it in the global scope.)

    To 'wrap' code in a closure, do something like this:

var CompiledModule = (function() {
  .. GENERATED CODE ..
  return Module;
})();

  • Q. I get undefined is not a function or NAME is not a function..?

    A. The likely cause is an undefined function - something that was referred to, but not implemented or linked in. If you get undefined, look at the line number to see the function name.

    Emscripten by default does not give fatal errors on undefined symbols, so you can get runtime errors like these (because in practice in many codebases it is easiest to get them working without refactoring them to remove all undefined symbol calls). If you prefer compile-time notifications, run emcc with -s WARN_ON_UNDEFINED_SYMBOLS=1 or -s ERROR_ON_UNDEFINED_SYMBOLS=1.

    Aside from just forgetting to link in a necessary object file, one possible cause for this error is inline functions in headers. If you have a header with inline int my_func() { .. } then clang may not actually inline the function (since inline is just a hint), and also not generate code for it (since it's in a header), so the generated bitcode and js will not have that function implemented. One solution is to add static, that is static inline int my_func() { .. } which forces code to be generated in the object file.
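
    A minimal sketch of the static inline fix follows. The header name in the comment and the caller call_my_func are hypothetical. Because the function is static, every file that includes the header gets its own copy, so a definition is always available even when the compiler decides not to inline the call.

```c
/* Imagine this definition lives in a header (hypothetical my_header.h).
   'static' gives each including file its own copy of the function. */
static inline int my_func(int x) {
    return x * 2;
}

/* A caller in some .c file that includes the header. */
int call_my_func(int x) {
    return my_func(x) + 1;
}
```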

  • Q. I get an odd python error complaining about libcxx.bc or libcxxabi.bc..?

    A. Possibly building libcxx or libcxxabi failed. Go to system/lib/libcxx (or libcxxabi) and do emmake make to see the actual error. Or, clean the emscripten cache (~/.emscripten_cache) and then compile your file with EMCC_DEBUG=1 in the environment. libcxx will then be built in /tmp/emscripten_temp/libcxx, and you can see the configure* and make* files that are the output of configure and make, etc.

    One possible cause of this error is the lack of make, which is necessary to build these libraries. If you are on Windows, you need cygwin which supplies make.

  • Q. Running LLVM bitcode generated by emcc through lli breaks with errors about impure_ptr stuff..?

    A. First of all, lli is not maintained (sadly) and has odd errors and crashes. However there is tools/nativize_llvm.py which compiles bitcode to a native executable. It will also hit the impure_ptr error though.

    The issue is that newlib uses that impure pointer stuff, while glibc uses something else. So bitcode built with the emscripten SDK (which emcc does) will not run locally, unless your machine uses newlib (which basically only embedded systems do). The impure_ptr issue is limited, however: it only applies to explicit use of stdout etc. So printf(..) will work, but fprintf(stdout, ..) will not, and it is often simple to modify your code to not hit this problem.
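
    As a small illustration of that kind of modification, the sketch below prints through printf instead of naming stdout explicitly. The function report_value is hypothetical; the only point is that the stdout symbol does not appear in the code.

```c
#include <stdio.h>

/* Hypothetical logging helper. Returns the number of characters printed. */
int report_value(int value) {
    /* Avoided form, which references stdout explicitly:
         fprintf(stdout, "value: %d\n", value);                */
    return printf("value: %d\n", value);
}
```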

  • Q. I get a stack size error when optimizing (RangeError: Maximum call stack size exceeded or similar)?

    A. You may need to increase the stack size for node. On Linux and Mac, you can just do NODE_JS = ['node', '--stack_size=8192'] or such. On Windows, you will also need --max-stack-size=8192, and also to run editbin /stack:33554432 node.exe.

  • Q. I get error: cannot compile this aggregate va_arg expression yet and it says compiler frontend failed to generate LLVM bitcode, halting afterwards.

    A. This is a limitation of the le32 frontend in clang. You can use the x86 frontend instead by compiling with EMCC_LLVM_TARGET=i386-pc-linux-gnu in the environment (however, you will lose the advantages of le32, which include better alignment of doubles).