Compressing Downloads

kripken edited this page Mar 26, 2012 · 5 revisions

Downloading large files is slow on the web. To help with that, Emscripten lets you compress both your compiled code and the data files it uses.

As an example, let's add compression to the file example from the tutorial. First, build the native encoder for LZMA, with this:

cd third_party/lzma.js
./doit.sh

Then go back to the emscripten root directory and run this:

./emcc tests/hello_world_file.cpp -o hello.html --preload-file tests/hello_world_file.txt \
       --compression third_party/lzma.js/lzma-native,third_party/lzma.js/lzma-decoder.js,LZMA.decompress

The only change from before is the --compression option (its syntax is explained below). With that option, the generated HTML file, hello.html, contains the JS code to decompress, plus commands to download the compressed code from a separate file, hello.js.compress. The original, hello.js, should also now exist, so you can compare their sizes. The file we asked to be preloaded has been compressed as tests/hello_world_file.txt.compress, and the compiled code will load it and decompress it.

In this example LZMA is used to compress, which reduces the code size to one quarter of the original. The data file shrinks only slightly, since it is a small file to begin with and small files are harder to compress.

--compression allows you to easily plug in your own compression codecs, with the following syntax:

    --compression <native_encoder>,<js_decoder>,<js_name>
  • native_encoder is a native executable that compresses stdin to stdout (the simplest possible interface). It is used locally to compress the files.
  • js_decoder is a JavaScript file that implements a decoder matching the encoder. It should include a function that, when called with an array (or a typed array) of bytes, returns an array (or a typed array) of bytes.
  • js_name is the name of the function to call in the JavaScript decoder that decompresses.
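
To make the decoder contract concrete, here is a minimal sketch of what a js_decoder file could look like. ToyRLE and its run-length format are hypothetical, invented for this illustration; they are not part of Emscripten or lzma.js. The only thing taken from the description above is the shape of the function: bytes in, bytes out.

```javascript
// Hypothetical toy codec, for illustration only; not part of Emscripten or lzma.js.
// Compressed format (an assumption of this sketch): a sequence of
// (count, value) byte pairs, where each pair expands to `count` copies of `value`.
var ToyRLE = {
  // Takes an array or typed array of bytes, returns a Uint8Array of bytes.
  decompress: function (input) {
    if (input.length % 2 !== 0) {
      throw new Error('ToyRLE: truncated input');
    }
    // First pass: add up the counts so the output can be allocated once.
    var total = 0;
    for (var i = 0; i < input.length; i += 2) {
      total += input[i];
    }
    // Second pass: expand each (count, value) pair into the output.
    var output = new Uint8Array(total);
    var pos = 0;
    for (var j = 0; j < input.length; j += 2) {
      var count = input[j];
      var value = input[j + 1];
      for (var k = 0; k < count; k++) {
        output[pos++] = value;
      }
    }
    return output;
  }
};
```

With a matching native encoder that reads stdin and writes this format to stdout, the js_name to pass would be ToyRLE.decompress. The encoder itself is not shown here, and any file names for it would be your own.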

In the example above we use lzma.js, a compact and high-performing GPL-licensed LZMA implementation based on lzip. You can view the source of lzma-decoder.js there to see the implementation of LZMA.decompress, which is the decoder function specified in the command above.

Notes:

  • It is very easy to use any compression scheme you want. As explained above, all you need is a native implementation that compresses and a JavaScript implementation that decompresses, and that's it.
  • Decompressing is done in a web worker. This keeps the main UI thread responsive even while decompressing a large file. Requiring web workers limits the browsers that can run compressed code. However, in almost all cases you will want typed arrays anyhow, and the set of browsers that support those two features is almost identical.
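
The worker idea can be sketched as follows. This is not the decompress.js that emcc generates; it is a rough illustration of how a decoder function could be wrapped so it runs off the main thread. The message shape ({ id, data }), the helper name makeDecompressHandler, and the MyCodec and my-decoder.js names are all assumptions made for this sketch.

```javascript
// Sketch only: NOT the glue code that emcc emits.
// `decoder` stands for whatever function was given as <js_name>;
// `post` is the function used to send the result back to the page.
function makeDecompressHandler(decoder, post) {
  return function (event) {
    // Assumed message shape: { id: <request id>, data: <compressed bytes> }.
    var compressed = new Uint8Array(event.data.data);
    var result = decoder(compressed);
    // Echo the id so the page can match responses to requests.
    post({ id: event.data.id, data: result });
  };
}

// importScripts only exists inside a worker, so this part is skipped elsewhere.
if (typeof importScripts === 'function') {
  importScripts('my-decoder.js'); // hypothetical file defining MyCodec.decompress
  self.onmessage = makeDecompressHandler(
    function (bytes) { return MyCodec.decompress(bytes); },
    function (message) { self.postMessage(message); }
  );
}
```

On the page side, the matching code would create the worker with new Worker(...), send it the downloaded bytes with postMessage, and pick up the decompressed result in the worker's onmessage callback, so the UI thread never runs the decoder itself.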

Preparing for Production

Running with --compression generates files in the current directory tree. To prepare them for use in production, all you need are the following files:

  • The generated HTML file
  • decompress.js, which will contain the decompressor code plus glue to make it work in a web worker
  • The compressed code, which has suffix .js.compress
  • The compressed data files (all those specified to be preloaded), which are in a single archive with suffix .data.compress.