Emscripten is a compiler toolchain for asm.js and WebAssembly which lets you run C and C++ on the web at near-native speed.
Emscripten output sizes have decreased a lot recently, especially for smaller programs. For example, here’s a little C code:
#include <emscripten.h>

EMSCRIPTEN_KEEPALIVE
int add(int x, int y) {
  return x + y;
}
This is the “hello world” of pure computation: it exports a single function that adds two numbers. Compiled with -Os -s WASM=1 (optimize for size, build to wasm), it produces a WebAssembly binary of just 42 bytes. Disassembling it shows exactly what you’d expect and no more:
(module
 (type $0 (func (param i32 i32) (result i32)))
 (export "_add" (func $0))
 (func $0 (; 0 ;) (type $0) (param $var$0 i32) (param $var$1 i32) (result i32)
  (i32.add
   (get_local $var$1)
   (get_local $var$0)
  )
 )
)
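To make that size concrete, here is a minimal sketch that hand-assembles a binary for the module shown above and calls its export from JavaScript. The byte layout is a hand-written encoding of that text format, an assumption for illustration rather than a dump of Emscripten's actual output, though it also comes to 42 bytes. The synchronous constructors are used only because the module is tiny.

```javascript
// Hand-assembled encoding of the disassembled module (illustrative
// assumption; not necessarily byte-identical to Emscripten's output).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // Type section: one signature, (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // Function section: one function, using signature 0
  0x03, 0x02, 0x01, 0x00,
  // Export section: function 0 exported under the name "_add"
  0x07, 0x08, 0x01, 0x04, 0x5f, 0x61, 0x64, 0x64, 0x00, 0x00,
  // Code section: no extra locals; get_local 1, get_local 0, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x01, 0x20, 0x00, 0x6a, 0x0b,
]);

// The module declares no imports, so no import object is needed.
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const add = instance.exports._add;

console.log(bytes.length); // 42
console.log(add(2, 3)); // 5
```

In a real build you would load the .wasm file that Emscripten emits instead of embedding bytes by hand.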
For comparison, Emscripten 1.37.22 used to emit a WebAssembly binary of 10,837 bytes for that code sample, so the improvement to 42 bytes is dramatic. What about bigger programs? There’s a lot of improvement there too: Comparing a C hello world program using
Emscripten has mostly focused on making it easy to port existing C/C++ code. That means supporting various POSIX APIs, emulating a filesystem, and special handling of things like
ccall, etc.). And all that makes it practical to port useful APIs like OpenGL and SDL to the Web. These capabilities depend on Emscripten’s runtime and libraries, and we used to include more of those than you actually need, for two main reasons.
First, we used to export many things by default, that is, we included too many things in our output that you might use. We recently focused on changing the defaults to something more reasonable.
Things actually weren’t quite so bad before, as we did consider some connections between the two domains — enough to do a decent job for larger programs (e.g., we only include necessary JS library code, so you don’t get WebGL support if you don’t need it). But we failed to remove core runtime components when you didn’t use them, which is very noticeable in smaller programs.
More on Code Size
C hello world uses printf, which is implemented in libc (musl in Emscripten).
printf uses libc streams code that is generic enough to handle not just printing to the console but also arbitrary devices like files, and it implements buffering and error handling, etc. It’s unreasonable to expect an optimizer to remove all that complexity — really, the issue is that if we want to just print to the console then we should use a simpler API than
One option is to use emscripten_log, which only prints to the console, but it supports a bunch of options (like printing stack traces, formatting, etc.) so it doesn’t help that much in reducing code size. If we really want to just use
#include <emscripten.h>

int main() {
  EM_ASM(
    console.log("hello, world!");
  );
}
- The WebAssembly loading code supports a bunch of options like using streaming if available.
- Hooks are provided to let you run code at various points in the program’s execution (just before main(), for example). These are useful since WebAssembly startup is asynchronous.
All those are fairly important so it’s hard to just remove them. But in the future perhaps those could be made optional, and maybe we can find ways to do them in less code.
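As a rough illustration of the two points above, here is a minimal sketch of a loader that prefers streaming compilation when it is available and runs user hooks just before calling main. The names loadWasm, beforeMain, fetchFn and the _main export are hypothetical, chosen for this example. This is not Emscripten's actual loading code, which handles many more cases.

```javascript
// Hypothetical loader sketch: all names here are assumptions for
// illustration, not Emscripten's generated code.
async function loadWasm(url, { imports = {}, beforeMain = [], fetchFn = fetch } = {}) {
  let result;

  // Prefer streaming: compilation can start while bytes are still downloading.
  if (typeof WebAssembly.instantiateStreaming === "function") {
    try {
      result = await WebAssembly.instantiateStreaming(fetchFn(url), imports);
    } catch (err) {
      // For example, the server did not send Content-Type: application/wasm.
      result = undefined;
    }
  }

  // Fallback: download everything first, then compile and instantiate.
  // Note that this fetches the file a second time if streaming failed.
  if (!result) {
    const response = await fetchFn(url);
    const buffer = await response.arrayBuffer();
    result = await WebAssembly.instantiate(buffer, imports);
  }

  const { instance } = result;

  // Startup is asynchronous, so hooks are the place for code that must
  // run once the module exists but before main() does.
  for (const hook of beforeMain) hook(instance);

  // "_main" is an assumed export name; it is skipped if the module has none.
  if (typeof instance.exports._main === "function") instance.exports._main();

  return instance;
}
```

A call such as loadWasm("hello.wasm", { beforeMain: [setup] }), where hello.wasm and setup are placeholders, would run setup once the module is instantiated and before main. Even this small sketch needs a fallback path and a hook loop, which gives a sense of why such features cost code size.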
- Ongoing wasm shrinking work is happening in the Binaryen optimizer.