Building Projects

Building large projects with Emscripten is usually straightforward. Emscripten provides two simple scripts that configure your makefiles to use emcc as a drop-in replacement for gcc. In most cases the rest of your project’s current build system remains unchanged.

Integrating with a build system

To build using Emscripten you need to replace gcc with emcc in your makefiles. This is done using emconfigure, which sets the appropriate environment variables like CXX (the C++ compiler) and CC (the C compiler).

Consider the case where you normally build with the following commands:

./configure
make

To build with Emscripten, you would instead use the following commands:

# Run emconfigure with the normal configure command as an argument.
./emconfigure ./configure

# Run emmake with the normal make to generate linked LLVM bitcode.
./emmake make

# Compile the linked bitcode generated by make (project.bc) to JavaScript.
#  'project.bc' should be replaced with the make output for your project (e.g. 'yourproject.so')
#  [-Ox] represents build optimizations (discussed in the next section).
./emcc [-Ox] project.bc -o project.js

emconfigure is called with the normal configure as an argument (in configure-based build systems), and emmake with make as an argument. If your build system doesn’t use configure, then you can omit the first step.

Tip

We recommend you call both emconfigure and emmake scripts in configure-based build systems. Whether you actually need to call both tools depends on the build system (some systems will store the environment variables in the configure step, and others will not).
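The choice between the two-step and one-step sequence can be sketched as a small shell function. This is a dry run that only prints the commands shown above; the function name plan_build is hypothetical and not part of Emscripten.

```shell
# Hypothetical helper (not part of Emscripten): print the wrapper commands
# for a source tree. It is a dry run and executes nothing itself.
plan_build() {
    src_dir=$1
    if [ -f "$src_dir/configure" ]; then
        # configure-based project: wrap configure so CC, CXX, etc. point at emcc.
        echo "./emconfigure ./configure"
    fi
    # Always wrap make, because some build systems do not store the
    # environment variables set during the configure step.
    echo "./emmake make"
}
```

Calling plan_build on a tree that contains a configure script prints both commands; on a tree without one it prints only the emmake line.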

Make generates linked LLVM bitcode. It does not automatically generate JavaScript during linking because all the files must be compiled using the same optimizations and compiler options — and it makes sense to do this in the final conversion from bitcode to JavaScript.

Note

The file output from make might have a different suffix: .a for a static library archive, .so for a shared library, .o or .bc for object files (these file extensions are the same as gcc would use for the different types). Irrespective of the file extension, these files contain linked LLVM bitcode that emcc can compile into JavaScript in the final step.

Where possible it is better to generate shared library files (.so) rather than archives (.a) — this is generally a simple change in your project’s build system. Shared libraries are simpler, and are more predictable with respect to linking and elimination of unneeded code.

The last step is to compile the linked bitcode into JavaScript. We do this by calling emcc again, specifying the linked LLVM bitcode file as an input, and a JavaScript file as the output.

Building projects with optimizations

Emscripten performs compiler optimization at two levels: each source file is optimized by LLVM as it is compiled into an object file, and then JavaScript-specific optimizations are applied when converting object files into JavaScript.

In order to properly optimize code, it is important to use the same optimization flags and other compiler options when compiling source to object code, and object code to JavaScript (or HTML).

Consider the examples below:

# Sub-optimal - JavaScript optimizations are omitted
./emcc -O2 a.cpp -o a.bc
./emcc -O2 b.cpp -o b.bc
./emcc a.bc b.bc -o project.js

# Sub-optimal - LLVM optimizations omitted
./emcc a.cpp -o a.bc
./emcc b.cpp -o b.bc
./emcc -O2 a.bc b.bc -o project.js

# Broken! Different JavaScript and LLVM optimizations used.
./emcc -O1 a.cpp -o a.bc
./emcc -O2 b.cpp -o b.bc
./emcc -O3 a.bc b.bc -o project.js

# Correct. The SAME LLVM and JavaScript options are provided at both levels.
./emcc -O2 a.cpp -o a.bc
./emcc -O2 b.cpp -o b.bc
./emcc -O2 a.bc b.bc -o project.js
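A hand-written build script can avoid the mismatches shown above by taking the optimization level from a single argument. The sketch below is a dry run that only prints commands in the form used in this section; the function name plan_optimized_build is hypothetical.

```shell
# Hypothetical helper: print compile and link commands that all share one
# optimization level, so the LLVM and JavaScript stages cannot drift apart.
# Usage: plan_optimized_build -O2 project.js a.cpp b.cpp
plan_optimized_build() {
    opt=$1
    output=$2
    shift 2
    objects=""
    for src in "$@"; do
        obj="${src%.*}.bc"              # a.cpp -> a.bc
        echo "./emcc $opt $src -o $obj"
        objects="$objects $obj"
    done
    # The final step reuses the same flag as the per-file compiles.
    echo "./emcc $opt$objects -o $output"
}
```

Run with the arguments in the usage comment, it prints the three commands of the "Correct" example.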

The same rule applies when building with Emscripten using a build system: both LLVM and JavaScript must be optimized using the same settings.

Note

Unfortunately each build system defines its own mechanisms for setting compiler and optimization flags. You will need to work out the correct approach to set the LLVM optimization flags for your system.

  • Some build systems have a flag like ./configure --enable-optimize.
  • You can control whether LLVM optimizations are run using --llvm-opts N where N is an integer in the range 0-3. Sending -O2 --llvm-opts 0 to emcc during all compilation stages will disable LLVM optimizations but utilize JavaScript optimizations. This can be useful when debugging a build failure.

JavaScript optimizations are specified in the final step, when you compile the linked LLVM bitcode to JavaScript. For example, to compile with -O1:

# Compile the linked bitcode to JavaScript with -O1 optimizations.
./emcc -O1 project.bc -o project.js

Building projects with debug information

Building a project containing debug information requires that debug flags are specified for both the LLVM and JavaScript compilation phases.

To make Clang and LLVM emit debug information in the bitcode files you need to compile the sources with -g (exactly the same as with clang or gcc normally). To get emcc to include the debug information when compiling the bitcode to JavaScript, specify -g or one of the -gN debug level options.

Note

Each build-system defines its own mechanisms for setting debug flags. To get Clang to emit LLVM debug information, you will need to work out the correct approach for your system.

  • Some build systems have a flag like ./configure --enable-debug.

The flags for emitting debug information when compiling from bitcode to JavaScript are specified as an emcc option in the final step:

# Compile the linked bitcode to JavaScript.
# -g or -gN can be used to set the debug level (N)
./emcc -g project.bc -o project.js

For more general information, see the topic Debugging.

Using libraries

Built-in support is available for a number of standard libraries: libc, libc++ and SDL. These will automatically be linked when you compile code that uses them (you do not even need to add -lSDL, but see below for more SDL-specific details).

If your project uses other libraries, for example zlib or glib, you will need to build and link them. The normal approach is to build the libraries to bitcode and then compile library and main program bitcode together to JavaScript.

For example, consider the case where a project “project” uses a library “libstuff”:

# Compile libstuff to bitcode (run in the libstuff source directory)
./emconfigure ./configure
./emmake make

# Compile project to bitcode (run in the project source directory)
./emconfigure ./configure
./emmake make

# Compile the library and code together to HTML
emcc project.bc libstuff.bc -o final.html

It is also possible to link the bitcode libraries first, and then compile the combined .bc file to JavaScript:

# Generate bitcode files project.bc and libstuff.bc
...

# Link together the bitcode files
emcc project.bc libstuff.bc -o allproject.bc

# Compile the combined bitcode to HTML
emcc allproject.bc -o final.html

Emscripten Ports

Emscripten Ports is a collection of useful libraries, ported to Emscripten. They reside on GitHub, and have integration support in emcc. When you request that a port be used, emcc will fetch it from the remote server, set it up and build it locally, link it with your project, add the necessary include paths to your build commands, and so on. For example, SDL2 is in ports, and you can request that it be used with -s USE_SDL=2:

./emcc tests/sdl2glshader.c -s USE_SDL=2 -s LEGACY_GL_EMULATION=1 -o sdl2.html

You should see some notifications about SDL2 being used, and built if it wasn’t previously. You can then view sdl2.html in your browser.

Note

SDL_image has also been added to ports; use it with -s USE_SDL_IMAGE=2. To see a list of all available ports, run emcc --show-ports.

Note

Emscripten also has support for older SDL1, which is built in. If you do not specify SDL2 as in the command above, then SDL1 is linked in and the SDL1 include paths are used. SDL1 has support for sdl-config, which is present in system/bin. Using the native sdl-config may result in compilation or missing-symbol errors. You will need to modify the build system to look for files in emscripten/system or emscripten/system/bin in order to use the Emscripten sdl-config.

Adding more ports

Adding a port takes the following steps:

  • Make sure the port is open source and has a suitable license.
  • Add it to emscripten-ports on GitHub. The ports maintainers can create the repo and add the relevant developers to a team for that repo, so they have write access.
  • Add a script to handle it under tools/ports/ (see existing code for examples) and use it in tools/ports/__init__.py.
  • Add testing in the test suite.

Build system issues

Build system self-execution

Some large projects generate executables and run them in order to generate input for later parts of the build process (for example, a parser may be built and then run on a grammar, which then generates C/C++ code that implements that grammar). This sort of build process causes problems when using Emscripten because you cannot directly run the code you are generating.

The simplest solution is usually to build the project twice: once natively, and once to JavaScript. When the JavaScript build procedure fails because a generated executable is not present, you can then copy that executable from the native build, and continue to build normally. This approach was successfully used for compiling Python (see tests/python/readme.md for more details).
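The copy step can be sketched as a shell function. The directory layout and the function name borrow_native_tool are assumptions made for illustration; a real project will place its generated executables wherever its own build scripts expect them.

```shell
# Hypothetical helper: if a generated tool is missing from the Emscripten
# build tree, copy it over from a finished native build tree.
# Usage: borrow_native_tool <native_build_dir> <emscripten_build_dir> <tool>
borrow_native_tool() {
    native_dir=$1
    js_dir=$2
    tool=$3
    if [ ! -e "$js_dir/$tool" ] && [ -f "$native_dir/$tool" ]; then
        # -p keeps the mode bits, so the copy stays executable.
        cp -p "$native_dir/$tool" "$js_dir/$tool"
    fi
}
```

After the copy, the Emscripten build can be resumed with the same emmake make command that failed.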

In some cases it makes sense to modify the build scripts so that they build the generated executable natively. For example, this can be done by specifying two compilers in the build scripts, emcc and gcc, and using gcc just for generated executables. However, this can be more complicated than the previous solution because you need to modify the project build scripts, and you may have to work around cases where code is compiled and used both for the final result and for a generated executable.
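A minimal sketch of the two-compiler idea follows. The tool name parser_gen, the HOST_TOOLS variable and the function compiler_for are all hypothetical; how the choice is wired into the real build scripts depends on the build system.

```shell
# Hypothetical list of generated executables that must run during the build.
HOST_TOOLS="parser_gen"

# Print the compiler to use for a build target: gcc for tools that are run
# during the build, emcc for everything else.
compiler_for() {
    for tool in $HOST_TOOLS; do
        if [ "$1" = "$tool" ]; then
            echo "gcc"
            return
        fi
    done
    echo "emcc"
}
```

A build script could then write something like CC=$(compiler_for parser_gen) before compiling each target. Sources that feed both a generated tool and the final output still need to be compiled once with each compiler.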

Dynamic linking

Emscripten’s goal is to generate the fastest and smallest possible code, and for that reason it focuses on generating a single JavaScript file for an entire project.

Dynamic linking at runtime is not supported when using Fastcomp (it won’t link in code from an arbitrary location when an app is loaded).

Note

Dynamic linking would be an excellent contribution to Emscripten.

Dynamic linking is supported when using the original compiler but is not recommended.

Pseudo-Dynamic linking

Note

This section applies to the current compiler (Fastcomp) only. It is a workaround because Fastcomp does not support true dynamic linking.

Dynamic libraries that you specify in the final build stage (when generating JavaScript or HTML) are linked in as static libraries.

Emcc ignores commands to dynamically link libraries when linking together bitcode. This is to ensure that the same dynamic library is not linked multiple times in intermediate build stages, which would result in duplicate symbol errors.

Configure may run checks that appear to fail

Projects that use configure, cmake, or some other portable configuration method may run checks during the configure phase to verify that the toolchain and paths are set up properly. Emcc tries to get checks to pass where possible, but you may need to disable tests that fail due to a “false negative” (for example, tests that would pass in the final execution environment, but not in the shell during configure).

Tip

Ensure that if a check is disabled, the tested functionality does work. This might involve manually adding commands to the make files using a build system-specific method.

Note

In general configure is not a good match for a cross-compiler like Emscripten. configure is designed to build natively for the local setup, and works hard to find the native build system and the local system headers. With a cross-compiler you are targeting a different system, so these local headers and tools are ignored.

Manually using emcc

The Emscripten Tutorial showed how emcc can be used to compile single files into JavaScript. Emcc can also be used in all the other ways you would expect of gcc:

# Generate a.out.js from C++. Can also take .ll (LLVM assembly) or .bc (LLVM bitcode) as input
./emcc src.cpp

# Generate src.o containing LLVM bitcode.
./emcc src.cpp -c

# Generate result.js containing JavaScript.
./emcc src.cpp -o result.js

# Generate result.bc containing LLVM bitcode (the suffix matters).
./emcc src.cpp -o result.bc

# Generate a.out.js from two C++ sources.
./emcc src1.cpp src2.cpp

# Generate src1.o and src2.o, containing LLVM bitcode
./emcc src1.cpp src2.cpp -c

# Combine two LLVM bitcode files into a.out.js
./emcc src1.o src2.o

# Combine two LLVM bitcode files into another LLVM bitcode file
./emcc src1.o src2.o -o combined.o

In addition to the capabilities it shares with gcc, emcc supports options to optimize code, control what debug information is emitted, generate HTML and other output formats, etc. These options are documented in the emcc tool reference (./emcc --help on the command line).
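The role of the output suffix in the examples above can be summarized as a lookup. The function below is only an illustration of what this document describes, not emcc's actual implementation, and the name output_kind is hypothetical; it covers just the suffixes used in this section.

```shell
# Illustrative only: classify an output file name the way the examples in
# this document describe. This is not how emcc is implemented.
output_kind() {
    case "$1" in
        *.html)     echo "HTML" ;;
        *.js)       echo "JavaScript" ;;
        *.bc|*.o)   echo "LLVM bitcode" ;;
        *)          echo "not covered by the examples above" ;;
    esac
}
```

For instance, output_kind result.bc prints "LLVM bitcode", while output_kind result.js prints "JavaScript".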

Alternatives to emcc

Tip

Do not attempt to bypass emcc and call the Emscripten tools directly from your build system.

In theory you can call clang, llvm-ld, and the other tools yourself. However, this is considered dangerous because, by default:

  • Clang does not use the Emscripten-bundled headers, which can lead to various errors.
  • llvm-ld uses unsafe/unportable LLVM optimizations.

Emcc automatically ensures the tools are configured and used properly.

Examples / test code

The Emscripten test suite (tests/runner.py) contains a number of good examples — large C/C++ projects that are built using their normal build systems as described above: freetype, openjpeg, zlib, bullet and poppler.

It is also worth looking at the build scripts in the ammo.js project.

Troubleshooting

  • Make sure to use bitcode-aware llvm-ar instead of ar (which may discard code). emmake and emconfigure set the AR environment variable correctly, but a build system might incorrectly hardcode ar.

  • The compilation error multiply defined symbol indicates that the project has linked a particular static library multiple times. The project will need to be changed so that the problem library is linked only once.

    Note

    You can use llvm-nm to see which symbols are defined in each bitcode file.

    One solution is to use the Pseudo-Dynamic linking approach described above. This ensures that libraries are linked only once, in the final build stage.
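Both checks above can be partly automated. The two functions below are sketches with hypothetical names: the first lists makefile recipe lines that call a literal ar instead of going through the AR variable, and the second reads nm-style text (such as the output of llvm-nm) on standard input and prints symbol names that are defined more than once. The exact llvm-nm output layout is assumed to be "address type name", with file names on their own lines.

```shell
# List recipe lines in a makefile that invoke a hardcoded `ar`.
# Prints matches as "<line number>:<line>".
find_hardcoded_ar() {
    # Recipe lines start with whitespace; allow the @ or - prefixes.
    grep -nE '^[[:space:]]+[@-]?ar[[:space:]]' "$1"
}

# Read nm-style output on stdin and print each symbol name that is
# defined on more than one line. Undefined (U) entries are skipped.
duplicate_symbols() {
    awk 'NF >= 2 && length($(NF-1)) == 1 && $(NF-1) != "U" { print $NF }' |
        sort | uniq -d
}
```

Typical usage would be find_hardcoded_ar Makefile and llvm-nm a.bc b.bc | duplicate_symbols. Some symbol kinds, such as weak symbols, can legitimately appear in more than one file, so the second list is a starting point rather than proof of an error.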