WebContainer

Current Working Draft

Introduction

This is the specification for WebContainer, a container format originally created at Wasmer in 2021 for describing the capabilities and requirements of lightweight containers that can run in any kind of environment, including browsers and servers. Development of this open standard started in 2019 as a way to handle package filesystems and dependencies within WAPM, and solidified with StackBlitz's implementation of Node.js running natively in the browser.

If your organization benefits from WebContainer, please consider becoming a member and helping us to sustain the activities that support the health of our neutral ecosystem.

The WebContainer Specification Project may continue to evolve in future editions of this specification. Previous editions of the WebContainer specification can be found at permalinks that match their release tag. The latest working draft release can be found at https://spec.webcontainer.org/draft.

1Overview

A WebContainer is a directory with a standardized hierarchical structure that holds WebAssembly modules and the resources used by those modules. The metadata of this structure is stored in the manifest. The container contains resources that may be accessed at runtime, such as text files, images, audio files, and configuration files among other things.

A WebContainer is agnostic to the WebAssembly runtime it runs on, as long as that runtime conforms to the WebAssembly spec. It mainly defines a filesystem, the modules within that filesystem, metadata for each of those modules (dependencies, permissions, interface requirements) and an optional signature attached to the container so that it can be verified.

Similarly to WebAssembly, WebContainers are designed to separate the binary layer (WebAssembly modules) from the data layer (the resources: files, configurations, etc.). The motivation for this design is twofold: security and usability. For example, changing a configuration file does not require recompiling the binary.

2WebContainer Design

Similarly to WebAssembly, WebContainers are designed to separate the binary layer (WebAssembly modules) from the data/filesystem layer (the resources: files, configurations, etc.).

The reason is twofold: security and usability. For example, changing a configuration file does not require recompiling the binary.

2.1Why we need WebContainers?

Running complex libraries and applications sometimes requires access to base configuration files, resources and similar data in order to run fully.

For example, the Python runtime needs access to the Python library resources at runtime. Those resources are usually kept separate from the executable itself, since that allows for reusability and debuggability.

Because of that, we realized that a new container specification is needed in order to enable truly universal execution across a variety of devices and ecosystems.

2.2Technical Requirements

The WebContainer specification is designed with the following requirements in mind:

  • They should be easily usable from any environment: from browsers to servers, IoT devices and cloud clusters,
  • They should take minimum memory to construct and read,
  • They should include metadata about the WebAssembly files (what kind of ABI are they using, a hash of their data, dependencies…),
  • They should be immutable,
  • They should allow for incremental, lazy or random reading (seek, streaming),
  • They should be able to include a filesystem in them:
    • They should have optimal random access to files in that filesystem,
  • They should be verifiable (signable), and signing should be idempotent: sign(sign(package)) == sign(package),
  • They should not be tied to one specific execution system (Wasm: WASI, Emscripten, ONNX or Tensorflow Inferencers, etc.). They should support multiple ways of running executables,
  • They should pre-define the capabilities needed when running a container.

We also decided to leave some requirements out, either to simplify implementations or to facilitate the emergence of different use cases. Each of these cases is explained below:

  • They should be compressed. Why did we decide to leave this out? Compression is better handled in the transport or storage layer, not the inner layer. Leaving it out makes things much easier to reason about (such as seeking within a file in the filesystem at O(1) cost). This mirrors WebAssembly, where the format itself is not compressed.
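The O(1) seek argument above can be illustrated with a minimal sketch. It assumes a hypothetical uncompressed volume in which file contents are simply concatenated, plus a hypothetical index mapping each path to an (offset, size) pair. Neither the layout nor the names `build_volume` and `read_file` come from this specification; they only show why skipping inner compression keeps random access cheap.

```python
import io

def build_volume(files: dict) -> tuple:
    """Concatenate file contents and record where each file starts.

    Hypothetical layout for illustration only, not the WebContainer volume format.
    """
    index = {}
    blob = bytearray()
    for path, content in files.items():
        index[path] = (len(blob), len(content))  # (offset, size)
        blob += content
    return bytes(blob), index

def read_file(volume, index: dict, path: str) -> bytes:
    """Read one file with a single seek and a single read, no decompression."""
    offset, size = index[path]
    volume.seek(offset)
    return volume.read(size)

blob, index = build_volume({
    "/etc/app.conf": b"debug=1\n",
    "/lib/hello.txt": b"hello",
})
print(read_file(io.BytesIO(blob), index, "/lib/hello.txt"))
```

If the volume were compressed as a whole, the same read would first require decompressing everything that precedes the requested offset.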

3Binary Format

We have designed the binary format based on the technical requirements summarized in the Design section.

3.1Encoding

The encoding of a WebContainer starts with a preamble containing a 5-byte Magic number (the string \0webc, that is, a NUL byte followed by the ASCII characters webc) and a Version field. The current version of the WebContainer binary format is 1.

The preamble is followed by a manifest.

Magic
\0webc
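As a minimal sketch, a reader could check the Magic before attempting to parse anything else. The helper name `has_webc_magic` is hypothetical, and the Version field is deliberately not parsed here because its encoding is not described above.

```python
MAGIC = b"\x00webc"  # 5 bytes: a NUL byte followed by the ASCII characters "webc"

def has_webc_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the WebContainer magic number."""
    return data[:len(MAGIC)] == MAGIC

print(has_webc_magic(b"\x00webc" + b"...rest of the container..."))
print(has_webc_magic(b"\x00asm\x01\x00\x00\x00"))  # a plain WebAssembly module
```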

3.1.1Manifest

The manifest describes the metadata for the WebContainer. It is stored in Concise Binary Object Representation (CBOR); see RFC 8949 to learn more.

Here’s an example of how the CBOR contents can be represented as JSON:

Example № 1
{
  "modules": [
    {
      "id": "python",
      "integrity": "sha256-2332341234",
      "runner": "webcontainer.org/runner/wasi/command@unstable_",
      "interfaces": [
        "webcontainer.org/interface/wasi@0.1"
      ],
      "annotations": {
        "wasi.ENV": [
          "PYTHON_HOME=abc"
        ],
        "wasi.ROOT_FS": ""
      },
      "volumes": {
        "lib": {
          "kind": "local",
          "index": 0
        }
      }
    }
  ],
  "permissions": [
    "filesystem",
    "networking",
    "time"
  ],
  "commands": {
    "python": {
      "module": {
        "kind": "local",
        "index": 0
      },
      "annotations": {
        "wasi.runner.INHERIT_ARGS": "true" // it's true by default
      }
    },
    "pip": {
      "module": {
        "kind": "local",
        "index": 0
      },
      "annotations": {
        "wasi.ARGS": "-m pip",
        "wasi.INHERIT_ARGS": "true" // it's true by default
      }
    }
  },
  "entrypoint": "runserver"
}
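The following sketch shows one way a runtime might look up the module behind a command once the manifest has been decoded into plain dictionaries. It assumes that a module reference with "kind": "local" uses "index" as a position in the top-level "modules" list. That reading is an assumption based on the example, not something this specification states, and `resolve_command` is a hypothetical helper.

```python
def resolve_command(manifest: dict, name: str) -> dict:
    """Return the module entry that a command points at.

    Assumption: {"kind": "local", "index": N} refers to manifest["modules"][N].
    """
    command = manifest["commands"][name]
    ref = command["module"]
    if ref["kind"] != "local":
        raise ValueError(f"unsupported module kind: {ref['kind']!r}")
    return manifest["modules"][ref["index"]]

# A trimmed-down manifest with the same shape as the example.
manifest = {
    "modules": [{"id": "python"}],
    "commands": {
        "python": {"module": {"kind": "local", "index": 0}},
        "pip": {
            "module": {"kind": "local", "index": 0},
            "annotations": {"wasi.ARGS": "-m pip"},
        },
    },
}

print(resolve_command(manifest, "pip")["id"])
```

Under this reading, both commands share one module and differ only in their annotations.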

3.1.2VolumeIndex

The VolumeIndex

3.1.3Volume

3.1.4Other Binary Tokens

3.1.4.1Byte

Byte
/[0-9a-fA-F]/

3.1.4.2Unsigned 64-bit Number

U64
Byte{8}

3.1.4.3Unsigned LEB128

A number encoded as an unsigned LEB128
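For readers unfamiliar with the encoding, here is a minimal sketch of unsigned LEB128 as it is generally defined: the number is split into 7-bit groups, least significant group first, and every byte except the last has its high bit set. The function names are illustrative only.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow: set the high bit
        else:
            out.append(byte)         # last group: high bit clear
            return bytes(out)

def decode_uleb128(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one unsigned LEB128 number; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]             # raises IndexError on truncated input
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

print(encode_uleb128(624485).hex())  # → e58e26
print(decode_uleb128(bytes.fromhex("e58e26")))  # → (624485, 3)
```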

3.1.4.4Binary Content

Binary content consists of a size encoded as an UnsignedLEB128, followed by exactly that many bytes of content.
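A minimal sketch of reading one such length-prefixed value from a buffer is shown below. The function name `read_binary_content` is hypothetical, and the truncation check is an illustrative choice rather than behavior required by this specification.

```python
def read_binary_content(data: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Read an unsigned LEB128 size, then that many bytes of content.

    Returns (content, position of the first byte after the content).
    """
    # Decode the unsigned LEB128 size prefix (7 bits per byte, low group first).
    size = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7

    end = pos + size
    if end > len(data):
        raise ValueError("binary content is truncated")
    return data[pos:end], end

content, next_pos = read_binary_content(b"\x05hello" + b"trailing bytes")
print(content, next_pos)  # → b'hello' 6
```

Returning the next position lets a caller read several consecutive values from the same buffer.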

3.1.5CBOR

CBOR is a binary data format whose data model is similar to JSON's, designed for compact serialization. See RFC 8949: https://www.rfc-editor.org/rfc/rfc8949.html

4Prior Art

The WebContainer design was based on, inspired by and iterated from multiple projects that took great steps towards containerization and permission-based sandboxed execution.

4.1W3C Subresource Integrity

The Subresource Integrity specification defines a mechanism by which user agents may verify that a fetched resource has been delivered without unexpected manipulation.
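As a sketch of how this relates to integrity metadata, the Subresource Integrity format expresses a hash as the algorithm name, a hyphen, and the base64-encoded digest. Whether a WebContainer "integrity" field follows exactly this format is an assumption here, and the helper names below are hypothetical.

```python
import base64
import hashlib
import hmac

def sri_sha256(data: bytes) -> str:
    """Build an SRI-style integrity string: "sha256-" + base64(SHA-256 digest)."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")

def verify_integrity(data: bytes, expected: str) -> bool:
    """Check data against an SRI-style sha256 string using a constant-time compare."""
    return hmac.compare_digest(sri_sha256(data), expected)

module_bytes = b"\x00asm\x01\x00\x00\x00"  # stand-in for a module's bytes
integrity = sri_sha256(module_bytes)
print(verify_integrity(module_bytes, integrity))
print(verify_integrity(module_bytes + b"tampered", integrity))
```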

4.2Deno

Deno is a next iteration of Node.js that allows typed use of a scripting language in a performant way. Deno lets dependencies be referenced over HTTP instead of through a strict registry, and integrates a strict security mechanism.

4.3OCI (Docker)

The Open Container Initiative (OCI) is the container standard started by Docker. It added a layer of abstraction so that runtimes other than the Docker runtime could run containers built to this standard.

4.4Facebook XAR

Facebook created the XAR format in order to run Python distributions easily across their datacenters.

WebContainers were inspired by XAR's simplicity and usability. However, XAR depends on SquashFS, which is only available on Linux and macOS. We needed something cross-platform and portable to the web.

4.5Emscripten

Emscripten has shipped support for lz4 filesystem since v1.34.8 (9/9/2015).

WebContainers were inspired by Emscripten's use of lz4 with an index, which allows fast random access to files in the filesystem without having to download the whole archive.

4.6WebPackages

WebPackages are a new proposal to the W3C that aims to standardize the way browsers save content for offline use. As such, the spec is optimized around Requests and the Responses to those requests.

4.7JARs

A Java ARchive (JAR) is a platform-independent file that aggregates multiple files into one.

JAR File Specification

4.8Canonical Snap

Snaps are a great way of publishing and installing apps on Ubuntu. They differ from WebContainers in that WebContainers are platform and chipset agnostic.

4.9Chrome Extensions

Google Chrome created one of the best experiences for defining what WebExtensions are permitted to access and do.

> An extension can declare both required and optional permissions. In general, you should: Use required permissions when they are needed for your extension’s basic functionality. Use optional permissions when they are needed for optional features in your extension.

chrome.permissions

4.10Android permissions

> App permissions help support user privacy by protecting access to the following: Restricted data, such as system state and a user’s contact information. Restricted actions, such as connecting to a paired device and recording audio.

Permissions on Android

4.11iOS Entitlements

> An iOS entitlement is a right or privilege that grants an executable particular capabilities. For example, an app needs the HomeKit Entitlement, along with explicit user consent, to access a user’s home automation network. An app stores its entitlements as key-value pairs embedded in the code signature of its binary executable.

Bundle Entitlements

AAppendix: FAQ

A.1Q: Why not include the Filesystem inside the WebAssembly file as a Custom Section?

There are two main reasons:

  1. Reusability of volumes across different WebAssembly modules
  2. Caching: a cached module would be invalidated whenever its embedded filesystem changes, even if the module's code stays the same

Storing the filesystem inside a custom section of one WebAssembly module would make it inaccessible from other modules. This could be solved by storing the filesystem in each module, but that would bloat the modules and therefore the WebContainer overall.

That’s why an abstraction on top of the Wasm binary format is necessary in order to achieve optimal containers via WebContainers.

A.2Q: Why not using the OCI specification?

OCI is a great specification for images and runtimes of those images (runc).

However, it was designed with server-side tradeoffs in mind, which make browser-side usage suboptimal:

  • Unpacking an OCI image is required to run a container
  • Only one filesystem volume per container
  • It doesn’t have an index of the filesystem in the distribution format (critical for random file access)
  • The execution environment is defined per container, and not per module (which makes a lot of sense when the container is based on a specific operating system)
  • It has platform specific configuration

§Index

  1. BinaryContent
  2. Byte
  3. CBOR
  4. FileContentPair
  5. Magic
  6. Manifest
  7. PathOffset
  8. PathWithOffset
  9. U64
  10. UnsignedLEB128
  11. Version
  12. Volume
  13. VolumeIndex
  14. WebContainer