Benchmarks

These numbers measure the public secure-exec SDK paths that consumers actually use. The cold start matrix (packages/benchmarks/coldstart.bench.ts) times the full journey from booting a runtime to running guest code. It compares three ways of provisioning a runtime, which differ in how much is reused across runtimes:

  • owned-sidecar: NodeRuntime.create() boots a fresh sidecar process for each runtime. This is the default, fully isolated path (see Process Isolation).
  • shared-sidecar: one sidecar process is spawned once and passed to each runtime with NodeRuntime.create({ sidecar }), instead of the default NodeRuntime.create() path that owns a fresh sidecar per runtime. Sidecar setup is measured separately and excluded from cold start, so this isolates the per-runtime cost.
  • resident-runner: a shared sidecar plus a resident runner, so repeated small snippets reuse one live guest Node process instead of starting a new one each time.

Each scenario reports two latencies:

  • Cold: provisioning the runtime and running the first guest snippet end to end.
  • Warm: running a second snippet on the same runtime, after it is already up.
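
To make the two definitions concrete, here is a minimal sketch of how one cold/warm sample could be taken. It is not the benchmark's actual code: `RuntimeLike`, `measureColdWarm`, and the injected `now` clock are hypothetical names, and the stand-in is synchronous for brevity, whereas a real SDK call would most likely be asynchronous.

```typescript
// Hypothetical stand-in for a provisioned runtime; the real SDK types are not shown here.
interface RuntimeLike {
  exec(code: string): void;
}

interface ColdWarmSample {
  coldMs: number; // provisioning plus the first guest snippet
  warmMs: number; // second snippet on the same, already-running runtime
}

// Takes one cold/warm sample. The clock is injectable so the sketch can be
// exercised deterministically; it defaults to Date.now (millisecond resolution).
function measureColdWarm(
  provision: () => RuntimeLike,
  now: () => number = Date.now,
): ColdWarmSample {
  const t0 = now();
  const runtime = provision(); // cold starts here
  runtime.exec("1 + 1");       // first guest snippet ends the cold window
  const t1 = now();
  runtime.exec("1 + 1");       // warm: same runtime, second snippet
  const t2 = now();
  return { coldMs: t1 - t0, warmMs: t2 - t1 };
}
```

Passing the provisioning step as a function keeps the measurement the same for all three scenarios; only what `provision` reuses changes.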

A single sequential runtime (batch size 1) gives the cleanest picture:

Scenario          Cold mean   Cold p50    Warm mean   Warm p50
owned-sidecar     772.88ms    771.96ms    452.68ms    452.37ms
shared-sidecar    774.17ms    774.50ms    457.19ms    452.75ms
resident-runner   351.00ms    349.56ms    1.27ms      1.27ms

The headline result is the resident runner: once the live guest process exists, a warm snippet runs in about 1.3ms, roughly 350x faster than a warm execution that still has to stand up a guest process. Owned and shared sidecars cost about the same per runtime when run sequentially, because the sidecar process itself is cheap to spawn (around 3ms); sharing it mainly matters under concurrency.
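
The speedup figure follows directly from the warm means in the table. A small check of that arithmetic, with illustrative variable names that are not part of the SDK:

```typescript
// Warm mean latencies in milliseconds, copied from the batch-size-1 table.
const warmMeanMs = {
  ownedSidecar: 452.68,
  sharedSidecar: 457.19,
  residentRunner: 1.27,
};

// How many times faster the resident runner's warm path is than each sidecar mode.
const speedupVsOwned = warmMeanMs.ownedSidecar / warmMeanMs.residentRunner;
const speedupVsShared = warmMeanMs.sharedSidecar / warmMeanMs.residentRunner;

console.log(Math.round(speedupVsOwned));  // 356
console.log(Math.round(speedupVsShared)); // 360
```

Both ratios land in the mid-350s, which is where the "roughly 350x" figure comes from.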

Breaking down a single owned-sidecar cold start (batch 1, sequential, p50 per phase):

Phase                  p50
sidecar_spawn          2.78ms
session_open           2.34ms
vm_create              1.36ms
vm_configure           0.17ms
runtime_mount_node     2.71ms
runtime_mount_wasm     172.95ms
runtime_create_total   176.30ms
first_exec             596.26ms
warm_exec              452.37ms

Two phases dominate: mounting the WASM command set (about 173ms) and the first guest execution (about 596ms, which includes bringing up the guest Node runtime). Spawning the sidecar, opening a session, and creating the VM together cost only a few milliseconds. This is why the resident runner wins so decisively: it pays the first-exec cost once and then reuses the live process.
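
A rough way to quantify "dominate" is to add up the per-phase p50 values and take the share of the two large phases. This sketch assumes that runtime_create_total is an aggregate of the mount and configure phases (so it is left out to avoid double counting) and that warm_exec is not part of the cold path. Per-phase medians do not have to sum exactly to the end-to-end p50, so treat the result as an approximation.

```typescript
// Per-phase p50 values (ms) from the table above, leaf phases of the cold path only.
// Assumption: runtime_create_total is an aggregate, so it is excluded here.
const coldPhases: { name: string; ms: number }[] = [
  { name: "sidecar_spawn", ms: 2.78 },
  { name: "session_open", ms: 2.34 },
  { name: "vm_create", ms: 1.36 },
  { name: "vm_configure", ms: 0.17 },
  { name: "runtime_mount_node", ms: 2.71 },
  { name: "runtime_mount_wasm", ms: 172.95 },
  { name: "first_exec", ms: 596.26 },
];

const sumMs = (items: { ms: number }[]): number =>
  items.reduce((total, phase) => total + phase.ms, 0);

const totalMs = sumMs(coldPhases);
const dominantMs = sumMs(
  coldPhases.filter((p) => p.name === "runtime_mount_wasm" || p.name === "first_exec"),
);
const dominantShare = dominantMs / totalMs;

console.log(totalMs.toFixed(2));                     // 778.57
console.log((dominantShare * 100).toFixed(1) + "%"); // 98.8%
```

Under these assumptions the WASM mount and the first execution account for nearly 99% of the summed phase time, and everything else together is under 10ms.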

The matrix also runs each scenario at larger batch sizes, both sequentially and concurrently (up to the host concurrency cap). Cold mean latency by batch size:

Scenario          Mode         b=1        b=10        b=50        b=100       b=200
owned-sidecar     sequential   772.88ms   780.07ms    777.49ms    777.89ms    778.28ms
owned-sidecar     concurrent   776.22ms   1000.39ms   1041.81ms   1041.19ms   1044.03ms
shared-sidecar    sequential   774.17ms   627.58ms    610.76ms    619.24ms    616.20ms
shared-sidecar    concurrent   775.66ms   3493.51ms   3197.96ms   3187.99ms   3235.65ms
resident-runner   sequential   351.00ms   202.61ms    187.92ms    186.10ms    189.76ms
resident-runner   concurrent   347.91ms   205.00ms    187.45ms    187.07ms    191.33ms
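
One way to read the sequential resident-runner row is as a fixed per-runtime cost plus a one-time cost spread over the batch: cold(b) ≈ perRuntime + oneTime / b. This is a toy model fitted here from the b=1 and b=10 columns, not something the benchmark itself computes, and the names are illustrative.

```typescript
// Sequential resident-runner cold means (ms) by batch size, from the table above.
const measuredMs: Record<number, number> = {
  1: 351.0,
  10: 202.61,
  50: 187.92,
  100: 186.1,
  200: 189.76,
};

// Fit cold(b) = perRuntime + oneTime / b using the b=1 and b=10 points.
const oneTimeMs = (measuredMs[1] - measuredMs[10]) / (1 - 1 / 10);
const perRuntimeMs = measuredMs[1] - oneTimeMs;

const predictColdMs = (batchSize: number): number => perRuntimeMs + oneTimeMs / batchSize;

// Compare the model against the batch sizes that were not used for fitting.
for (const b of [50, 100, 200]) {
  const errorMs = predictColdMs(b) - measuredMs[b];
  console.log(`b=${b} predicted=${predictColdMs(b).toFixed(2)}ms error=${errorMs.toFixed(2)}ms`);
}
```

With this fit the one-time cost comes out near 165ms and the per-runtime floor near 186ms, and the predictions for b=50, 100, and 200 stay within a few milliseconds of the measured means.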

Takeaways:

  • owned-sidecar holds flat when sequential. Run concurrently, per-runtime cold time rises to about 1040ms as many runtimes mount their WASM command sets at once and contend for CPU. Each runtime still has its own sidecar, so they make progress in parallel.
  • shared-sidecar is efficient sequentially (around 610ms once the shared sidecar is warm) but degrades sharply under concurrency (3000ms or more), because one sidecar process serializes the heavy concurrent mount and first-exec work. Share a sidecar when work arrives sequentially or at low concurrency, not for bursty parallel fan-out.
  • resident-runner improves with batch size as the one-time runner creation amortizes, settling around 187ms cold and about 1.3ms warm regardless of mode.

Choosing a scenario:

  • Need strong isolation and unpredictable, bursty load: use owned-sidecar (the default). Each runtime is its own crash and resource domain.
  • Running many short tasks back to back where a shared failure domain is acceptable: a shared-sidecar amortizes setup.
  • Running the same kind of small snippet over and over (for example an evaluation loop): a resident-runner turns a roughly 450ms warm execution into a roughly 1.3ms one.
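
The guidance above can be condensed into a small decision helper. This is one possible reading of the bullets, not an SDK feature: `Workload`, its fields, and `chooseScenario` are hypothetical names, and the ordering of the checks is an assumption.

```typescript
type Scenario = "owned-sidecar" | "shared-sidecar" | "resident-runner";

// Hypothetical description of a workload; field names are illustrative only.
interface Workload {
  needsStrongIsolation: boolean; // each runtime must be its own crash/resource domain
  burstyParallel: boolean;       // work arrives as parallel fan-out rather than sequentially
  repeatsSmallSnippets: boolean; // the same kind of small snippet runs over and over
}

function chooseScenario(w: Workload): Scenario {
  // Isolation requirements win over any latency consideration.
  if (w.needsStrongIsolation) return "owned-sidecar";
  // Repeated small snippets benefit most from a live guest process.
  if (w.repeatsSmallSnippets) return "resident-runner";
  // A shared sidecar degrades sharply under concurrency, so bursty fan-out stays on the default.
  if (w.burstyParallel) return "owned-sidecar";
  // Sequential or low-concurrency work can amortize setup on a shared sidecar.
  return "shared-sidecar";
}

console.log(
  chooseScenario({ needsStrongIsolation: false, burstyParallel: false, repeatsSmallSnippets: true }),
); // resident-runner
```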

See Process Isolation for what each of these shares and isolates.

CPU: 12th Gen Intel(R) Core(TM) i7-12700KF
Cores: 20 | Max concurrency: 16 | Max live runtimes: 8
RAM: 62.6 GB | Node: v24.13.0
Iterations: 5 (+ 1 warmup)
Batch sizes: 1, 10, 50, 100, 200
Scenarios: owned-sidecar, shared-sidecar, resident-runner
Sidecar: release build of secure-exec-sidecar

Cold start is wall time from the start of runtime provisioning through the first guest execution. Warm is a second execution on the already-running runtime. For shared-sidecar, the one-time sidecar setup (around 4ms to 5ms) is measured separately and excluded from the per-runtime cold number. Phase timings (sidecar_spawn, session_open, vm_create, runtime_mount_wasm, first_exec, and the resident-runner phases) are recorded in the machine-readable JSON output.
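
The tables report both a mean and a p50 for each latency. As a reminder of how the two differ, here is a minimal sketch that computes them from a handful of samples. The sample values are made up for illustration, and the median-based p50 is an assumption; the benchmark may use a different percentile method.

```typescript
// Arithmetic mean of the samples.
function mean(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// p50 as the median: the middle value, or the average of the two middle values.
function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Five hypothetical cold-start samples (ms), matching the 5-iteration setup.
const hypotheticalSamplesMs = [770, 772, 771, 780, 775];

console.log(mean(hypotheticalSamplesMs).toFixed(1)); // 773.6
console.log(p50(hypotheticalSamplesMs));             // 772
```

A single slow iteration pulls the mean up but barely moves the p50, which is one plausible explanation for rows where the mean sits a few milliseconds above the p50.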

Build a release sidecar first for meaningful timings:

cargo build --release -p secure-exec-sidecar

Run the full suite:

pnpm --dir packages/benchmarks bench

Run only the cold start matrix:

SECURE_EXEC_SIDECAR_BIN="$PWD/target/release/secure-exec-sidecar" \
pnpm --silent --dir packages/benchmarks bench:coldstart \
> packages/benchmarks/results/coldstart-local.json \
2> packages/benchmarks/results/coldstart-local.log

A quick smoke run limits the matrix to batch size 1, one iteration, and no warmup:

BENCH_BATCH_SIZES=1 \
BENCH_ITERATIONS=1 \
BENCH_WARMUP=0 \
BENCH_SCENARIOS=owned-sidecar,shared-sidecar,resident-runner \
SECURE_EXEC_SIDECAR_BIN="$PWD/target/release/secure-exec-sidecar" \
pnpm --silent --dir packages/benchmarks bench:coldstart