JetStream 3.0: Redefining Browser Performance Benchmarks for the Modern Web

Introduction

In an announcement made jointly with Google and Mozilla, the WebKit team unveiled JetStream 3.0, a major update to the widely used cross-browser benchmark suite. The new version refreshes the test workloads and fundamentally changes how the suite measures browser engine performance, especially for WebAssembly and large-scale modern web applications. While the joint announcement highlights the suite-wide improvements, this article covers the specific challenges the WebKit team faced and the engineering work in JavaScriptCore that drove the update.

Source: webkit.org

The Need for a New Benchmark

Benchmarks are essential tools for browser engine developers to drive performance improvements. However, the web evolves rapidly, and any benchmark risks becoming outdated as new best practices emerge. Once the low-hanging optimizations are addressed, subsequent improvements tend to become increasingly narrow and specific to the benchmark workload. JetStream 3 responds with refreshed workloads and a different approach to performance measurement, particularly for WebAssembly (Wasm) and the scaling demands of modern web applications.

The Evolution of WebAssembly Benchmarking

One of the most significant changes in JetStream 3 is how it measures WebAssembly workloads. To understand the rationale, we must look back at the early days of Wasm. When JetStream 2 launched, WebAssembly was still in its infancy. Early adopters were typically large C/C++ projects that had previously compiled to asm.js. These applications, such as video games, often accepted a long, one-time startup cost in exchange for high-throughput performance afterward. Consequently, JetStream 2 scored Wasm in two distinct phases: Startup and Runtime.

The Zero-Time Problem

Over the years, browser engines became remarkably efficient at instantiating WebAssembly modules. As startup times shrank, even micro-optimizations grew in relative significance. Shaving 0.1 ms off a 100 ms workload is negligible (0.1%), but once engines reduced instantiation time to just 2 ms, the same 0.1 ms improvement represented a 5% gain. In WebKit, for example, the startup path was optimized so aggressively that for smaller workloads, the measured startup time effectively reached zero.
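The arithmetic behind that comparison can be checked with a short sketch. The helper name relativeGain is hypothetical and chosen for illustration only; it is not part of JetStream.

```typescript
// Fraction of the total time that a fixed saving represents.
// `relativeGain` is a hypothetical helper, not part of JetStream.
function relativeGain(savedMs: number, totalMs: number): number {
  if (totalMs <= 0) {
    throw new RangeError("totalMs must be positive");
  }
  return savedMs / totalMs;
}

// The two cases from the text: the same 0.1 ms saving applied to a
// 100 ms startup and to a 2 ms startup.
const gainOnSlowStartup = relativeGain(0.1, 100); // about 0.001, i.e. 0.1%
const gainOnFastStartup = relativeGain(0.1, 2); // 0.05, i.e. 5%
```

The saving is identical in absolute terms; only the denominator changes, which is why small optimizations start to dominate the score once the measured phase becomes very short.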

An Infinity Score Anomaly

JetStream 2 computed each iteration’s time using Date.now(), which reports time in whole milliseconds. When startup took less than 1 ms, the measured time could register as 0 ms. The scoring formula was Score = 5000 / Time, so a time of zero yielded an infinite score. JetStream 2.2 patched the benchmark by capping the score at 5000 so that the anomaly would not distort overall results. While an “infinite” score might seem like a victory, it indicated that browser engines had outgrown the Wasm subtests.
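A minimal sketch of how a whole-millisecond clock combined with Score = 5000 / Time can produce an infinite score, and how a cap removes it. All names here are hypothetical, the clock is modeled by simply discarding the fractional part of a timestamp (an assumption about timer behavior), and none of this is the actual JetStream harness code.

```typescript
// Hypothetical names throughout; this is not JetStream's harness code.
const SCORE_NUMERATOR = 5000; // from the formula Score = 5000 / Time
const SCORE_CAP = 5000; // the cap described for JetStream 2.2

// Assumption: model a whole-millisecond clock by discarding the
// fractional part of a precise timestamp.
function wholeMillisecondClock(preciseMs: number): number {
  return Math.floor(preciseMs);
}

// Score computed from two whole-millisecond readings, without a cap.
function uncappedScore(startMs: number, endMs: number): number {
  const elapsedMs =
    wholeMillisecondClock(endMs) - wholeMillisecondClock(startMs);
  return SCORE_NUMERATOR / elapsedMs; // 5000 / 0 evaluates to Infinity
}

// The same score, limited to SCORE_CAP.
function cappedScore(startMs: number, endMs: number): number {
  return Math.min(SCORE_CAP, uncappedScore(startMs, endMs));
}

// A startup that really took 0.4 ms, entirely inside one clock tick.
const subMillisecond = uncappedScore(10.2, 10.6); // Infinity
const afterCap = cappedScore(10.2, 10.6); // 5000
// A 50 ms startup is unaffected by the cap.
const slowStartup = cappedScore(10.2, 60.2); // 5000 / 50 = 100
```

In this model the cap only hides the symptom: every startup that fits inside a single clock tick receives the same maximum score, so the subtest can no longer distinguish between engines.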

Adapting to the Modern WebAssembly Landscape

Today, WebAssembly is no longer limited to niche, startup-heavy applications. It sits on the critical path of many page loads, in libraries, image decoders, UI frameworks, and more. A “zero” startup time in a microbenchmark does not reflect how Wasm interacts with real-world page loads, where the overhead of module instantiation is just one part of a larger performance picture. JetStream 3 replaces the old Startup/Runtime split with a more holistic approach that measures overall impact on page responsiveness, not just isolated phases.

New Wasm Workloads and Scoring

The suite introduces new Wasm tests that simulate realistic usage patterns, such as repeatedly instantiating small modules (mimicking library usage) and longer-running compute tasks (like video processing). The scoring system now accounts for both efficiency and consistency, avoiding the zero-time trap by using higher-resolution timers and averaging across multiple runs.
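To illustrate only the timer-resolution and averaging part of that description, the sketch below compares a score derived from fractional-millisecond samples with one derived from the same samples truncated to whole milliseconds. It reuses the 5000 / Time form from JetStream 2 purely for comparison; the function names and sample values are hypothetical, and this is not JetStream 3's actual scoring formula.

```typescript
// Hypothetical sketch; JetStream 3's real scoring is not reproduced here.
function meanMs(samplesMs: number[]): number {
  if (samplesMs.length === 0) {
    throw new RangeError("at least one sample is required");
  }
  const total = samplesMs.reduce((sum, ms) => sum + ms, 0);
  return total / samplesMs.length;
}

// Reuses the 5000 / Time shape, applied to the mean of several runs.
function averagedScore(samplesMs: number[]): number {
  return 5000 / meanMs(samplesMs);
}

// Made-up sub-millisecond durations, as a fractional timer might report.
const fractionalSamples = [0.5, 0.25, 0.75, 0.5];
// The same runs as a whole-millisecond clock would record them.
const truncatedSamples = fractionalSamples.map((ms) => Math.floor(ms));

const fractionalScore = averagedScore(fractionalSamples); // 5000 / 0.5 = 10000
const truncatedScore = averagedScore(truncatedSamples); // 5000 / 0 = Infinity
```

With fractional samples the mean stays above zero, so the score remains finite and still responds to real differences between runs.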

Broader Changes in JetStream 3

Beyond WebAssembly, JetStream 3 updates JavaScript workloads to reflect modern coding practices, including heavy use of promises, async/await, and Web APIs. Tests cover areas like DOM manipulation, network request handling, and data processing typical of contemporary web apps. The benchmark suite also places greater emphasis on real-world scalability, measuring performance under conditions that mimic complex, multi-page applications rather than isolated snippets.

Engineering Improvements in JavaScriptCore

The WebKit team made several key enhancements in JavaScriptCore to perform well on JetStream 3. These include optimized WebAssembly compilation pipelines, faster module instantiation, and improved tiering strategies that balance startup speed with runtime throughput. For instance, the engine now uses a dual-compilation approach: a quick baseline compilation for rapid startup, followed by a more heavily optimizing compilation in the background for long-running code. This ensures that even small modules benefit from near-instant readiness without sacrificing peak performance.
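The general idea of tiering can be sketched as a toy model. This is not JavaScriptCore's implementation: the class, the call-count threshold, and the synchronous switch between tiers are all assumptions made for illustration, whereas a real engine compiles the optimized version in the background and uses its own heuristics.

```typescript
type Tier = "baseline" | "optimized";

// Hypothetical threshold; real engines use their own heuristics.
const TIER_UP_CALL_COUNT = 3;

// Toy model: a function that starts in a quickly available baseline
// version and switches to an optimized version once it proves hot.
class TieredFunction {
  private calls = 0;
  private currentTier: Tier = "baseline";
  private baseline: (x: number) => number;
  private optimized: (x: number) => number;

  constructor(
    baseline: (x: number) => number,
    optimized: (x: number) => number
  ) {
    this.baseline = baseline;
    this.optimized = optimized;
  }

  get tier(): Tier {
    return this.currentTier;
  }

  call(x: number): number {
    this.calls += 1;
    const result =
      this.currentTier === "baseline" ? this.baseline(x) : this.optimized(x);
    // Simplification: tier up synchronously after the threshold is reached.
    if (this.currentTier === "baseline" && this.calls >= TIER_UP_CALL_COUNT) {
      this.currentTier = "optimized";
    }
    return result;
  }
}

// Both versions compute the same value; only the tier label differs here.
const square = new TieredFunction((x) => x * x, (x) => x * x);
const tiersSeen: Tier[] = [];
for (let i = 0; i < 5; i++) {
  tiersSeen.push(square.tier); // tier in effect for this call
  square.call(i);
}
// tiersSeen: baseline, baseline, baseline, optimized, optimized
```

The first calls are served immediately by the baseline version, which is what keeps startup short, while code that keeps running eventually moves to the faster tier.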

Conclusion

JetStream 3.0 marks a pivotal moment in browser performance benchmarking. By addressing the limitations of its predecessor, especially the zero-time anomaly and outdated Wasm assumptions, the suite provides a more accurate and useful measure of real-world performance. The collaboration between Apple, Google, and Mozilla helps keep the benchmark relevant across all major engines. For developers, JetStream 3 offers clearer insight into how browsers handle today’s demanding web applications, driving optimizations that ultimately benefit users worldwide.

Explore the full JetStream 3 suite and its documentation on the official website.