Boosting WebAssembly Performance with Speculative Inlining and Deoptimization in V8

Introduction

Modern web applications demand ever-increasing performance, and WebAssembly (Wasm) has become a cornerstone for running computationally intensive code in the browser. The V8 JavaScript engine, used by Google Chrome, recently introduced two powerful optimizations for WebAssembly: speculative call_indirect inlining and deoptimization support. These techniques, shipping with Chrome M137, enable the generation of more efficient machine code by leveraging runtime feedback. The result is a significant speed boost, especially for WasmGC (Garbage Collection) programs—with Dart microbenchmarks showing an average improvement of over 50% and larger applications gaining 1–8% faster execution. Moreover, deoptimization lays the groundwork for future enhancements.

Source: v8.dev

Background: Why Speculative Optimizations Matter

To understand these changes, it helps to look at how JavaScript engines achieve fast execution. V8 relies heavily on speculative optimizations: just-in-time (JIT) compilers make assumptions based on feedback collected from previous runs. For instance, when encountering a + b, if past data shows both are integers, the compiler generates optimized integer addition code instead of the slower generic code needed to handle strings, floats, or objects. If the program later violates those assumptions, V8 performs a deoptimization (or “deopt”)—discarding the optimized code and falling back to unoptimized execution while gathering more feedback for future tiering up.
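The feedback-driven cycle described above can be sketched in a few lines. The following is a minimal, hypothetical simulation (all names are made up, not V8 internals): a call site records operand types, "tiers up" to a specialized integer path once feedback is monomorphic, and deoptimizes back to the generic path when the assumption is violated.

```python
class AddSite:
    """Toy model of a JIT call site for `a + b` with type feedback."""

    def __init__(self):
        self.seen_types = set()   # feedback collected from previous runs
        self.optimized = False    # whether a specialized fast path is installed
        self.deopts = 0

    def call(self, a, b):
        if self.optimized:
            if type(a) is int and type(b) is int:
                return a + b      # specialized integer add, no generic checks
            self.deopts += 1      # assumption violated: deoptimize
            self.optimized = False
        # Generic (unoptimized) path: handles anything, records feedback.
        self.seen_types.add((type(a).__name__, type(b).__name__))
        if self.seen_types == {("int", "int")}:
            self.optimized = True  # tier up: speculate on int + int
        return a + b

site = AddSite()
site.call(1, 2)          # collects feedback, tiers up
assert site.optimized
site.call(3, 4)          # runs the specialized fast path
site.call("x", "y")      # violates the speculation -> deopt
assert site.deopts == 1 and not site.optimized
```

A real engine does this at the machine-code level, discarding compiled code rather than flipping a flag, but the control flow (speculate, guard, deopt, re-collect feedback) is the same.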

Traditionally, WebAssembly did not require such speculative techniques. Wasm 1.0 programs—often compiled from C, C++, or Rust—already had rich static type information available. Their binaries could be highly optimized ahead of time by toolchains like Emscripten (based on LLVM) or Binaryen. As a result, V8 could generate fast code without needing runtime feedback or deopts.

Motivation: The Rise of WasmGC

Why introduce speculative optimizations for WebAssembly now? The key driver is the WebAssembly Garbage Collection (WasmGC) proposal, which extends WebAssembly to better support managed languages such as Java, Kotlin, and Dart. WasmGC bytecode operates at a higher abstraction level than Wasm 1.0: it includes rich types like structs and arrays, subtyping, and type operations. This higher-level information makes inlining—replacing a function call with the function’s body—especially beneficial. However, static analysis alone cannot always predict which function will be called at a given site, particularly with polymorphic dispatch. Speculative inlining, guided by runtime feedback, can choose the most likely target and invalidate that assumption if needed, via deoptimization.
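To see why static analysis falls short here, consider polymorphic dispatch. In this hypothetical Python illustration (the class and function names are invented for the example), the concrete target of `shape.area()` depends on runtime data, so no ahead-of-time compiler can inline it; only runtime feedback can reveal that one target dominates.

```python
import math

class Circle:
    def __init__(self, r):
        self.r = r
    def area(self):
        return math.pi * self.r * self.r

class Square:
    def __init__(self, s):
        self.s = s
    def area(self):
        return self.s * self.s

def total_area(shapes):
    # Each `area` call dispatches indirectly; which body runs is unknown
    # statically, but runtime feedback might show that Circle is the
    # target 95% of the time, making speculative inlining profitable.
    return sum(shape.area() for shape in shapes)

assert total_area([Circle(1.0), Square(2.0)]) == math.pi + 4.0
```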

Technical Implementation: Speculative Inlining and Deoptimization

Speculative call_indirect Inlining

In WebAssembly, indirect calls use the call_indirect instruction, which dispatches to a function based on a table index. Without speculation, the compiler must generate a lookup and indirect jump, which is expensive. With runtime feedback, V8 can observe which function is most frequently called at a given call_indirect site. If a single target dominates—say, 95% of the time—the compiler speculatively inlines that function directly. This eliminates the overhead of the indirect call and enables further optimization of the inlined code.
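The transformation can be sketched as follows. This is a hedged, source-level analogy (the names and the guard shape are illustrative, not V8's actual generated code): the indirect call through the function table is replaced by a cheap guard on the feedback-predicted target plus the inlined body, with the original indirect call kept as the rare fallback.

```python
def square(x):
    return x * x

def double(x):
    return x + x

table = [square, double]          # Wasm-style function table

def call_indirect_unoptimized(idx, x):
    return table[idx](x)          # table lookup + indirect jump every time

LIKELY_IDX = 0                    # runtime feedback: index 0 dominates

def call_indirect_speculative(idx, x):
    if idx == LIKELY_IDX:         # cheap guard on the predicted target
        return x * x              # body of `square`, inlined
    return table[idx](x)          # rare case: fall back to the indirect call

assert call_indirect_speculative(0, 5) == 25   # fast inlined path
assert call_indirect_speculative(1, 5) == 10   # fallback path
```

Once the body is inlined, the compiler can optimize it together with the surrounding code, which is where much of the benefit comes from.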

Deoptimization Support

If the program later calls a different function at that site, the assumption is violated. V8’s new deoptimization mechanism detects this and transitions back to unoptimized code, discarding the speculative inline. The engine then collects fresh feedback and may re-optimize later with updated assumptions. This safety net ensures that speculative optimizations do not break correctness while still delivering speed for common cases.
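The full lifecycle of such a call site can be modeled as below. This is a simplified sketch with invented names and a made-up tier-up threshold: a guard miss discards the speculation, feedback continues to accumulate, and the site may later re-optimize for the new dominant target.

```python
from collections import Counter

class CallSite:
    THRESHOLD = 3                     # calls before tiering up (illustrative)

    def __init__(self, table):
        self.table = table
        self.counts = Counter()       # feedback: which index was called
        self.speculated = None        # currently inlined target index, if any

    def call(self, idx, x):
        if self.speculated is not None and idx != self.speculated:
            self.speculated = None    # deopt: discard the speculative inline
        self.counts[idx] += 1
        if self.speculated is None and self.counts[idx] >= self.THRESHOLD:
            self.speculated = idx     # re-optimize with updated assumptions
        return self.table[idx](x)

site = CallSite([lambda x: x * x, lambda x: x + x])
for _ in range(3):
    site.call(0, 2)                   # feedback tiers the site up for index 0
assert site.speculated == 0
site.call(1, 2)                       # assumption violated -> deopt
assert site.speculated is None
for _ in range(2):
    site.call(1, 2)                   # fresh feedback, re-optimizes
assert site.speculated == 1
```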

Impact and Benchmarks

The combination of speculative inlining and deoptimization yields impressive results. On a suite of Dart microbenchmarks (transpiled to WasmGC), the speedup averages over 50%. For larger, realistic applications—such as those built with Flutter or other frameworks—the improvements range from 1% to 8%. These gains come from reduced indirect call overhead and better inlining decisions. Furthermore, deoptimization is a foundational building block that will enable even more aggressive optimizations in future V8 releases.

Note: These optimizations are especially effective for WasmGC programs, where high-level type information makes speculative inlining more lucrative. For traditional Wasm 1.0 code, the benefits are smaller because static optimization already covers many cases.

Future Directions

Support for deoptimizations opens the door to a host of additional optimizations. V8 can now speculate on value ranges, null checks, array bounds, and more—and recover safely if those speculations fail. This brings WebAssembly execution closer to the dynamism that JavaScript JIT compilers enjoy. As WebAssembly continues to evolve—with proposals like Exception Handling and Stack Switching—speculative techniques will become even more critical.

For developers targeting WebAssembly, especially with managed languages, these improvements mean faster startup and smoother runtime performance without any code changes. The work in V8 demonstrates that combining static and dynamic optimization can push WebAssembly to new heights.

To learn more about the technical details, refer to the V8 blog or the original announcement.
