@Robbepop Robbepop commented Sep 10, 2025

Wasmi IR 3.0 definitions: #1650

Potential optimizations:

  • Use same ip in execution handlers as in op-decode. (🟢 -11% fibonacci/iter)
  • Integrate instance stack in CallStack into frames. (easy)
  • Properly use extra-checks for technically-infallible conditional panics. (easy)
  • Use byte-offsets for Slot type to avoid mul-by-8 during execution. (easy)
  • Return TrapCode::OutOfSystemMemory in Stack operations where necessary.
  • Redesign trampoline-based operator dispatch to avoid state copies. (medium)
  • Use extern "sysv64" ABI for execution handlers on Windows. (medium)
  • Use variable sized cell load/stores for 32-bit operands. (follow-up PR)
  • Make Wasmi IR Instance related instead of Module related. (follow-up PR)
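
To illustrate the byte-offset idea for the Slot type, here is a minimal sketch. It assumes 8-byte stack cells and a 16-bit slot encoding; `IndexSlot`, `ByteSlot`, `load_indexed` and `load_bytes` are hypothetical names for illustration and not the actual Wasmi types. The scaling by 8 moves from every access at execution time to a single checked multiplication at translation time.

```rust
/// Size of one stack cell in bytes (assumption for this sketch).
const CELL_SIZE: u16 = 8;

/// Hypothetical slot that stores a cell index.
/// Every access has to scale the index by the cell size.
struct IndexSlot(u16);

/// Hypothetical slot that stores an already scaled byte offset.
struct ByteSlot(u16);

impl ByteSlot {
    /// Scales the index once, at translation time.
    /// Returns `None` if the byte offset does not fit into 16 bits.
    fn from_index(index: u16) -> Option<Self> {
        index.checked_mul(CELL_SIZE).map(ByteSlot)
    }
}

/// Index based load: the address is `base + index * 8`.
fn load_indexed(stack: &[u64], slot: &IndexSlot) -> u64 {
    stack[usize::from(slot.0)]
}

/// Byte-offset based load: the address is `base + offset`, no multiply.
fn load_bytes(stack: &[u8], slot: &ByteSlot) -> u64 {
    let start = usize::from(slot.0);
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&stack[start..start + 8]);
    u64::from_le_bytes(bytes)
}
```

One trade-off visible in the sketch: with the same 16-bit encoding, byte offsets can address 8 times fewer cells than indices, so the translator has to handle the `None` case.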

Development Notes

  • Interpretation loop must be panic-free for tail calls to be properly generated.
    • This means that we have to wrap host function calls with ::std::panic::catch_unwind and convert potential panics into flags signalling bad state to the interpreter.
    • Furthermore, we need to establish a way to signal to the interpretation loop that we are in an error state (e.g. encountered a bug) and have to exit the interpretation loop safely before resolving the bug, e.g. via panicking.
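
The two notes above could be sketched roughly as follows. This is an illustration under assumptions, not the Wasmi implementation: `HostOutcome`, `call_host_guarded` and `run_host_calls` are hypothetical names, host functions are modelled as plain closures returning `u64`, and `std::panic::catch_unwind` only catches panics when the crate is built with unwinding panics.

```rust
use std::any::Any;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical outcome of a guarded host call.
enum HostOutcome {
    /// The host function returned normally.
    Returned(u64),
    /// The host function panicked. The payload is kept so that the
    /// panic can be re-raised after the loop has been left.
    Panicked(Box<dyn Any + Send>),
}

/// Calls `host_fn` and turns a panic into a value instead of unwinding
/// through the caller.
fn call_host_guarded<F>(host_fn: F) -> HostOutcome
where
    F: FnOnce() -> u64,
{
    match panic::catch_unwind(AssertUnwindSafe(host_fn)) {
        Ok(value) => HostOutcome::Returned(value),
        Err(payload) => HostOutcome::Panicked(payload),
    }
}

/// Stand-in for an interpretation loop: sums up host call results.
/// On a host panic it records the error state, leaves the loop and
/// only then resumes unwinding.
fn run_host_calls(calls: Vec<Box<dyn FnOnce() -> u64>>) -> u64 {
    let mut sum = 0u64;
    let mut pending_panic = None;
    for call in calls {
        match call_host_guarded(call) {
            HostOutcome::Returned(value) => sum = sum.wrapping_add(value),
            HostOutcome::Panicked(payload) => {
                pending_panic = Some(payload);
                break;
            }
        }
    }
    if let Some(payload) = pending_panic {
        panic::resume_unwind(payload);
    }
    sum
}
```

The loop body itself never unwinds; the panic is resolved only after the loop has exited.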

ToDo

Fix known bugs:

  • Fix bugs with branch operator execution.
    • Likely, the branch offsets are incorrect.
  • Fix integer-overflow in Wasmi translator.
  • Fix execution issue with recursive operation sequences.
  • Fix bug in stripping start and end of copy_span values in translator.
  • Fix bugs leading to incorrect reverse_complement result.
  • Fix bugs leading to incorrect regex_redux result.

Implement Functionality

  • Calling host functions from root.
  • Calling host functions from within Wasm functions.
    • call_imported
    • call_indirect
    • return_call_imported
    • return_call_indirect
  • Calling and resuming of resumable functions.
  • Missing Wasmi IR execution handlers
    • br_table operators
    • return_call* operators
    • bulk memory operators
    • bulk table operators
    • simd operators
    • relaxed-simd operators
    • wide-arithmetic operators

Preliminary Benchmarks

Preliminary benchmarks of this WIP state, before any optimization, show interesting results:

execute/fibonacci/iter

| Crate Features | PR | main | % |
|---|---|---|---|
| (none) | 578us | 705us | 🟢 -18% |
| compact | 840us | - | 🔴 +19% |
| trampolines | 1.92ms | - | 🔴 x2.7 |
| compact+trampolines | 1.8ms | - | 🔴 x2.6 |

execute/fibonacci/rec

| Crate Features | PR | main | % |
|---|---|---|---|
| (none) | 2.84ms | 3.77ms | 🟢 -24% |
| compact | 4.10ms | - | 🔴 +8% |
| trampolines | 5.32ms | - | 🔴 +41% |
| compact+trampolines | 6.3ms | - | 🔴 +67% |

execute/fibonacci/tail

| Crate Features | PR | main | % |
|---|---|---|---|
| (none) | 481us | 777us | 🟢 -38% |
| compact | 728us | - | 🟢 -7% |
| trampolines | 1.04ms | - | 🔴 +34% |
| compact+trampolines | 1.26ms | - | 🔴 +62% |

execute/tiny_keccak

| Crate Features | PR | main | % |
|---|---|---|---|
| (none) | 114us | 193us | 🟢 -41% |
| compact | 183us | - | 🟢 -4% |
| trampolines | 421us | - | 🔴 x2.18 |
| compact+trampolines | 360us | - | 🔴 x1.87 |

Benchmarks Conclusion

  • The tail-call-based direct-dispatch outperforms main significantly.
  • The tail-call-based indirect-dispatch is somewhat on par with main.
  • Both trampoline-based dispatches are vastly outperformed by main.
    • Note: this is partly due to poor codegen caused by unnecessary register copies, which should be fixable.
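
For readers unfamiliar with the dispatch styles compared above, here is a minimal sketch of a trampoline-based dispatch loop. It is a toy accumulator machine, not Wasmi code: `State`, `Control`, `Op`, the handlers and `run_trampoline` are all hypothetical. Each handler returns the updated state to a central loop, which then calls the next handler. Passing the state back and forth by value is the kind of copying that the conclusion refers to. The tail-call-based variant is not shown because stable Rust does not offer guaranteed tail calls.

```rust
/// Execution state that is passed to and returned from every handler.
struct State {
    ip: usize,
    acc: u64,
}

/// What a handler tells the trampoline loop to do next.
enum Control {
    Continue(State),
    Done(u64),
}

/// Handler signature: takes the state by value and hands it back.
type Handler = fn(&[Op], State) -> Control;

/// A decoded operator: its handler plus one immediate value.
struct Op {
    handler: Handler,
    imm: u64,
}

/// Adds the immediate to the accumulator.
fn op_add(ops: &[Op], s: State) -> Control {
    let imm = ops[s.ip].imm;
    Control::Continue(State { ip: s.ip + 1, acc: s.acc.wrapping_add(imm) })
}

/// Jumps back to the first operator while the accumulator is below the immediate.
fn op_loop_below(ops: &[Op], s: State) -> Control {
    let ip = if s.acc < ops[s.ip].imm { 0 } else { s.ip + 1 };
    Control::Continue(State { ip, acc: s.acc })
}

/// Stops execution and yields the accumulator.
fn op_halt(_ops: &[Op], s: State) -> Control {
    Control::Done(s.acc)
}

/// The trampoline: a loop that bounces between handlers.
fn run_trampoline(ops: &[Op]) -> u64 {
    let mut state = State { ip: 0, acc: 0 };
    loop {
        let handler = ops[state.ip].handler;
        match handler(ops, state) {
            Control::Continue(next) => state = next,
            Control::Done(value) => return value,
        }
    }
}
```

In a tail-call-based design each handler would instead jump directly to the next handler, so the state can stay in registers across operators.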

@Robbepop Robbepop marked this pull request as draft September 10, 2025 08:21
@Robbepop Robbepop mentioned this pull request Sep 11, 2025
@Robbepop Robbepop force-pushed the rf-integrate-wasmi-ir-3.0 branch 7 times, most recently from 41aad7c to bb8b966 Compare September 18, 2025 10:11
@Robbepop Robbepop force-pushed the rf-integrate-wasmi-ir-3.0 branch 2 times, most recently from 12a19d4 to 64f420e Compare September 19, 2025 14:54
@Robbepop Robbepop force-pushed the rf-integrate-wasmi-ir-3.0 branch 2 times, most recently from 48242ca to 4d2d316 Compare October 4, 2025 10:55
@Robbepop Robbepop force-pushed the rf-integrate-wasmi-ir-3.0 branch from 7bfb899 to c32c7c7 Compare October 24, 2025 13:52
@Robbepop Robbepop force-pushed the rf-integrate-wasmi-ir-3.0 branch 3 times, most recently from 7b54b34 to 7619847 Compare November 1, 2025 14:59
This state either signals that there is an external reason for the break or that there was a trap with a specific trap code.
This makes use of TrapCode encoding within the Break type which allows for better modularity within execution handler logic.
This is going to replace the IntoTrapResult utility trait.
Benchmarks show 2-3% performance increase on the tiny_keccak case.
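
One way such an encoding could look is sketched below. This is only an assumption about the general shape, not the actual Wasmi definition: the `Break` layout, the value 0 for the external case, the `TrapCode` subset and the `i32_div_s` handler are all hypothetical.

```rust
/// Hypothetical subset of trap codes, each with a non-zero encoding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum TrapCode {
    UnreachableCodeReached = 1,
    IntegerDivisionByZero = 2,
    IntegerOverflow = 3,
}

/// Hypothetical break state: 0 means "external reason",
/// any other value is an encoded trap code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Break(u8);

impl Break {
    /// The break was caused by something outside the handler logic.
    const EXTERNAL: Self = Break(0);

    /// Encodes a trap code directly into the break state.
    fn trap(code: TrapCode) -> Self {
        Break(code as u8)
    }

    /// Decodes the trap code, if this break represents a trap.
    fn trap_code(self) -> Option<TrapCode> {
        match self.0 {
            1 => Some(TrapCode::UnreachableCodeReached),
            2 => Some(TrapCode::IntegerDivisionByZero),
            3 => Some(TrapCode::IntegerOverflow),
            _ => None,
        }
    }
}

/// Example handler logic: signed division that reports traps via `Break`.
fn i32_div_s(lhs: i32, rhs: i32) -> Result<i32, Break> {
    if rhs == 0 {
        return Err(Break::trap(TrapCode::IntegerDivisionByZero));
    }
    lhs.checked_div(rhs)
        .ok_or(Break::trap(TrapCode::IntegerOverflow))
}
```

With this shape, handler logic can return `Result<T, Break>` and use `?` directly, without a separate conversion trait.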