@cfallin cfallin commented Jul 25, 2025

This PR adds support for the Wasm exception-handling proposal, which introduces a conventional try/catch mechanism to WebAssembly. The PR supports modules that use try_table to register handlers for a lexical scope, and provides throw and throw_ref that allocate (in the first case) and throw exception objects.

This PR builds on top of the work in #10510 for Cranelift-level exception support, #10919 for an unwinder, and #11230 for exception objects built on top of GC, in addition to a bunch of smaller fixes and enabling PRs around those.

cfallin commented Jul 25, 2025

Logistical note: I'm posting this as a draft now to get early feedback and because I know folks are waiting to see how it is shaping up. I'm on vacation for two weeks starting now (back Mon Aug 11) and will plan to polish then. I'm hoping to actually get host-boundary integration built as well, if I can, in this PR, to enable spec-tests, but if that turns out to be too much then it will come right after. Following that, fuzzing is the only piece that remains, I think.

@cfallin cfallin force-pushed the wasm-exceptions branch 6 times, most recently from 88e7b7f to 033989a July 26, 2025 02:05
@github-actions github-actions bot added the cranelift and cranelift:area:aarch64 labels Jul 26, 2025

@alexcrichton alexcrichton left a comment

I've left some high-level thoughts here and there, but definitely feel free to defer anything to issues as you feel appropriate.

@fitzgen fitzgen left a comment

Shaping up nicely!

We had talked elsewhere about removing the exception composite type variant and having tags refer to just their function type (but we would now optionally have a GC struct layout for a function type's parameters for use with exceptions). This would better align us with the Wasm spec, mean fewer new additions to the types registry code, and also mean fewer interactions at runtime with its locking and tables and all that. Are you still planning on pursuing this?

cfallin commented Jul 29, 2025

> We had talked elsewhere about removing the exception composite type variant and having tags refer to just their function type (but we would now optionally have a GC struct layout for a function type's parameters for use with exceptions). This would better align us with the Wasm spec, mean fewer new additions to the types registry code, and also mean fewer interactions at runtime with its locking and tables and all that. Are you still planning on pursuing this?

Ah, sorry, I hadn't made a note in the PR message here, but: I tried and abandoned that path. (Or more precisely, having exception layouts hang off of the function type; TagType is already a thin newtype wrapper around FuncType.) At a high level, pulling on that string seemed to unwind way too much structure and hit too many places that really still wanted a concrete type for an exception object. If you're curious, my WIP branch is here (still has many type-errors, mid-refactor). In essence I think that path leads to more complexity, not less, unfortunately.

> also fewer interactions at runtime with its locking and tables and all that

The current implementation performs no locking and no accesses to the type registry at runtime; it uses the dynamic context mechanism in Cranelift to get straight to the instance, then looks up tags (VMTagDefinitions) and compares tag IDs (instance-id/defined-tag-index). In particular, the tag information is at a constant offset in the exception object (similar to array lengths), so we don't need a layout to write the generic path.
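
To make the tag comparison described above concrete, here is a minimal sketch of how handler matching could work when a tag's identity is an (instance-id, defined-tag-index) pair stored in a fixed header of the exception object. All type and function names here (TagId, ExnObject, Handler, find_handler) are hypothetical and only model the idea; they are not Wasmtime's actual definitions.

```rust
// Hypothetical model: names and layout are illustrative, not Wasmtime's types.

/// Identity of a tag: the instance that defined it plus its index among
/// that instance's defined tags.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TagId {
    instance_id: u32,
    defined_tag_index: u32,
}

/// An exception object. The tag identity lives in a fixed header, ahead of
/// the payload, so it can be read without knowing the payload's layout.
struct ExnObject {
    tag: TagId,
    payload: Vec<u64>,
}

/// A handler registered for a scope: `Some(tag)` catches exactly one tag,
/// `None` models a catch-all clause.
struct Handler {
    tag: Option<TagId>,
    target: &'static str,
}

/// Returns the target of the first handler that matches the thrown exception.
/// Matching is a plain comparison of tag identities: no registry lookup.
fn find_handler(handlers: &[Handler], exn: &ExnObject) -> Option<&'static str> {
    handlers
        .iter()
        .find(|h| h.tag.map_or(true, |t| t == exn.tag))
        .map(|h| h.target)
}
```

Note that two tags with the same defined-tag-index but different instance IDs compare as different in this model, which is why both components are part of the identity.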

(Still on PTO but will respond occasionally to keep review moving)

@cfallin cfallin force-pushed the wasm-exceptions branch 2 times, most recently from 44283d9 to d592c51 August 12, 2025 01:16
@github-actions github-actions bot added the fuzzing label Aug 12, 2025

cfallin commented Aug 15, 2025

@alexcrichton @fitzgen I've now done the refactor we discussed and also added support for Wasm-to-host, host-to-Wasm, and host-to-host (via Func::new -> Func::call) exception throws. The remaining bit to address is Pulley support (I suspect this should be relatively straightforward now?) and adding assert_exception to our WAST runner then enabling spec-tests. I believe the core should be ready for another pass.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Aug 15, 2025
This commit does some preparatory refactoring for bytecodealliance#11326 to ensure that
a store is available when trap information is being processed. Currently
this doesn't leverage the new parameter but it should be leverage-able
in bytecodealliance#11326.
@cfallin cfallin force-pushed the wasm-exceptions branch 2 times, most recently from 05725e5 to 6275df6 August 15, 2025 22:26

cfallin commented Aug 15, 2025

I've refactored on top of #11441 and this is indeed much nicer; I've removed the From impls that would otherwise pass through boxed exceptions from hostcalls without converting to a pending exception, so I'm more confident we won't miss any additional hostcall paths now.

The current challenge is dealing with rooting: the refactor has placed the point at which we capture the Rooted<ExnRef> (refactor to wrap that up in a wasmtime::Exception TBD) outside the root handle scope for the hostcall, so any newly allocated exception from the host becomes unrooted. Pulling this string further may require additional refactors to put the LIFO-scope management in common across all hostcalls; just starting to look into this. More thoughts (or a pairing session if that'd be easier) welcome, @alexcrichton...

@alexcrichton

Oh good point! I think it'd be reasonable to move this management into the trap-handling bits (or around the enter_host_from_wasm bits). That future-proofs this for component-model-gc support and additionally enables eventually cleaning up libcalls that currently create their own scopes, as they would no longer need to.

cfallin commented Aug 15, 2025

Actually, I think there are deeper issues with rooting, exceptions, and Result / ? ergonomics that we didn't really address in the RFC, and I'm finding I'm bumping into these design questions now. The current state of this PR has working-but-for-unrooted-ref-panics examples of all directions of exception transfer in cargo test --test all -- exceptions, demonstrating a few examples of the issue.

The basic question is: if we return an exception as a GC object boxed in a Rooted<ExnRef> (eventually inside a nicely-packaged wasmtime::Exception type) in the Err arm of a Result, how should we expect this to interact with handle scopes?

The most basic example is a host function that allocates an ExnRef and returns Err(exnref.into()), and has no other scopes of its own. We can fix up this PR by moving the root scope teardown outside of the point that we root a pending exception on the store, so that can work fine.

But in the general case, with user host code creating nested RootScopes, we cannot really expect automatic ? passing-up-the-Err to work well. Consider:

fn my_hostcall(...) -> Result<()> {
  let mut scope = RootScope::new(&mut caller);
  helper(&mut scope)?;
  Ok(())
}

fn helper(ctx: impl AsContextMut<'_>, ...) -> Result<()> {
  return Err(ExnRef::new(...).into());
}

This will typecheck fine (ExnRef::new() returns a Rooted<ExnRef>, which we currently take as an Error impl directly; we can wrap it up in wasmtime::Exception once I do that), but the exception object will be unrooted once the ? propagates it past the user's own RootScope.
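
The failure mode can be reproduced in a toy model, sketched here under the assumption that roots live on a LIFO stack and that leaving a scope truncates the stack back to where it was on entry. The Store, Rooted, and with_scope names below are hypothetical stand-ins and not the wasmtime API; in particular this model reports a stale handle as None, whereas the thread above describes a dynamic unrooted-reference panic.

```rust
// Toy model only: `Store`, `Rooted`, and `with_scope` are hypothetical
// stand-ins for illustration, not the wasmtime API.

/// A handle into the store's LIFO root stack.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Rooted {
    index: usize,
}

#[derive(Default)]
struct Store {
    /// LIFO stack of rooted objects (strings stand in for GC objects).
    roots: Vec<String>,
}

impl Store {
    /// Roots a new object in the innermost active scope.
    fn root(&mut self, payload: &str) -> Rooted {
        self.roots.push(payload.to_string());
        Rooted { index: self.roots.len() - 1 }
    }

    /// Resolves a handle; `None` means the handle has been unrooted.
    /// (A real implementation would also guard against index reuse.)
    fn get(&self, r: Rooted) -> Option<&str> {
        self.roots.get(r.index).map(String::as_str)
    }

    /// Runs `f` inside a root scope: everything rooted by `f` is dropped
    /// when the scope exits, regardless of what `f` returns.
    fn with_scope<T>(&mut self, f: impl FnOnce(&mut Store) -> T) -> T {
        let mark = self.roots.len();
        let result = f(self);
        self.roots.truncate(mark);
        result
    }
}

/// Allocates an "exception" and returns it in the `Err` arm.
fn helper(store: &mut Store) -> Result<(), Rooted> {
    Err(store.root("exception payload"))
}

/// Propagates the error with `?` from inside a nested scope. This compiles,
/// but the handle inside the `Err` is stale by the time the caller sees it.
fn my_hostcall(store: &mut Store) -> Result<(), Rooted> {
    store.with_scope(|s| -> Result<(), Rooted> {
        helper(s)?;
        Ok(())
    })
}
```

Calling helper directly yields a handle that still resolves, while the handle returned through my_hostcall does not, even though both functions have the same signature. Nothing in the types distinguishes the two cases.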

More generally, Err-carried types should expect to own their error information. Our lack of lifetimes to denote valid scope is an explicit ergonomic choice, which is great in general, but hides the fact that we will get a dynamic scope-violation (unrooting) error here.

Should we use a ManuallyRooted instead? I think that's much worse -- it's way too easy to leak, since its Drop impl doesn't unroot (because it can't, because it doesn't have a mut borrow of the Store -- again a totally valid choice that is nice in 99% of cases).

It's somewhat tempting to say "don't do that" to all of this -- it's no different than returning any GC ref type at all. If we take this option, we just fix the exact extent of the root scope when calling out to the host, and we root any returned exception when it comes back (concretely: put the HostResult conversion inside the root scope somehow, probably by pulling the scope out to catch_unwind_and_record_trap). My broader concern here is that I'm not sure that it has the nice error-propagation properties that we expected. Basically, it's a nonlocal/non-composable footgun: anyone creating a scope anywhere inside complex host logic could cause an exception created deep inside it to blow up dynamically with a panic.

One alternative is to have an explicit store.throw(exn) as part of the public API, and then provide a public-facing tombstone type that the user can propagate upward through Results. This does make things a little weird in other ways, because it's implicit state (we could have an API around "cancel pending exception" too if someone wants to "catch" it in host code), but it feels like the least footgunny option overall. Basically it's extending what we decided to do internally (root the exception on the store, safely) to user code.

The last option does differ from what we agreed to in the RFC (but the RFC is underspecified with respect to rooting behavior in general, on the other hand) -- so I'd want some consensus here, at least, before building that. Any thoughts?

@alexcrichton

Good points!

> Basically, it's a nonlocal/non-composable footgun

To some degree this is inherently true of Rooted and ManuallyRooted. As you point out, there are big downsides to using both. That being said, I do agree this is particularly exacerbated by the nature of "just propagate the error upwards please", which makes it more likely an exception might fly all the way up past whatever scopes are in use.

> One alternative is to have an explicit store.throw(exn) as part of the public API

I like this approach personally. What I might recommend is something like StoreContextMut::throw(&mut self, &Rooted<ExnRef>) -> wasmtime::ThrownException, so that the user doesn't have to create the tombstone themselves and can immediately return that error up the stack.
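
A rough sketch of the tombstone idea follows, assuming the store keeps a single pending-exception slot. The Store, ExnRef, and take_pending_exception names are hypothetical and simplified; this models the proposal in the thread rather than whatever API eventually landed.

```rust
// Hypothetical sketch: `Store`, `ExnRef`, and `ThrownException` are toy
// stand-ins for illustration, not the types in the wasmtime crate.

/// Stand-in for a GC-managed exception object.
#[derive(Debug, PartialEq)]
struct ExnRef(String);

/// Tombstone error: it carries no GC reference, so it stays valid no matter
/// how many root scopes it is propagated through.
#[derive(Debug, PartialEq)]
struct ThrownException;

#[derive(Default)]
struct Store {
    /// The exception currently in flight, owned (rooted) by the store.
    pending: Option<ExnRef>,
}

impl Store {
    /// Parks the exception on the store and hands back the tombstone.
    fn throw(&mut self, exn: ExnRef) -> ThrownException {
        self.pending = Some(exn);
        ThrownException
    }

    /// Lets host code "catch" the exception by taking it back out.
    fn take_pending_exception(&mut self) -> Option<ExnRef> {
        self.pending.take()
    }
}

fn helper(store: &mut Store) -> Result<(), ThrownException> {
    Err(store.throw(ExnRef("boom".to_string())))
}

/// `?` now propagates only the tombstone; the exception itself stays put.
fn my_hostcall(store: &mut Store) -> Result<(), ThrownException> {
    helper(store)?;
    Ok(())
}
```

The trade-off named in the thread is visible here: the error value no longer says which exception was thrown, so the store's pending slot is implicit state that callers have to know about.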

cfallin commented Aug 16, 2025

> wasmtime::ThrownException so that way the user doesn't have to create the tombstone themselves and can immediately return that error upwards the stack.

Indeed, that was my thought as well. I suppose one could actually also have a signature throw(...) -> Result<Uninhabited, ThrownException> so that one could do store.throw(exn)?; -- or both, thrown_exception(...) -> ThrownException and throw(...) -- but at this point I'm bikeshedding a bit. I'll prototype this to see how it feels.
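
The store.throw(exn)?; shape can be sketched with std::convert::Infallible playing the role of the uninhabited Ok type. This is an assumption for illustration: the comment above only says "Uninhabited", and the toy Store and checked_div below are hypothetical.

```rust
use std::convert::Infallible;

/// Tombstone error, as in the proposal above (toy version).
#[derive(Debug, PartialEq)]
struct ThrownException;

#[derive(Default)]
struct Store {
    /// A string stands in for the pending exception object.
    pending: Option<String>,
}

impl Store {
    /// The `Ok` type has no values, so this can only ever return `Err`,
    /// and `store.throw(..)?` never continues past the `?`.
    fn throw(&mut self, exn: &str) -> Result<Infallible, ThrownException> {
        self.pending = Some(exn.to_string());
        Err(ThrownException)
    }
}

/// Hypothetical host function that throws on division by zero.
fn checked_div(store: &mut Store, a: i32, b: i32) -> Result<i32, ThrownException> {
    if b == 0 {
        store.throw("division by zero")?;
    }
    Ok(a / b)
}
```

Compared with a throw that returns the tombstone directly, this form saves wrapping the call in Err(...) at each throw site, at the cost of a slightly unusual signature.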

cfallin commented Aug 20, 2025

Ah, sorry about that! Bad habit trying to keep a clean commit history. Last diff was

diff --git a/crates/wasmtime/src/runtime/vm/libcalls.rs b/crates/wasmtime/src/runtime/vm/libcalls.rs
index de80727fa9..19a18af8ea 100644
--- a/crates/wasmtime/src/runtime/vm/libcalls.rs
+++ b/crates/wasmtime/src/runtime/vm/libcalls.rs
@@ -1612,7 +1612,10 @@ fn raise(store: &mut dyn VMStore, _instance: Pin<&mut Instance>) {
     // When Cranelift isn't in use then this is an unused libcall for Pulley, so
     // just insert a stub to catch bugs if it's accidentally called.
     #[cfg(not(has_host_compiler_backend))]
-    unreachable!()
+    {
+        let _ = store;
+        unreachable!()
+    }
 }

 // Builtins for continuations. These are thin wrappers around the
diff --git a/crates/wasmtime/src/runtime/vm/throw.rs b/crates/wasmtime/src/runtime/vm/throw.rs
index 2cf28c5caa..a6cf333a94 100644
--- a/crates/wasmtime/src/runtime/vm/throw.rs
+++ b/crates/wasmtime/src/runtime/vm/throw.rs
@@ -115,9 +115,10 @@ pub unsafe fn compute_throw_action(store: &mut dyn VMStore) -> ThrowAction {
         }
         None
     };
+    let unwinder = nogc.unwinder();
     let action = unsafe {
         wasmtime_unwinder::compute_throw_action(
-            &wasmtime_unwinder::UnwindHost,
+            unwinder,
             handler_lookup,
             exit_pc,
             exit_trampoline_fp,

cfallin and others added 7 commits August 20, 2025 15:10

Get tests passing on pulley platforms

* Add a check to `supports_host` for the generated test and assert
  failure also when that is false.
* Remove `pulley_unsupported` test as it falls out of `#[wasmtime_test]`
* Remove `exceptions_store` helper as it falls out of `#[wasmtime_test]`
* Remove miri annotations as they fall out of `#[wasmtime_test]`

Skip some unsupported tests entirely in `#[wasmtime_test]`

If the selected compiler doesn't support the host at all then there's no
need to run it. Actually running it could misinterpret `CraneliftNative`
as "run with pulley" otherwise, so avoid such false negatives.
@alexcrichton

While there may be more than one segfaulting test I can reproduce exceptions::craneliftnative_basic_throw failing with this stack trace:

#0  0x000002aa022684a8 in wasmtime_environ::module::Module::defined_tag_index () at crates/environ/src/module.rs:528
#1  0x000002aa022768a0 in wasmtime::runtime::vm::instance::Instance::get_exported_tag () at crates/wasmtime/src/runtime/vm/instance.rs:676
#2  0x000002aa020592fe in wasmtime::runtime::vm::throw::compute_throw_action::{closure#0} () at crates/wasmtime/src/runtime/vm/throw.rs:99
#3  0x000002aa02058a6e in wasmtime_internal_unwinder::throw::compute_throw_action::{closure#0}<wasmtime::runtime::vm::throw::compute_throw_action::{closure_env#0}> () at crates/unwinder/src/throw.rs:76
#4  0x000002aa0236f44c in wasmtime_internal_unwinder::stackwalk::visit_frames<wasmtime_internal_unwinder::throw::ThrowAction, wasmtime_internal_unwinder::throw::compute_throw_action::{closure_env#0}<wasmtime::runtime::vm::throw::compute_throw_action::{closure_env#0}>> () at crates/unwinder/src/stackwalk.rs:203
#5  0x000002aa020588b6 in wasmtime_internal_unwinder::throw::compute_throw_action<wasmtime::runtime::vm::throw::compute_throw_action::{closure_env#0}> () at crates/unwinder/src/throw.rs:60
#6  0x000002aa020613f8 in wasmtime::runtime::vm::throw::compute_throw_action () at crates/wasmtime/src/runtime/vm/throw.rs:120
#7  0x000002aa02387efc in wasmtime::runtime::vm::traphandlers::call_thread_state::CallThreadState::record_unwind () at crates/wasmtime/src/runtime/vm/traphandlers.rs:819
#8  0x000002aa0241cca2 in wasmtime::runtime::vm::traphandlers::catch_unwind_and_record_trap::{closure#1}<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::instance::{impl#0}::enter_host_from_wasm::{closure_env#0}<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::libcalls::raw::throw_ref::{closure_env#0}>> ()
    at crates/wasmtime/src/runtime/vm/traphandlers.rs:136
#9  0x000002aa0245e3b8 in wasmtime::runtime::vm::traphandlers::tls::with<(), wasmtime::runtime::vm::traphandlers::catch_unwind_and_record_trap::{closure_env#1}<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::instance::{impl#0}::enter_host_from_wasm::{closure_env#0}<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::libcalls::raw::throw_ref::{closure_env#0}>>> () at crates/wasmtime/src/runtime/vm/traphandlers.rs:1394
#10 0x000002aa02417ecc in wasmtime::runtime::vm::traphandlers::catch_unwind_and_record_trap<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::instance::{impl#0}::enter_host_from_wasm::{closure_env#0}<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::libcalls::raw::throw_ref::{closure_env#0}>> () at crates/wasmtime/src/runtime/vm/traphandlers.rs:136
#11 0x000002aa021466e0 in wasmtime::runtime::vm::instance::Instance::enter_host_from_wasm<core::result::Result<(), wasmtime::runtime::vm::traphandlers::TrapReason>, wasmtime::runtime::vm::libcalls::raw::throw_ref::{closure_env#0}> () at crates/wasmtime/src/runtime/vm/instance.rs:265
#12 0x000002aa02132a9a in wasmtime::runtime::vm::libcalls::raw::throw_ref () at crates/wasmtime/src/runtime/vm/libcalls.rs:125
#13 0x000075fbea7486cc in ?? ()
#14 0x000075fbea7480da in ?? ()
#15 0x000075fbea74813c in ?? ()
#16 0x000075fbea74821c in ?? ()
#17 0x000002aa016c9026 in wasmtime::runtime::vm::vmcontext::VMFuncRef::array_call_native () at crates/wasmtime/src/runtime/vm/vmcontext.rs:961
#18 0x000002aa016c8f66 in wasmtime::runtime::vm::vmcontext::VMFuncRef::array_call () at crates/wasmtime/src/runtime/vm/vmcontext.rs:918
#19 0x000002aa018f2082 in wasmtime::runtime::func::{impl#1}::call_unchecked_raw::{closure#0}<()> () at crates/wasmtime/src/runtime/func.rs:1025
#20 0x000002aa01102302 in wasmtime::runtime::vm::traphandlers::catch_traps::{closure#0}::call_closure<wasmtime::runtime::func::{impl#1}::call_unchecked_raw::{closure_env#0}<()>> ()
    at crates/wasmtime/src/runtime/vm/traphandlers.rs:463

Given that it's s390x-only, though, it's probably an endianness issue: perhaps something with a little-endian load needs to be native-endian? Or vice versa? Or maybe a store on the host needs to be little-endian instead of native-endian?
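
For readers unfamiliar with this class of bug, the following small sketch shows how a little-endian read and a native-endian read of the same bytes diverge on a big-endian host such as s390x. The helper names are hypothetical; only the standard-library conversions are real.

```rust
/// Interprets the bytes as little-endian, regardless of the host.
fn read_le_u32(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}

/// Interprets the bytes in the host's native byte order.
fn read_ne_u32(bytes: [u8; 4]) -> u32 {
    u32::from_ne_bytes(bytes)
}
```

On a little-endian host the two functions agree, so a mismatch between the writer's and the reader's choice goes unnoticed there and only shows up on big-endian targets.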

cfallin commented Aug 21, 2025

Yep, it's almost certainly endianness -- taking a look!

cfallin commented Aug 21, 2025

Actually, it's an issue with dynamic context reads during the stack-walk, it seems -- s390x ABI is a bit different (stackchains rather than FP-chains) so I suspect this is the issue.

cfallin commented Aug 21, 2025

s390x turned out to expose a mismatch in the definition of "spillslot offset" for the dynamic context -- the accessor I had exposed returns offset from the fixed storage area, which is ordinarily at SP+0 unless there is an outgoing args area. s390x always has an outgoing args area (per ABI). A win for ISA diversity wrt testing!
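
The offset mismatch can be written out as arithmetic. This is a sketch under the assumption, taken from the comment above, that the fixed storage area starts immediately after the outgoing-args area at the bottom of the frame. The FrameLayout type, its field names, and the byte sizes used with it are hypothetical.

```rust
/// Hypothetical frame description; names and sizes are illustrative only.
struct FrameLayout {
    /// Bytes reserved at the bottom of the frame (starting at SP) for
    /// outgoing arguments. Zero when the function has no such area.
    outgoing_args_size: u32,
    /// Offset of the dynamic-context slot, measured from the start of the
    /// fixed storage area rather than from SP.
    context_slot_offset: u32,
}

impl FrameLayout {
    /// Converts the slot offset to an SP-relative offset by accounting for
    /// the outgoing-args area that sits below the fixed storage area.
    fn context_slot_sp_offset(&self) -> u32 {
        self.outgoing_args_size + self.context_slot_offset
    }
}
```

When outgoing_args_size is zero the two offsets coincide, which is why treating the fixed-storage-relative offset as SP-relative works until a target always reserves an outgoing-args area.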

@github-actions github-actions bot added the cranelift:area:machinst, wasmtime:api, and wasmtime:config labels Aug 21, 2025
@cfallin cfallin added this pull request to the merge queue Aug 21, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 21, 2025
@alexcrichton alexcrichton added this pull request to the merge queue Aug 21, 2025
Merged via the queue into bytecodealliance:main with commit 2d25f86 Aug 21, 2025
168 checks passed
@cfallin cfallin deleted the wasm-exceptions branch August 21, 2025 03:21
bongjunj pushed a commit to prosyslab/wasmtime that referenced this pull request Oct 20, 2025
* WebAssembly exception-handling support.

This PR introduces support for the [Wasm exception-handling proposal],
which introduces a conventional try/catch mechanism to WebAssembly. The
PR supports modules that use `try_table` to register handlers for a
lexical scope; and provides `throw` and `throw_ref` that allocate (in
the first case) and throw exception objects.

This PR builds on top of the work in bytecodealliance#10510 for Cranelift-level
exception support, bytecodealliance#10919 for an unwinder, and bytecodealliance#11230 for exception
objects built on top of GC, in addition to a bunch of smaller fixes and
enabling PRs around those.

[Wasm exception-handling proposal]: https://github.com/WebAssembly/exception-handling/

prtest:full

* Permit UnwindToWasm to have unused fields in Pulley builds (for now).

* Resolve miri-caught reborrowing issue.

* Ignore exceptions tests in miri for now (Pulley not supported).

* Use wasmtime_test on exceptions tests.

* Get tests passing on pulley platforms

* Add a check to `supports_host` for the generated test and assert
  failure also when that is false.
* Remove `pulley_unsupported` test as it falls out of `#[wasmtime_test]`
* Remove `exceptions_store` helper as it falls out of `#[wasmtime_test]`
* Remove miri annotations as they fall out of `#[wasmtime_test]`

* Remove dead import

* Skip some unsupported tests entirely in `#[wasmtime_test]`

If the selected compiler doesn't support the host at all then there's no
need to run it. Actually running it could misinterpret `CraneliftNative`
as "run with pulley" otherwise, so avoid such false negatives.

* Cranelift: dynamic contexts: account for outgoing-args area.

---------

Co-authored-by: Alex Crichton <[email protected]>
