ABI-stabilize `core::task::Waker` #3404

p-avital · 2023-03-28T19:07:22Z

Implementing FFI-safe futures for stabby has given me the opportunity to find out just how hard Wakers are to deal with when attempting to pass futures across the FFI boundary.

This RFC aims to make dealing with them not only much easier, but also much more performant than currently possible, removing a tree-sized splinter from the thumb of projects that need to pass futures across the FFI boundary.

Disclaimer: this RFC proposes breaking API for completeness' sake only: my firm opinion is that Rust's backward compatibility guarantees should be protected, and the proposed non-breaking solution's runtime cost, while existent, is likely to become negligible as soon as the common executors switch to the newly proposed constructor for waker vtables.

Rendered

workingjubilee · 2023-03-28T20:34:01Z

A library break in types is, generally, strictly unacceptable, and has already been foreclosed by previous RFCs, unless the matter concerns a soundness violation or it is the only path forward for a language evolution. Editions are not permitted to change the standard library API in the way you propose, because of the Epoch RFC that defined them:

To be crystal clear: Rust compilers must support all extant editions, and a crate dependency graph may involve several different editions simultaneously. Thus, editions do not split the ecosystem nor do they break existing code.

p-avital · 2023-03-28T20:36:18Z

A library break in types is, generally, strictly unacceptable, and has already been foreclosed by previous RFCs, unless the matter concerns a soundness violation or it is the only path forward for a language evolution. Editions are not permitted to change the standard library API in the way you propose, because of the Epoch RFC that defined them:

To be crystal clear: Rust compilers must support all extant editions, and a crate dependency graph may involve several different editions simultaneously. Thus, editions do not split the ecosystem nor do they break existing code.

Then it's a good thing that's not the one I consider best (or even good). The proposed implementation PR opts for the stable union implementation

workingjubilee · 2023-03-28T20:40:51Z

Indeed. I have thoughts about that as well but I figured I would give you a heads up that an RFC proposing anything that even resembles a breakage (or even a stabilization that cannot later be undone!) should be responsive and consider the concerns enumerated in RFC 2502, RFC 1105, and RFC 1122.

p-avital · 2023-03-28T22:55:42Z

I'll edit the RFC to remove the API break option, since it really isn't one :)

oxalica

I don't like the idea to freeze Waker fns to extern "C" fn.

Enforcing extern "C" in vtable benefits only FFI scenario. More usual non-FFI code doesn't really benefits from it.
Instead, non-FFI code needs to write extra extern "C", and are deoptimized due to implementation limitation: extra indirection call, missed extern-Rust specific future optimization and etc.
It also introduces future compatibility concerns, since we would never change the ABI anymore even after edition bumps (ABI-stabilize core::task::Waker #3404 (comment)).

oxalica · 2023-03-29T12:12:41Z

text/0000-stable-wakers.md

+
+Futures have now become a core feature of Rust, and returning boxed futures from trait-methods has become a very common pattern when working with `async`.
+
+However, due to Rust's unstable default ABI, trait-objects are never FFI-safe (note that this doesn't stop users trying). Multiple crates (such as [`abi_stable`](https://crates.io/crates/abi_stable) and [`stabby`](https://crates.io/crates/stabby)) attempt to solve this problem by providing proc-macros that generate stable v-tables for the traits they annotate.


It's worth mentioning async-ffi which covers exactly the FFI-compatible waker scenario.

Thanks, I wasn't aware of this one, I shall look it up this instant :)

After checking it out, it uses the first technique I envisioned for stabby: allocating a new Waker on poll. stabby now only allocates on the first call to waker.clone()

oxalica · 2023-03-29T12:15:04Z

text/0000-stable-wakers.md

+		unsafe extern "C" fn clone_adapter(clone: unsafe fn(*const ()) -> RawWaker, data: *const ()) -> RawWaker {
+			clone(data)
+		}
+		unsafe extern "C" fn other_adapter(other: unsafe fn(*const ()), data: *const ()) {
+			other(data)
+		}


This introduces indirection penalty to all existing ordinary async code. It sounds unacceptable to me.

Unfortunately, I see no other way of guaranteeing that calling unsafe fn is safe to call as soon as futures can come from the FFI (even if they themselves are ABI-stable).

I think the penalty is negligible considering that the adapters will likely stay in cache, and wakers typically have work that involves either locking or lock-less data structures, both being likely to strongly outweigh the cost of that indirection. I arguably don't have any measure to back this assumption up.

I think the penalty is negligible considering that the adapters will likely stay in cache, and wakers typically have work that involves either locking or lock-less data structures, both being likely to strongly outweigh the cost of that indirection. I arguably don't have any measure to back this assumption up.

It's still not zero-cost for majority non-FFI users.

Also if you think extra indirection is acceptable, then so does the indirection penalties of current vtable wrapping technique. But that is one of reasons why you want this RFC.

I will study async-ffi further to see if their technique bypasses that, but my issue with the current waker isn't the indirection so much as the need to allocate upon cloning the waker... I'll come back to you once I've explored this again.

In the case of async-ffi, the allocation happens as soon as poll is called, regardless of whether poll would've clone the waker or not.

text/0000-stable-wakers.md

p-avital · 2023-03-29T16:32:16Z

I don't like the idea to freeze Waker fns to extern "C" fn.

Enforcing extern "C" in vtable benefits only FFI scenario. More usual non-FFI code doesn't really benefits from it.

Instead, non-FFI code needs to write extra extern "C", and are deoptimized due to implementation limitation: extra indirection call, missed extern-Rust specific future optimization and etc.

I'm not particularly attached to extern "C", as long as the calling convention is stable, any would do, but I'm no expert in these. Would fastcall be good? it should keep the *const () in register which, I think, is as good as we can have it. Nothing I can do about indirection with this proposal though...

It also introduces future compatibility concerns, since we would never change the ABI anymore even after edition bumps (#3404 (comment)).

Don't all constructors on structures that may at some point require new fields that can't be defaulted have the same future compatibility concern? Arguably this one is more subtle due to the fact that fields shouldn't be reordered once done, but that's a minor concern with fields this uniform.

Alternative proposal (throwing spaghetti at the wall here):

Wakers could be made generic over the vtable layout

trait WakerVTable {
  type CloneFn: Fn(*const ()) -> RawWaker<Self>;
  type OtherFn: Fn(*const ());
  unsafe fn clone_fn(&self) -> Self::CloneFn;
  unsafe fn wake_fn(&self) -> Self::OtherFn;
  unsafe fn wake_by_ref_fn(&self) -> Self::OtherFn;
  unsafe fn drop_fn(&self) -> Self::OtherFn;
}
// RawWaker => RawWaker<VTable: WakerVTable = RawWakerVTable>
// Waker => Waker<VTable: WakerVTable = RawWakerVTable>
// Context => Context<VTable: WakerVTable = RawWakerVTable>
// Future => Future<VTable: WakerVTable = RawWakerVTable>
impl WakerVTable for RawWakerVTable {
  type CloneOutput = RawWaker;
  type CloneFn = fn(*const ()) -> Self::CloneOutput; // not sure how to have the unsafe here...
  type OtherFn = fn(*const ());
  unsafe fn clone_fn(&self) -> Self::CloneFn {unsafe {crate::mem::transmute(self.clone)}
  // ...
}

This way, it's possible to design new vtable layouts, the state machine generation can stay entirely identical, code that uses Waker and Future stays identical, but is only compatible with the current, FFI-unsafe wakers.

But that way, new code can be written to support other v-table types, such as StableWakerVTable, which could use any chosen calling conventions and repr(C).

Support for FFI-safe wakers would be incremental: executors would need to have new methods to spawn futures, which would have to either be stored separately or in an enum; hand-written futures would need to add the new generic to their impl of Future to be compatible with the FFI-safe wakers. But overall, I don't think this breaks anything, and doesn't cost anything to projects that are OK with not being FFI-safe.

p-avital · 2023-03-29T17:40:00Z

@oxalica could you weigh in on that "alternative proposal" just above? This would even let alternative vtable layout such as the stable one be provided by external crates, removing the need for core to actually have a stable vtable.

Combined with waker_getters, this would give all the flexibility required to do plenty of nice things.

I might need mentoring to implement a prototype though, because the future_trait lang item requires 0 generics, preventing me from trying to make Future<VTable> on my branch, and I have no idea how to deal with that (this being my first attempt at contributing to Rust, I'm completely new to the process, sorry)

(also, sorry for pinging you, I'm not sure if you'd rather I not in the future, do let me know)

clarfonthey · 2023-03-29T18:44:30Z

This is an extremely out-there suggestion, but perhaps there could longer-term be some mechanism for Rust to interpret certain Rust-ABI functions as actually being another ABI, and modifying compilation accordingly?

Basically, since Rust's ABI is explicitly unstable (and probably will be forever, instead relying on explicit ABI annotation for stable ABIs), we essentially have free reign to choose the ABI depending on the circumstances. So, we could have the ability to say "actually, the functions inside Waker should use the C ABI internally" so that functions without the C ABI are the ones which incur the branch penalty, and C-ABI functions are free.

Of course, this is probably not a very useful optimisation, but I figured I'd ponder here if that would be a potential solution.

p-avital · 2023-03-29T18:58:46Z

@clarfonthey Doing this automatically sounds weird to me, but I'm currently writing another, much more ambitious (to ambitious for me to stand any chance of implementing it without a lot of mentoring) RFC that's similar: what if when running rustc, you could ask it to use a given layout/ABI combination for all things that don't have an explicitly specified one? This would let people instantly make all of their library ffi-safe, at the cost of losing out on the latest optimizations. As better layouts and ABI are found, and provided the compiler team is willing to support them, this would let people use the most recently stabilized efficient layouts globally, instead of needing to mark all types down to core like they currently have to

workingjubilee · 2023-03-30T00:32:43Z

We currently have -Zrandomize-layouts, so there's some precedent for "flags that do goofy things to Rust layouts". I'm not sure a "rewrite as this other stable one" would be approved as a stable feature, but...

I'm not particularly attached to extern "C", as long as the calling convention is stable, any would do, but I'm no expert in these. Would fastcall be good? it should keep the *const () in register which, I think, is as good as we can have it. Nothing I can do about indirection with this proposal though...

extern "fastcall" is the x86 Microsoft calling convention, it has nothing to do with anything else, honestly (and thus isn't necessarily all-that well-defined on other platforms...), and it's prooobably a bit of a bug that you can use it when Windows isn't involved.

p-avital · 2023-03-30T06:22:39Z

@workingjubilee Thanks for kinda confirming my suspicion, since most of the calling conventions listed in the nomicon are Microsoft calling conventions, I was wondering which ones could really be treated as stable, and how the Microsoft ones were translated on other systems.

I see you've reacted to the Future<VTable: WakerVTable =core::task::RawWakerVTable> proposal, I actually think Future<Waker: Wake + Clone = core::task::Waker> would much simpler (and would allow some ZST shenanigans too).

There's a papercut: assuming you have a Future that's completely generic over the Waker (be it handwritten or generated by an async block), functions that return them opaquely would need an extra generic argument to support the new generic Waker:

fn returns_impl_future<Waker>() -> impl Future<W> {...}
fn returns_boxed_future<Waker>() -> Box<dyn Future<W>> {...}

Presumably, async fn could keep on returning an anonymous type that has all the impls, so this shouldn't be too much of an issue, but would this be an issue for async-traits?

Still, that seems to me like an actually much better proposal than this current one, as it actually expands the scope of how executors would be allowed to work. Should this be a new RFC?

p-avital · 2023-03-30T12:06:28Z

This is kinda off-topic for this RFC, but just throwing another idea out there: right now, the documentation states that ABI is unstable including between two runs of the same compiler on the same machine:

How far from the current truth are people who assume that despite the guarantee not being stated, two runs of the same compiler on the same machine will produce ABI-compatible binaries? Do we know what factors influence:
a. Layout generation: under what circumstances would the same struct be compiled with differing layouts?
b. Calling conventions: what influences the calling convention of fn? Argument number/size?
How feasible would it be for the compiler to summarize those factors?

The point I'm trying to reach is: would the compiler be able to provide a form of hash which, barring hash collision, would identify the way it would represent types as well as what ABI it would use for fn?

If such a hash were to exist, we could at least make the "blessed compiler" approach truly safe: for example, Rust could (provided it was asked to) add an undefined symbol _rustc_abi_hash_<hash> to all cdylibs for which the improper_cdecl_type lint triggers, and always define this symbol in executables, such that linkage would fail if there's a mismatch.

How to have Rust consistently use the same ABI is possibly a whole other can of worms, but this could allow safe usage of FFI-unsafe Rust across the FFI-boundary, even if producing artifacts that match may be complex/unfeasible depending on the version/flags used.

programmerjake · 2023-03-30T18:30:58Z

If such a hash were to exist, we could at least make the "blessed compiler" approach truly safe: for example, Rust could (provided it was asked to) add an undefined symbol _rustc_abi_hash_<hash> to all cdylibs for which the improper_cdecl_type lint triggers, and always define this symbol in executables, such that linkage would fail if there's a mismatch.

that would imho improperly limit existing cdylibs that do stuff like so:

#[derive(Debug, Default)]
pub struct MyState { // type acts like extern type to all users of this cdylib
    state: String, // something that triggers improper_cdecl_type
}

#[no_mangle]
pub extern "C" fn alloc_my_state() -> Box<MyState> {
    Box::new(MyState::default())
}

#[no_mangle]
pub extern "C" fn free_my_state(_v: Option<Box<MyState>>) {}

#[no_mangle]
pub extern "C" fn modify_my_state(this: &mut MyState, a: u32) {
    this.state += char::from_u32(a).unwrap();
}

#[no_mangle]
pub unsafe extern "C" fn debug_my_state(this: &MyState, f: unsafe extern "C" fn(*const u8, usize, *mut ()), f_state: *mut ()) {
    let s = format!("{this:?}");
    unsafe { f(s.as_ptr(), s.len(), f_state) }
}

p-avital · 2023-03-30T21:13:59Z

If such a hash were to exist, we could at least make the "blessed compiler" approach truly safe: for example, Rust could (provided it was asked to) add an undefined symbol _rustc_abi_hash_<hash> to all cdylibs for which the improper_cdecl_type lint triggers, and always define this symbol in executables, such that linkage would fail if there's a mismatch.

that would imho improperly limit existing cdylibs that do stuff like so:
#[derive(Debug, Default)]
pub struct MyState { // type acts like extern type to all users of this cdylib
    state: String, // something that triggers improper_cdecl_type
}

#[no_mangle]
pub extern "C" fn alloc_my_state() -> Box<MyState> {
    Box::new(MyState::default())
}

#[no_mangle]
pub extern "C" fn free_my_state(_v: Option<Box<MyState>>) {}

#[no_mangle]
pub extern "C" fn modify_my_state(this: &mut MyState, a: u32) {
    this.state += char::from_u32(a).unwrap();
}

#[no_mangle]
pub unsafe extern "C" fn debug_my_state(this: &MyState, f: unsafe extern "C" fn(*const u8, usize, *mut ()), f_state: *mut ()) {
    let s = format!("{this:?}");
    unsafe { f(s.as_ptr(), s.len(), f_state) }
}

Indeed, Box<MyState> would trigger the lint and thus the symbol insertion... Then just providing some way of inserting the symbol (like a macro) and/or a const in core would do the trick, but require explicit declaration on both sides, which is fine I think

workingjubilee · 2023-03-31T02:28:06Z

How far from the current truth are people who assume that despite the guarantee not being stated, two runs of the same compiler on the same machine will produce ABI-compatible binaries? Do we know what factors influence:
a. Layout generation: under what circumstances would the same struct be compiled with differing layouts?
b. Calling conventions: what influences the calling convention of fn? Argument number/size?

Optimization levels can impact effective ABI.
Sorry you had to find out this way. 😅

workingjubilee · 2023-03-31T02:41:12Z

since most of the calling conventions listed in the nomicon are Microsoft calling conventions, I was wondering which ones could really be treated as stable, and how the Microsoft ones were translated on other systems.

As far as I know, for a given architecture, specifying one of the Windows ABIs either uses the specified Windows calling convention, if there is a version of that Windows CC for the architecture, or it just... doesn't make any sense at all and may produce... unexpected results.

p-avital · 2023-03-31T07:43:36Z

Optimization levels can impact effective ABI. Sorry you had to find out this way. 😅

That's actually exactly what I expected, thanks for confirming that, imma gonna add that to our canary for our projects (that currently use the "blessed compiler" approach: the fact that our user only complain when they use a different version of the compiler leads me to think we got lucky a lot with how they set opt-level).

Any other things that spring to mind (excluding the layout randomization flag)? If not, do we know that's all of it? Because if we know compiler-version + opt-level are the only things beside explicit randomization that could cause ABI changes with the same signatures, the "perfect linkage canaries" (those that would give 100% guarantee of ABI-agreement) are actually doable today as a crate 😄

Request for mentorship

About the generic future thing: I'd love to give it a try, but the future lang-feature requires that there be exactly 0 generics on the trait. I tried changing that requirement in compiler/rustc_hir/src/lang_items.rs, but since the library is built first, that doesn't work without staging the build. I suppose the staging would look like:

Change requirement for Future to GenericRequirement::Minimum(0),
Run x.py build
Make x.py aware that I want to use the result of that build as the compiler for the next build (that's the part I need help on, I have no idea how to do that 😢)
Add the generics to Future in the library
Profit locally, look for things that the generic might break despite the default being there.
Figure out the workflow to make that happen on the main repo should an RFC on Generic Futures be approved

programmerjake · 2023-03-31T08:19:14Z

Make x.py aware that I want to use the result of that build as the compiler for the next build (that's the part I need help on, I have no idea how to do that cry)

generally what's done is by using #[cfg(bootstrap)] in the library, that is disabled when compiling the library with your modified compiler and enabled when compiling with the bootstrap compiler (probably beta)

e.g. here a new intrinsic is only used when #[cfg(not(bootstrap))] since then the compiler has the new code to understand it:
https://github.com/rust-lang/rust/blob/eb3e9c1f45981b47160543cfd882ca00e69bbfab/library/core/src/option.rs#L777-L793

workingjubilee · 2023-03-31T08:19:58Z

Any other things that spring to mind (excluding the layout randomization flag)?

...uh. Whether the function is publicly visible or not. ( I think not in the Rust sense but in the "exported from the binary" sense? ) There's an optimization pass that LLVM runs that can change the calling convention of functions that are known to not be publicly exposed.

p-avital · 2023-03-31T08:40:50Z

...uh. Whether the function is publicly visible or not. ( I think not in the Rust sense but in the "exported from the binary" sense? ) There's an optimization pass that LLVM runs that can change the calling convention of functions that are known to not be publicly exposed.

Makes sense, I wouldn't even rely on a non-publicly visible function existing in the binary (since it could also have been inlined on every call-site).

Sorry for pestering you so much. Hopefully I'll pay you back for your time by making the Rust-to-Rust-dynlink experience better by trying to regroup all of this :)

p-avital added 2 commits March 28, 2023 20:57

propose stable wakers

f93719a

fix typos

ab4e66c

p-avital mentioned this pull request Mar 28, 2023

Implement stable-wakers RFC rust-lang/rust#109706

Closed

ehuss added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Mar 29, 2023

remove API-break variant of the proposal to avoid polluting the debate

b6609e1

p-avital mentioned this pull request Mar 29, 2023

Tracking Issue for waker_getters rust-lang/rust#96992

Closed

3 tasks

update PR links

00224b0

oxalica reviewed Mar 29, 2023

View reviewed changes

Remove padding from the vtable

41b0f52

p-avital mentioned this pull request May 22, 2023

Generic Futures #3434

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ABI-stabilize `core::task::Waker` #3404

ABI-stabilize `core::task::Waker` #3404

p-avital commented Mar 28, 2023 •

edited by rustbot

Loading

workingjubilee commented Mar 28, 2023

p-avital commented Mar 28, 2023 •

edited

Loading

workingjubilee commented Mar 28, 2023

p-avital commented Mar 28, 2023

oxalica left a comment

oxalica Mar 29, 2023

p-avital Mar 29, 2023

p-avital Mar 29, 2023

oxalica Mar 29, 2023

p-avital Mar 29, 2023

oxalica Mar 29, 2023

p-avital Mar 29, 2023

p-avital Mar 29, 2023

p-avital commented Mar 29, 2023 •

edited

Loading

p-avital commented Mar 29, 2023

clarfonthey commented Mar 29, 2023

p-avital commented Mar 29, 2023

workingjubilee commented Mar 30, 2023 •

edited

Loading

p-avital commented Mar 30, 2023

p-avital commented Mar 30, 2023

programmerjake commented Mar 30, 2023

p-avital commented Mar 30, 2023

workingjubilee commented Mar 31, 2023

workingjubilee commented Mar 31, 2023

p-avital commented Mar 31, 2023 •

edited

Loading

programmerjake commented Mar 31, 2023

workingjubilee commented Mar 31, 2023 •

edited

Loading

p-avital commented Mar 31, 2023


		Futures have now become a core feature of Rust, and returning boxed futures from trait-methods has become a very common pattern when working with `async`.

		However, due to Rust's unstable default ABI, trait-objects are never FFI-safe (note that this doesn't stop users trying). Multiple crates (such as [`abi_stable`](https://crates.io/crates/abi_stable) and [`stabby`](https://crates.io/crates/stabby)) attempt to solve this problem by providing proc-macros that generate stable v-tables for the traits they annotate.

ABI-stabilize core::task::Waker #3404

Are you sure you want to change the base?

ABI-stabilize core::task::Waker #3404

Conversation

p-avital commented Mar 28, 2023 • edited by rustbot Loading

workingjubilee commented Mar 28, 2023

p-avital commented Mar 28, 2023 • edited Loading

workingjubilee commented Mar 28, 2023

p-avital commented Mar 28, 2023

oxalica left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

p-avital commented Mar 29, 2023 • edited Loading

Alternative proposal (throwing spaghetti at the wall here):

p-avital commented Mar 29, 2023

clarfonthey commented Mar 29, 2023

p-avital commented Mar 29, 2023

workingjubilee commented Mar 30, 2023 • edited Loading

p-avital commented Mar 30, 2023

p-avital commented Mar 30, 2023

programmerjake commented Mar 30, 2023

p-avital commented Mar 30, 2023

workingjubilee commented Mar 31, 2023

workingjubilee commented Mar 31, 2023

p-avital commented Mar 31, 2023 • edited Loading

Request for mentorship

programmerjake commented Mar 31, 2023

workingjubilee commented Mar 31, 2023 • edited Loading

p-avital commented Mar 31, 2023

ABI-stabilize `core::task::Waker` #3404

ABI-stabilize `core::task::Waker` #3404

p-avital commented Mar 28, 2023 •

edited by rustbot

Loading

p-avital commented Mar 28, 2023 •

edited

Loading

p-avital commented Mar 29, 2023 •

edited

Loading

workingjubilee commented Mar 30, 2023 •

edited

Loading

p-avital commented Mar 31, 2023 •

edited

Loading

workingjubilee commented Mar 31, 2023 •

edited

Loading