Deprecating UnwindSafe #3260

nico-abram · 2022-05-04T05:29:38Z

This RFC proposes to deprecate UnwindSafe.

Rendered RFC
Pre-RFC discussion on internals

BurntSushi · 2022-05-04T12:45:12Z

Thanks for writing this RFC.

As it stands, I don't think this RFC has enough detail in it yet. Most of the motivation section, for example, describes things that were explicitly addressed and known about in the RFC that stabilized catch_panic and the RFC that added the UnwindSafe bound to catch_panic. In particular, the background discusses the utility of "unwind safety" beyond just memory safety:

These mitigations all address the second aspect of exception unsafety: observation of broken invariants. With the tactics in place, it ends up being the case that safe Rust code can largely ignore exception safety concerns. That being said, it does not mean that safe Rust code can always ignore exception safety issues. There are a number of methods to subvert the mitigation strategies listed above:

When poisoning data across threads, antidotes are available to access poisoned data. Namely the PoisonError type allows safe access to the poisoned information.

Single-threaded types with interior mutability, such as RefCell, allow for sharing data across stack frames such that a broken invariant could eventually be observed.

Whenever a thread panics, the destructors for its stack variables will be run as the thread unwinds. Destructors may have access to data which was also accessible lower on the stack (such as through RefCell or Rc) which has a broken invariant, and the destructor may then witness this.

But all of these "subversions" fall outside the realm of normal, idiomatic, safe Rust code, and so they all serve as a "heads up" that panic safety might be an issue. Thus, in practice, Rust programmers worry about exception safety far less than in languages with full-blown exceptions.

Despite these methods to subvert the mitigations placed by default in Rust, a key part of exception safety in Rust is that safe code can never lead to memory unsafety, regardless of whether it panics or not. Memory unsafety triggered as part of a panic can always be traced back to an unsafe block.

In order to deprecate and effectively remove the notion of protecting against exception safety in Rust, I would expect, at minimum, that this RFC provide a comprehenseive argument against it in terms of the RFC that introduced it in the first place.

And with respect to the idea that "Unwind safety is not actually related to safety," it's important to note that we were under no illusions about this when we stabilized catch_panic. See also. Now to be fair, there was a substantial argument about whether to label catch_panic as unsafe. We weren't as fully committed to "unsafe is only for UB" back then I think, or at least, I don't think it had cemented into the shared values of all project participants yet, unlike today. And indeed, catch_panic (among other things, like the "leak apocalypse") is likely one of the drivers that led to the crisper definition of unsafe that we have today.

Responding to the other part of the motivation:

It can also be problematic when a type does not implement the marker trait, but it could, notably with trait objects (See discussion in rust-lang/rust#40628). It can also be a pain point for library authors, who are not sure if they should add a bound on them for their generic types to guarantee their types are UnwindSafe, which would make their downstream users sometimes have to use AssertUnwindSafe despite not using catch_unwind just to satisfy the bounds.

As someone who has been bitten by precisely this, I agree that there is pain here. But what I don't understand is why this means we have to deprecate the notion of exception safety itself. I suppose the obvious question is: how does the pain that library authors feel from this compare with the pain of writing code that isn't exception safe and getting bitten by it? To be fair, that seems like a very hard question to answer.

glaebhoerl · 2022-05-06T13:44:26Z

One useful piece of data would be what proportion of catch_unwind calls are/aren't using AssertUnwindSafe. The places where it's not being used are where the mechanism is providing value, by making it statically apparent that there's no potential exception safety issue. The places where it is being used, it's effectively just a HereBeDragons sign -- a purpose which, in its absence, could probably be served just as well by the catch_unwind call itself. (A followup question would be what proportion of AssertUnwindSafe use sites need to be using it, or for that matter, are using it correctly, but that seems harder to determine at the level of data.)

Whatever the benefit is, it can then be weighed against the cost. If the cost is just having to use AssertUnwindSafe in some subset of catch_unwind calls, it sounds small to me, given the presumable infrequency of those calls. In that light, I'm actually slightly confused about the enmity UnwindSafe has gathered: is catch_unwind used more often than I think? Or is the main cost in other areas?

From rust-lang/rust#40628, it seems like probably the latter: apparently people who aren't using catch_unwind themselves often need to take on UnwindSafe-related hassle. Could someone give more details or link an example of the kind of situation this currently arises in? (In the afore-linked thread, one pain point mentioned is that dyn Trait<Foo + RefUnwindSafe> couldn't even be written, but trying it out in the playground just now, that seems to have been fixed.)

Another complaint from that thread is that Send and Sync ought to logically imply UnwindSafe and RefUnwindSafe (respectively), but don't. How much of the pain does this account for -- would fixing it make most of the problems go away?

Anyway, it seems like another option alongside or subsuming some of those in the RFC would be "try to fix the problems with UnwindSafe", and address the specific complaints that people have raised:

Make Send imply UnwindSafe and Sync imply RefUnwindSafe
Improve the documentation... I'm not "a docs person", but leading with a quick concrete example (in code) of how things can go wrong as motivation, a brief segue laying out the contrast against normal safe code (where exception safety isn't an issue) and unsafe code (a different beast), and then introducing the solution (and how it catches the problem in the original example), seems like it could be a better structure.
Perhaps rename it, since having "Safe" in the name seems to be a persistent source of confusion. Maybe this could be done backwards compatibly if we let the standard library use a trait alias, or with a pub use self::UnwindSafe as NewName? (I'm not sure what the mechanics of that are.)

dpc · 2022-05-12T05:21:31Z

It's funny to see this mentioned in TWIR 5 years after I wrote rust-lang/rust#40628 . Personally I don't remember having to fight with unwind safety in these few years, but since nothing changed I still think it's kind of a broken feature.

I agree that this RFC could include more content, but generally I agree with the motion. I went through the history of these few slow-paced threads again, and I don't think anyone is all that into UnwindSafe. Mostly everyone acknowledge the issues, there are some opinions on what might be able to make it better, but that's it.

It's mostly just a lint. I did it many times in my code, that I though "wouldn't it be nice to have a type system enforcing some logical invariant", introduced it, and then decided that it's not worth the the type-system trouble. Sometimes it works and is worth it, sometimes it isn't. I think here the invariant has to be deal with so rarely in an idiomatic code, that it's just not worth the trouble.

There's multiple features that I would rather see Rust language developers spend their precious time on than trying to make this one actually work well, and since a lot of people suggests that we probably want to rename it anyway because it has nothing to do with safety, I think it's easiest to deprecate it completely for now, and if we ever really want to, we can introduce it from scratch, under new name. I guess it wouldn't buy us much if anything to try to fix + rename it, skipping deprecation.

LegionMammal978 · 2022-06-20T20:40:35Z

Another complaint from that thread is that Send and Sync ought to logically imply UnwindSafe and RefUnwindSafe (respectively), but don't.

Why should this logical implication hold? As I understand it, Send and Sync mean that an object can be sent or shared across threads without any of its interfaces causing UB. Meanwhile, UnwindSafe and RefUnwindSafe mean that an object can be sent or shared across an unwind boundary without silently exposing logic errors to the outside, which is a strictly stronger condition. Since inconsistent states from unwinding are not allowed to result in UB, not all !UnwindSafe types are !Send, and not all !RefUnwindSafe types are !Sync. For instance, a non-poisoning Mutex would be Sync + !RefUnwindSafe when it holds a Send value, since unwinding can expose logic errors but does not cause UB. This perspective has apparently been expressed before in this comment on rust-lang/rust#54768, to no reply.

bjorn3 · 2022-06-20T21:06:52Z

If you share an object between threads, you are already crossing unwind boundaries as unwinding stops at thread boundaries. IMO only Sync should imply UnwindSafe, as for Send sharing requires a Mutex, which implements poisoning to cause panics to propagate by default.

LegionMammal978 · 2022-06-21T12:54:01Z

If you share an object between threads, you are already crossing unwind boundaries as unwinding stops at thread boundaries. IMO only Sync should imply UnwindSafe, as for Send sharing requires a Mutex, which implements poisoning to cause panics to propagate by default.

The issue is, currently, neither Send nor Sync requires poisoning or panic safety. For example, parking_lot::Mutex, a popular implementation of lock_api::Mutex, is !RefUnwindSafe, since it does not implement any poisoning and always allows the user to silently retrieve the value. Thus, I suspect that any enforced Send: UnwindSafe or Sync: RefUnwindSafe relationship would either hamper Send/Sync or make [Ref]UnwindSafe meaningless.

I do not understand the reasoning for why Sync → UnwindSafe should hold. The type &mut i32 is Sync, since &&mut i32 can be safely shared across threads; however, it is !UnwindSafe, since it can trivially leak data past a catch_unwind() boundary. Could you elaborate on your reasoning?

bjorn3 · 2022-06-21T14:24:44Z

For example, parking_lot::Mutex, a popular implementation of lock_api::Mutex, is !RefUnwindSafe, since it does not implement any poisoning and always allows the user to silently retrieve the value.

I don't really like that it doesn't implement poisoning. It makes it way to easy to accidentally continue execution when a thread crashed. (Deliberately continuing is fine, but only deliberately)

The type &mut i32 is Sync, since &&mut i32 can be safely shared across threads; however, it is !UnwindSafe, since it can trivially leak data past a catch_unwind() boundary. Could you elaborate on your reasoning?

&&mut T is effectively equivalent to &&T, which should implement UnwindSafe for T: RefUnwindSafe (it doesn't currently as there is no impl <T: RefUnwindSafe> RefUnwindSafe for &T right now for some reason). I think &mut T should implement RefUnwindSafe, but not UnwindSafe. Just like it implements Sync and not Send.

LegionMammal978 · 2022-06-21T22:10:37Z

&&mut T is effectively equivalent to &&T, which should implement UnwindSafe for T: RefUnwindSafe (it doesn't currently as there is no impl <T: RefUnwindSafe> RefUnwindSafe for &T right now for some reason). I think &mut T should implement RefUnwindSafe, but not UnwindSafe. Just like it implements Sync and not Send.

I think I might not have made myself clear. &mut i32 is Sync, but it is not UnwindSafe. In your earlier comment, you suggested that T: Sync should always imply T: UnwindSafe, which is contrary to this example. I don't think such an implication makes much sense.

As an exercise, I've made a table with an example of (nearly) every combination of Send, Sync, UnwindSafe, and RefUnwindSafe. Recall that std::sync types implement poisoning, but parking_lot types do not.

 Send +  Sync +  USafe +  RUSafe: i32
 Send +  Sync +  USafe + !RUSafe: parking_lot::Mutex<i32>
 Send +  Sync + !USafe +  RUSafe: &mut i32
 Send +  Sync + !USafe + !RUSafe: &parking_lot::Mutex<i32>
 Send + !Sync +  USafe +  RUSafe: sync::RwLock<Cell<i32>>
 Send + !Sync +  USafe + !RUSafe: Cell<i32>
 Send + !Sync + !USafe +  RUSafe: &mut sync::RwLock<Cell<i32>>
 Send + !Sync + !USafe + !RUSafe: &mut Cell<i32>
!Send +  Sync +  USafe +  RUSafe: sync::MutexGuard<'_, i32>
!Send +  Sync +  USafe + !RUSafe: {?}
!Send +  Sync + !USafe +  RUSafe: &mut sync::MutexGuard<'_, i32>
!Send +  Sync + !USafe + !RUSafe: parking_lot::MutexGuard<'_, i32>
!Send + !Sync +  USafe +  RUSafe: Rc<i32>
!Send + !Sync +  USafe + !RUSafe: Cell<Rc<i32>>
!Send + !Sync + !USafe +  RUSafe: &mut Rc<i32>
!Send + !Sync + !USafe + !RUSafe: &Cell<i32>

Perhaps this might help others establish the difference. Personally, I don't see much of a pattern, apart from &mut making a type !USafe but otherwise leaving it alone. Something that seems very odd to me is that sync::Mutex<T> and sync::RwLock<T> are USafe and RUSafe, even when T is !USafe. This means that data can be leaked through a sync::Mutex<&mut T> by writing to the reference, since the poisoning would not carry through to the T itself.

bjorn3 · 2022-06-22T10:53:19Z

I think I might not have made myself clear. &mut i32 is Sync, but it is not UnwindSafe. In your earlier comment, you suggested that T: Sync should always imply T: UnwindSafe, which is contrary to this example. I don't think such an implication makes much sense.

Right, Sync should imply RefUnwindSafe.

glaebhoerl · 2022-06-29T17:22:55Z

@LegionMammal978

Why should this logical implication hold? As I understand it, Send and Sync mean that an object can be sent or shared across threads without any of its interfaces causing UB. Meanwhile, UnwindSafe and RefUnwindSafe mean that an object can be sent or shared across an unwind boundary without silently exposing logic errors to the outside, which is a strictly stronger condition. Since inconsistent states from unwinding are not allowed to result in UB, not all !UnwindSafe types are !Send, and not all !RefUnwindSafe types are !Sync. For instance, a non-poisoning Mutex would be Sync + !RefUnwindSafe when it holds a Send value, since unwinding can expose logic errors but does not cause UB. This perspective has apparently been expressed before in rust-lang/rust#54768 (comment) on rust-lang/rust#54768, to no reply.

Thank you for raising this point so clearly! Amusingly, when I went back just now to look for the old discussions about making the Atomic types implement RefUnwindSafe, which I recall as having been edifying, I found someone else making more or less the same point you are now, and that someone was me. I have no recollection of this.

There are two different kinds of justifications for why a type should be (Ref)UnwindSafe, which come from something like opposite directions. One of them is the straightforward and intuitive one: that the type doesn't let you observe broken invariants across a catch_unwind() boundary. And indeed, there are many Send/Sync types which don't also satisfy this criterium.

The less intuitive one comes from how the types are used, with the observation that using a type in a concurrency-safe way is harder than using it in an unwind-safe¹ way, and any code that's concurrency-safe is therefore inherently unwind-safe as well. The Atomic types are a particularly vivid demonstration. What kind of code wants to use Atomic types directly? Subtle and complicated lock-free algorithms. Why are those algorithms subtle and complicated? Because any thread might read or modify the Atomic variable at any point, so to maintain consistency, all of its associated invariants have to hold 100% of the time, not temporarily relaxed even for a nanosecond. Now, if your invariants hold literally 100% of the time, then that includes all the times when the code might panic and be observed by a catch_unwind().

But I think the same idea holds true for Sync types more generally. The problem you're trying to avoid with regards to both concurrency-safety and unwind-safety is that you relax an invariant temporarily, and then, before you get to restore it, some other code observes it. With concurrency, the other code is running at the same actual time, or perhaps the scheduler interrupts your code to let a different thread run. With unwinding, a panic interrupts your code and lets the catch_unwind() handler run. It's basically a one-time context switch that can only happen at specific points (not necessarily self-evident ones, however).

So in any case where there's an opportunity for catch_unwind() to observe your broken invariant, there's also an opportunity for a concurrently-running thread to observe it. Consider the example of the non-poisoning Mutex. The point here is precisely that if some code panics in the middle of a critical section, while an invariant is temporarily relaxed, then that code is buggy even if it doesn't use catch_unwind() anywhere! Because another thread could also observe that inconsistent state.

From this perspective, the purpose of UnwindSafe is to offer some protection in situations where catch_unwind() is what makes the difference with respect to whether or not there is a bug. In the absence of catch_unwind(), code that panics in the middle of shuffling things around between Cells is completely fine.² Doing the same with Atomics is already not fine.

Now, all this relies on the assumption that concurrency is the only reason why anyone would ever use a Sync shared-mutable type like Mutex or Atomics, and that if concurrency weren't involved, they'd use some corresponding non-Sync type like RefCell instead. This is admittedly not a logically airtight assumption, and not one to be staking the soundness of the standard library on. But since UnwindSafe isn't enforcing a soundness-related invariant, and is making its best effort to catch other bugs, then in a spirit of pragmatism, it seems like a sensible assumption to make use of.

(One particular failure mode³ one might be concerned about is that previously, when such types weren't UnwindSafe, the only reason to use them was for concurrent code; but once they are, maybe someone might use them just to satisfy an UnwindSafe constraint, without concurrency actually being involved anywhere. While this scenario seems more amusing than likely, I think there's a way to logically reconcile it nonetheless: we should conceive of types like Mutex and AtomicPtr as holding an AssertUnwindSafe internally. The goal is for there to be some marker of intent, and an otherwise-gratuitous use of a Sync type is such a marker.)

@dpc

I did it many times in my code, that I though "wouldn't it be nice to have a type system enforcing some logical invariant", introduced it, and then decided that it's not worth the the type-system trouble. Sometimes it works and is worth it, sometimes it isn't. I think here the invariant has to be deal with so rarely in an idiomatic code, that it's just not worth the trouble.

There's multiple features that I would rather see Rust language developers spend their precious time on than trying to make this one actually work well

This is very reasonable. I don't think the tradeoffs are super clear right now w.r.t. how much effort it would take to improve UnwindSafe⁴, how much pain it's causing, or how much value it's providing (prior comments raised these questions, and apart from yours, haven't attracted much response). But if the relevant teams came out tomorrow and said that in their best estimation, it doesn't pencil out, and they don't feel like it's worth their effort, then that would be a solid motivation for deprecation. But if that's in fact the case, then I think it's what an RFC should be about, rather than focus on object-level technical issues.

I think it's easiest to deprecate it completely for now, and if we ever really want to, we can introduce it from scratch, under new name

Removing the UnwindSafe constraint from catch_unwind() would be backwards compatible; adding a new FunwindSafer constraint afterwards would not be. (Features like trait aliases or pub use may or may not be adequate for renaming it in-place in a backwards compatible way. If they're not, then the renaming issue is moot.)

Panic-safe, cancel-safe, kill-safe, abort-safe... this property has many names. ↩
I mean, it'll exit, but it won't go haywire. ↩
This has got to be a named logical fallacy or similar, but I'm drawing a blank. ↩
For the Send/Sync thing in particular, the sticking point was coherence. Since these are all marker traits, however, overlap ought to be fine, and in fact at one point, there was an effort to relax the rules for marker traits. That ran aground on some subtle issue where it turned out to not be fine after all. As best as I can remember, the realization they ended up coming to was that while it's fine to relax the overlap restrictions for marker traits, it's not fine to relax the orphan restrictions (due to issues w.r.t. negative reasoning). I'm not sure which of those is the relevant one for this situation. But it's at least conceivable that the given restriction could be lifted in a second push, which would be beneficial on its own terms, and then the UnwindSafe impls we want could go through without a problem. Only lang/compiler folks could say for sure. ↩

LegionMammal978 · 2022-07-01T13:51:08Z

So in any case where there's an opportunity for catch_unwind() to observe your broken invariant, there's also an opportunity for a concurrently-running thread to observe it. Consider the example of the non-poisoning Mutex. The point here is precisely that if some code panics in the middle of a critical section, while an invariant is temporarily relaxed, then that code is buggy even if it doesn't use catch_unwind() anywhere! Because another thread could also observe that inconsistent state.

This reasoning works for atomics, but I don't think it follows for Mutexes. The whole point of the type is that threads' critical sections are mutually exclusive. So if a thread acquires a lock, relaxes an invariant, then restores the invariant before releasing the lock, nothing else in the program will be able to observe the relaxed invariant. In other words, every thread knows that once it acquires a lock, the value within will always satisfy the invariant.

But panics potentially break this last assumption: if a thread panics, it can release its lock before it can restore the invariant, and the next thread to acquire a lock will observe the intermediate state. Thus, we implement poisoning, so that the next thread knows that a panic must have happened in a critical section and that the value must be handled with caution. The alternative is a control-flow explosion, in which every thread to acquire the lock must assume that some other thread may have just panicked.

One particular failure mode one might be concerned about is that previously, when such types weren't UnwindSafe, the only reason to use them was for concurrent code; but once they are, maybe someone might use them just to satisfy an UnwindSafe constraint, without concurrency actually being involved anywhere.

I have recently observed this while reviewing a library that allows users to write Rust plugins for a certain single-threaded C application. Each plugin has some number of hooks that can be triggered by specific events; the hooks communicate via a captured environment. However, what should happen if one hook panics? There's a few obvious possibilities:

Print a message and abort the program. The library author deemed this undesirable under any circumstances.
Print a message and unregister the panicked plugin. The application's plugin API did not permit this.
Print a message and unregister all hooks. The library author rejected this for (IMO) poorly-explained reasons.
Print a message and replace all hooks with error stubs. The library author similarly rejected this.

Thus, the author decided to keep all hooks running as before, including the hook that panicked. However, to avoid the control-flow explosion, they required the hook closures to be UnwindSafe, and recommended that shared state be placed in a Mutex for its poisoning. If there were such thing as a poisoning RefCell, that would probably be used for the state instead.

glaebhoerl · 2022-07-01T15:27:08Z

This reasoning works for atomics, but I don't think it follows for Mutexes.

I'm not sure if you were just pointing out a technical detail, or if you intended this as a broader counterargument...?

Indeed, while the mutex is locked, its state is protected from other threads, and this critical section can be interrupted by a panic. At that point there are two possibilities. Either it doesn't permit the state to be accessed again (whether by poisoning¹, or just keeping it locked), or it does (by unlocking). If it doesn't, the first kind of justification for unwind safety applies -- there's no way to observe a violated invariant. If it does, the second kind does -- now that it's unlocked, another thread can access the state, and this is a bug whether or not catch_unwind() is being used as well.

If there were such thing as a poisoning RefCell, that would probably be used for the state instead.

That's an interesting idea, and I guess it makes sense that this use case would come up.

Poisoning can be circumvented explicitly, but that's the same kind of "I'm aware of the implications of what I'm doing" marker as AssertUnwindSafe would be. ↩

LegionMammal978 · 2022-07-01T21:38:58Z

I'm not sure if you were just pointing out a technical detail, or if you intended this as a broader counterargument...?

My apologies; I hadn't seen that you were specifically writing about "the non-poisoning Mutex" in that section. As I understand it, it comes down to the fact that thread::spawn() requires the closure to be Send, but not UnwindSafe, so any cross-thread Sync interaction introduces the same risk of observing broken invariants, and the bound on catch_unwind() doesn't help us at all.

However, there would still seem to be a valuable distinction between poisoning and non-poisoning types. Poisoning types follow our "correct by default" philosophy, since most users aren't expected to carefully track the state of the program following an unwinding panic, and putting up a PoisonError roadblock correctly informs them of the risk. But non-poisoning types also have their niche, having less overhead and being somewhat simpler to reason about in applications where panics don't occur or can't disrupt important invariants. (Personally, I suspect users would worry much less about non-poisoning types if .lock().unwrap() were any shorter.) Currently, we use UnwindSafe to distinguish these, but it would be relegated to documentation alone in a "Send/Sync implies [Ref]UnwindSafe" world. Would we just have to accept that tradeoff?

Kixunil · 2022-08-11T21:35:02Z

It seems to me that part of the problem may be missing tools to prove the code is actually unwind safe.
For instance Cell<T> is missing RefUnwindSafe impl. The type prevents getting a reference to the interior so it's impossible to invalidate its invariants.

Further, adding these new types could help:

/// Implements poisoning API on top of `&mut` - no shared references involved
pub struct PoisoningCell<T> {
    value: T,
    is_poisoned: bool,
}

impl<T> PoisoningCell<T> {
    pub fn new(value: T) -> Self {
        PoisoningCell {
            value,
            is_poisoned: false,
        }
    }
    pub fn set(&mut self, value: T) {
        self.value = value;
        // we've just overwrote the value with something valid
        self.is_poisoned = false;
    }
    pub fn replace(&mut self, value: T) -> T { self.ck_poison(); mem::replace(&mut self.value, value) }
    pub fn get(&self) -> T where T: Copy { self.ck_poison(); self.value }
    pub fn get_ref(&self) -> &T where T: RefUnwindSafe { self.ck_poison(); &self.value }
    pub fn get_mut(&mut self) -> Guard<'_, T> { self.ck_poison(); Guard(self) }
    pub fn into_inner(self) -> T { self.ck_poison(); self.value }
    fn ck_poison(&self) { assert!(!self.is_poisoned, "attempted access to poisoned cell"); }
}


// Consumer can only replace the values or access them with possible poisoning, so these are OK
impl<T> RefUnwindSafe for RefCell<PoisoningCell<T>> {}
impl<T> UnwindSafe for &'_ mut PoisoningCell<T> {}

// impl Deref{Mut} like other guards
pub struct Guard<'a, T>(&'a mut PoisoningCell);

/// Doesn't allow mutable references to T to exist but allows overwriting with new value
pub struct DummyCell<T>(T);

impl<T> DummyCell<T> {
    pub fn new(value: T) -> Self { DummyCell(value) }
    pub fn set(&mut self, value: T) { self.0 = value; }
    pub fn replace(&mut self, value: T) -> T { mem::replace(&mut self.0, value) }
    pub fn get(&self) -> T where T: Copy { self.0 }
    pub fn get_ref(&self) -> &T where T: RefUnwindSafe { &self.0 }
    pub fn into_inner(self) -> T { self.0 }
}

impl<T> RefUnwindSafe for DummyCell<T> {}
// Consumer can only replace the values or read them, so this is OK
impl<T> UnwindSafe for &'_ mut DummyCell<T> {}

glaebhoerl · 2022-09-11T16:36:57Z

(Sorry for the lag, I had a hard time wrapping my head around the different perspective and got diverted while trying.)

As I understand it, it comes down to the fact that thread::spawn() requires the closure to be Send, but not UnwindSafe, so any cross-thread Sync interaction introduces the same risk of observing broken invariants, and the bound on catch_unwind() doesn't help us at all.

I guess you could look at it this way.

If thread::spawn() did require UnwindSafe, that'd result in there being no use for types which were Send/Sync but not [Ref]UnwindSafe, even if they were technically possible; so I think you were just making an observation, rather than wishing this were the case?

But non-poisoning types also have their niche, having less overhead and being somewhat simpler to reason about in applications where panics don't occur or can't disrupt important invariants. (...) Currently, we use UnwindSafe to distinguish these, but it would be relegated to documentation alone in a "Send/Sync implies [Ref]UnwindSafe" world. Would we just have to accept that tradeoff?

By using a non-poisoning mutex, someone declares "I take upon myself the burden of ensuring there are no panics [at problematic points] in my critical sections". I agree it would be nice if there were some more upfront indication that this is what's happening (as with unsafe blocks, or AssertUnwindSafe). But since catch_unwind() doesn't even enter into that picture, the presence or absence of a RefUnwindSafe impl doesn't do a good job of serving as that indication -- it essentially is merely documentation, unless the author, by coincidence, happens to also employ a catch_unwind() involving the same mutex somewhere. So yeah, I think we'd have to "accept the tradeoff"... except that the thing we'd be giving up is something we never really had.

As for what *else* could be done about the situation...

Effect typing for panics might've come in useful here, but we don't have that. Perhaps the MutexGuard for such types could abort on panic... which is even more extreme than poisoning (although does avoid the performance hit, which I believe was part of the motivation). Or they could simply keep the Mutex locked, which again accomplishes basically the same thing as poisoning, with the tradeoff being a worse debugging experience. Perhaps the MutexGuard could have an .unlock_on_panic() flag you'd set if you've verified that any potential panics are benign ones, or there could be a separate flavor of MutexGuard type with this behavior, in either case providing thereby the upfront indication we were wishing for above. Or you could use catch_unwind() and do the unlocking yourself, I suppose.

Or maybe the issue in some cases is that the type inside the Mutex doesn't have any invariants? In which case any panic is fine. (This scenario is plausible, since some kind of synchronization is necessary just to avoid data races (satisfy Sync), even if there are no invariants to preserve.) I'm not sure what a sensible and systematic way to indicate this assumption might be.

glaebhoerl · 2022-09-11T16:48:28Z

For instance Cell<T> is missing RefUnwindSafe impl. The type prevents getting a reference to the interior so it's impossible to invalidate its invariants.

Technically true as long as you only have a single Cell anywhere. Once you have multiple Cells, there can be invariants between them. (Which is exactly the kind of thing that's perfectly fine for single-threaded code as long as catch_unwind() isn't involved, and why we're motivated to track unwind safety when it is.)

Kixunil · 2022-09-12T12:02:30Z

@glaebhoerl same applies to Atomic* which already have RefUnwindSafe.

bjorn3 · 2022-09-12T12:14:13Z

Cell and Atomic* don't have any internal invariants that unwinding could violate. If there are external invariants, then the types that contain them which do have such invariants should not implement UnwindSafe.

Kixunil · 2022-09-12T12:30:15Z

@bjorn3 indeed. There's a footgun that people may forget about it and also the possibility. I'm not sure if it's possible to improve that but maybe some way to mark the invariants would be helpful.

glaebhoerl · 2022-09-12T20:00:58Z

@glaebhoerl same applies to Atomic* which already have RefUnwindSafe.

Yes, because the use case for that is concurrency! I wrote an entire ~~novel~~ comment about this earlier!

WiktorPrzetacznik · 2023-10-27T21:10:39Z

I feel like the only goal of unwind safety (being an useful lint) hasn't really been achieved, because there's no consistency in implementing UnwindSafe and RefUnwindSafe even in std (!). Scanning through whole std and fixing it seems like a lot of effort for little gains. Also, contributors have to keep in mind that any future code modifications may invalidate type's unwind safety.

In my opinion, there's no place for half-baked features like this one in std.

nikomatsakis · 2023-10-28T11:22:43Z

As one of the original people who argued strongly in favor of UnwindSafe, I agree it has not been successful in its goals. I think it's a powerful tool that could in principle help detect bugs, but in practice I think it's not really used consistently enough nor well understood enough.

I'm not 100% sure what the takeaware is here on my part -- I still believe strongly that unwind safety is subtle and hard to get right, and I believe in a std that makes efforts to be footgun-proof against these sorts of failures, but many of the attempts to do so seem to not quite hit the mark (I'm still a fan of lock poisoning, but I know it's unpopular; my personal belief is most code is still broken in the face of panics while a lock is held, and the default API should simply have panicked implicitly rather than returning a Result, and then it would be far more popular).

ssokolow · 2023-10-28T11:40:28Z

(I'm still a fan of lock poisoning, but I know it's unpopular; my personal belief is most code is still broken in the face of panics while a lock is held, and the default API should simply have panicked implicitly rather than returning a Result, and then it would be far more popular).

I disagree, but then I'm one of those people who wraps every unit of work in catch_unwind, was willing to spawn "unnecessary" threads in the days before catch_unwind, sees uncaught panics as something not that far below non-FFI use of unsafe (without miri CI and a giant benchmark suite to justify it) as a code smell and constantly laments that Rustig! and findpanics are both unmaintained while #[no_panic] is a linker hack with too much emphasis on the "hack".

nikomatsakis · 2023-10-29T13:15:25Z

I agree we should provide better tools for catching uncaught panics: but note that, if we do that, and you never have a panic while holding a lock, then poisoning is not something you need to think about. In other words I would rather focus on how we catch the initial panic. Anyway, (mostly) off topic -- but the point stands that having effective mechanisms for preventing panics (and being correct in the face of them) is important. To that end I still believe UnwindSafe was attacking a real problem, but I also see that it seems to be ineffective. I am not entirely sure why that is, as I said.

…

On Sat, Oct 28, 2023, at 7:40 AM, Stephan Sokolow wrote: > (I'm still a fan of lock poisoning, but I know it's unpopular; my personal belief is most code is still broken in the face of panics while a lock is held, and the default API should simply have panicked implicitly rather than returning a Result, and then it would be far more popular). > I disagree, but then I'm one of those people who wraps every unit of work in `catch_unwind`, was willing to spawn "unnecessary" threads in the days before `catch_unwind`, sees uncaught panics as something not *that* far below non-FFI use of `unsafe` without miri CI and a giant benchmark suite to justify it as a code smell and constantly laments that Rustig! and findpanics are both unmaintained while `#[no_panic]` is a linker hack with too much emphasis on the "hack". — Reply to this email directly, view it on GitHub <#3260 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABF4ZQW4TZMCKKWHH2BQALYBTVLPAVCNFSM5VA3WZHKU5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TCNZYGM3TQNRUHAYA>. You are receiving this because you commented.Message ID: ***@***.***>

ssokolow · 2023-10-29T13:20:58Z

True. I'm just not sure how radical a change it would have involved to developer ergonomics to make the benefit of poisoning negligible. There is, after all, a reason that Rust doesn't require every API which may allocate to return a Result.

Panicking, poisoning, catch_unwind around units of work, and tools like Rustig! or findpanics are Defense in Depth against broken behaviour slipping into a release due to a developer mistake.

NobodyXu · 2023-10-29T21:30:22Z

Maybe it would be reasonable to introduce noexcept?

bjorn3 · 2023-10-29T23:46:54Z

That would mean you can't do any integrity assertions in such functions, which may well be even worse than requiring catch_unwind at a safe place where you can recover from violations of invariants. And it would have all the same composibility issues as any other effect system we have in rust (const fn, async fn, ...)

glaebhoerl · 2023-11-04T18:21:59Z

@nikomatsakis Do you happen to have any thoughts w.r.t. my earlier comment where I tried to figure out what actual practical problems people are having with UnwindSafe, and whether they could be addressed?

(Also maybe this footnote on a subsequent comment, w.r.t. what it would take for the Send/Sync->Ref/UnwindSafe blanket impls to go through.)

rust-lang/rust#40628, rust-lang/rust#65717 and rust-lang/rfcs#3260 all show that unwind safety isn't particularly ergonomic to use and implement, and ultimately leads to people slapping `AssertUnwindSafe` everywhere until the compiler stops complaining. This situation has led to built-in test framework using `catch_unwind(AssertUnwindSafe(...))` (see https://github.com/rust-lang/rust/blob/1.73.0/library/test/src/lib.rs#L649) and libraries like tower-http doing the same (see https://docs.rs/tower-http/0.4.4/src/tower_http/catch_panic.rs.html#198). As people have mentioned in the threads above, trying to implement this correctly is akin to fighting windmills at the moment. Since the above cases demonstrated that `catch_unwind(AssertUnwindSafe(...))` is currently the easiest way to deal with this situation, this commit does the same and refactors our background job runner code accordingly.

Our invariants are so complicated and need such special handling that once they are take care of, `UnwindSafe` adds nothing. The particular example that motivated this: `&mut T` isn't `UnwindSafe` because it's *easy* to witness a broken invariant, not that you will. See rust-lang/rfcs#3260 for further rationale.

kornelski · 2024-10-17T16:14:52Z

Problems I have with UnwindSafe:

Unlike unsafe code, this isn't explicitly marked as !UnwindSafe by the author. The unwind-unsafety state is guessed by the compiler based on general heuristics of using certain suspicious types.

As a result, when finding that some code is not satisfying the UnwindSafe bound, I can't tell whether that's on purpose, and the code is actually not reliable in presence of panics, or whether it's just an oversight, and the compiler is overreacting. In practice, it always turns out to be the latter, so UnwindSafe is hard to take seriously.
A risk of having an application-specific logic bug due to specifically !UnwindSafe types doesn't feel substantially different from a risk having any other application-specific bug due to other unexpected errors. Rust seems to over-emphasise one specific set of edge-case conditions. There are UnwindSafe types like Atomic that could be easily left in a bad state. There's also out-of-Rust state, like open files or TCP connections that could be left in a broken state too. That's why Rust libraries use a lot of Drop guards, because state cleanup calls can be easily forgotten, even without unwinding.
The notion of "logical unsafety" is too vague. It's not a language requirement, but an entirely speculative "maybe the program will do something safe-but-wrong, but who knows?". When the problems and consequences are so nebulous, it's hard to even talk about this feature. This feature doesn't seem to be actually used by anyone, and there are no prominent examples in the ecosystem to use as a reference, that I'm aware of. This makes it too hard define a policy for using this feature within a project/company — other than YOLO it with AssertUnwindSafe.
I can't come up with a case where I'd want to have code that is not UnwindSafe. If I'm aware that some code can become buggy beucause of a panic, I want to fix it anyway. The existing best practices and solutions required by Rust to make code safe are basically the same as for making code UnwindSafe, so typically fixing one is fixing both. I don't want to fix code just barely enough not to cause UB, but still leave it broken enough to be dangerous, especially knowing that UnwindSafe is not taken seriously and is easily bypassed.

LegionMammal978 · 2024-10-17T23:46:58Z

@kornelski How I'd consider it, is that UnwindSafe is somewhat of a misnomer if one takes it to mean "values of this type are safe to unwind": obviously every value ought to be sound on unwind, at the very least. Really, what it means is, "either values of this type allow the modification of data outside of the usual ownership tree, or the type and everything depending on it is designed so that any such modification is atomic w.r.t. panics". The idea is, if you use only naturally-UnwindSafe types across a catch_unwind() boundary, then all side effects of your code will be encapsulated within values that it owns. Then, if it unwinds at any point, then all those owned values will all be destroyed, and you'll be left with a clean slate on the outside of the catch_unwind(). The archetypical example being that a server might want to isolate different requests from each others' panics: when a request panics, all of that request's owned resources are cleaned up, and the server can continue serving other requests with no harm done. To address your particular points:

That's why I prefer my reframing of the trait in terms of the ownership tree. It's not a general heuristic of suspiciousness so much as the meaning of an UnwindSafe type, even if its documentation meanders around the point. This also gives a straightforward way to evaluate whether a type should be UnwindSafe: if I use this value to modify anything, then will the modification be wiped clean (or at least be safely "committed" in some way) once I drop the value? For typical data types (String, Vec, Option, ...) this is the case, but it is usually not the case for types granting mutable access to other data (&mut T, &Cell<T>, ...).
UnwindSafe really depends on the existence of Drop guards for safely cleaning up owned objects, rather than trying to be a substitute for Drop guards. For Atomic types, the idea (as I understand it) is that any multi-step operation you'd try to do that a panic could break would generally already be broken in the presence of multiple threads. Except maybe for custom mutexes, but that's covered by UnsafeCell being !RefUnwindSafe. For out-of-Rust data like files and sockets, the idea is that any breakage caused by unexpected panics would also be caused by "unplugging the computer in the middle of the operation", so a carefully-designed upper protocol should already be tolerant of that (with idempotence, transactions, etc.).
Yeah, I don't deny that the documentation and usability of UnwindSafe aren't very great in practice, and that a blanket AssertUnwindSafe is far simpler. If I had to define a policy, I might say, "Consider what the scope and aim of this catch_unwind() block are. Then, consider any side effects of the code that aren't yet cleaned up once it's unwound. Address those side effects, then apply UnwindSafe to the relevant data." In particular, for any catch_unwind() block where the panic is immediately re-raised (e.g., on an FFI boundary), a blanket AssertUnwindSafe is 100% correct, since that part of the program is being taken down by an uncaught panic in any case.
It's more about generic types like &mut T that might allow such buginess in some use cases: any possible &mut WhateverType accessed with a TOCTOU gap (e.g., creating a HashMap entry with a dummy value, then subsequently filling it with some calculated data) might cause such a bug. If you know that your particular type is airtight when accessed across an unwind boundary, you can slap an UnwindSafe/RefUnwindSafe on it with no problem. Again, it's not about the sorts of errors that a drop guard is supposed to fix, but about the errors from partial operations across a catch_unwind() boundary.

I'm not really opposed to this PR, since UnwindSafe as implemented is very impractical, but I think there's some merit to the underlying concepts.

nico-abram added 2 commits May 4, 2022 02:23

Deprecating UnwindSafe

a9d07d1

Update PR number

e10ce23

ehuss added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label May 4, 2022

zachs18 mentioned this pull request Nov 13, 2022

Tracking Issue for Exclusive rust-lang/rust#98407

Open

5 tasks

Turbo87 mentioned this pull request Nov 6, 2023

worker/swirl/runner: Simplify AssertUnwindSafe usage rust-lang/crates.io#7453

Merged

fzyzcjy mentioned this pull request Jan 1, 2024

Discussions about UnwindSafe/RefUnwindSafe fzyzcjy/flutter_rust_bridge#1543

Closed

Amanieu mentioned this pull request May 22, 2024

Wrap Context.ext in AssertUnwindSafe rust-lang/rust#125392

Merged

ok-nick mentioned this pull request Jun 3, 2024

Implement Debug w/ detailed manifest for Reader contentauth/c2pa-rs#473

Merged

3 tasks

workingjubilee mentioned this pull request Jun 6, 2024

Remove UnwindSafe bound from pgrx_extern_c_guard pgcentralfoundation/pgrx#1732

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecating UnwindSafe #3260

Deprecating UnwindSafe #3260

nico-abram commented May 4, 2022 •

edited

Loading

BurntSushi commented May 4, 2022 •

edited

Loading

glaebhoerl commented May 6, 2022

dpc commented May 12, 2022 •

edited

Loading

LegionMammal978 commented Jun 20, 2022 •

edited

Loading

bjorn3 commented Jun 20, 2022

LegionMammal978 commented Jun 21, 2022 •

edited

Loading

bjorn3 commented Jun 21, 2022 •

edited

Loading

LegionMammal978 commented Jun 21, 2022 •

edited

Loading

bjorn3 commented Jun 22, 2022

glaebhoerl commented Jun 29, 2022

LegionMammal978 commented Jul 1, 2022

glaebhoerl commented Jul 1, 2022

LegionMammal978 commented Jul 1, 2022

Kixunil commented Aug 11, 2022

glaebhoerl commented Sep 11, 2022

glaebhoerl commented Sep 11, 2022

Kixunil commented Sep 12, 2022

bjorn3 commented Sep 12, 2022

Kixunil commented Sep 12, 2022

glaebhoerl commented Sep 12, 2022

WiktorPrzetacznik commented Oct 27, 2023

nikomatsakis commented Oct 28, 2023

ssokolow commented Oct 28, 2023 •

edited

Loading

nikomatsakis commented Oct 29, 2023 via email

ssokolow commented Oct 29, 2023 •

edited

Loading

NobodyXu commented Oct 29, 2023

bjorn3 commented Oct 29, 2023

glaebhoerl commented Nov 4, 2023

kornelski commented Oct 17, 2024

LegionMammal978 commented Oct 17, 2024 •

edited

Loading

Deprecating UnwindSafe #3260

Are you sure you want to change the base?

Deprecating UnwindSafe #3260

Conversation

nico-abram commented May 4, 2022 • edited Loading

BurntSushi commented May 4, 2022 • edited Loading

glaebhoerl commented May 6, 2022

dpc commented May 12, 2022 • edited Loading

LegionMammal978 commented Jun 20, 2022 • edited Loading

bjorn3 commented Jun 20, 2022

LegionMammal978 commented Jun 21, 2022 • edited Loading

bjorn3 commented Jun 21, 2022 • edited Loading

LegionMammal978 commented Jun 21, 2022 • edited Loading

bjorn3 commented Jun 22, 2022

glaebhoerl commented Jun 29, 2022

Footnotes

LegionMammal978 commented Jul 1, 2022

glaebhoerl commented Jul 1, 2022

Footnotes

LegionMammal978 commented Jul 1, 2022

Kixunil commented Aug 11, 2022

glaebhoerl commented Sep 11, 2022

glaebhoerl commented Sep 11, 2022

Kixunil commented Sep 12, 2022

bjorn3 commented Sep 12, 2022

Kixunil commented Sep 12, 2022

glaebhoerl commented Sep 12, 2022

WiktorPrzetacznik commented Oct 27, 2023

nikomatsakis commented Oct 28, 2023

ssokolow commented Oct 28, 2023 • edited Loading

nikomatsakis commented Oct 29, 2023 via email

ssokolow commented Oct 29, 2023 • edited Loading

NobodyXu commented Oct 29, 2023

bjorn3 commented Oct 29, 2023

glaebhoerl commented Nov 4, 2023

kornelski commented Oct 17, 2024

LegionMammal978 commented Oct 17, 2024 • edited Loading

nico-abram commented May 4, 2022 •

edited

Loading

BurntSushi commented May 4, 2022 •

edited

Loading

dpc commented May 12, 2022 •

edited

Loading

LegionMammal978 commented Jun 20, 2022 •

edited

Loading

LegionMammal978 commented Jun 21, 2022 •

edited

Loading

bjorn3 commented Jun 21, 2022 •

edited

Loading

LegionMammal978 commented Jun 21, 2022 •

edited

Loading

ssokolow commented Oct 28, 2023 •

edited

Loading

ssokolow commented Oct 29, 2023 •

edited

Loading

LegionMammal978 commented Oct 17, 2024 •

edited

Loading