Stacked Borrows: do we even want protectors? #372

Open
RalfJung opened this issue Oct 29, 2022 · 4 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-dereferenceable Topic: when exactly does a reference need to point to regular dereferenceable memory? C-open-question Category: An open question that we should revisit

Comments

@RalfJung
Member

Stacked Borrows 'protectors' are a mechanism ensuring that references passed to a function (including inside ADTs, Cc #125) must outlive that function -- if they are being invalidated while the function still runs, we have immediate UB.

In my view this is by far the most surprising UB in Stacked Borrows that I don't see a good fix for. Most of the other issues, in particular around mutable references prematurely invalidating things and raw pointers being too limited in the range of memory they can access, are fixable either without impacting the basic reordering optimizations, or only impacting some of the more obscure ones (such as moving a write up across an unknown function call without there already being a write before that call).

Protectors, however, are needed for all reorderings that move accesses down across unknown function calls -- even reads:

fn foo(x: &i32) -> i32 {
  let val = *x;
  unknown();
  return val; // can we return `*x` here, and not use a register for `val`?
}

Without protectors, unknown could just invalidate x by writing to it (through another alias that has write permissions), and there'd be no UB from that. (If there is a 2nd read of x after the call to unknown, then even without protectors we can optimize assuming both reads return the same value. It is only optimizations that extend the liveness of x that need protectors.)
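To make the difference concrete, here is a minimal sketch using raw pointers, which carry neither a protector nor any aliasing guarantee, so the aliasing write is permitted. The function names and the closure standing in for `unknown` are illustrative assumptions, not part of the original example. It only shows that moving the read below the call changes the result when such a write is allowed; with `x: &i32` the same write would be UB under Stacked Borrows with protectors.

```rust
/// Reads before the call, like the original `foo`.
fn read_before_call(x: *const i32, unknown: impl FnOnce()) -> i32 {
    let val = unsafe { *x };
    unknown();
    val
}

/// Reads after the call: what `foo` would become if the read were moved down.
fn read_after_call(x: *const i32, unknown: impl FnOnce()) -> i32 {
    unknown();
    unsafe { *x }
}

/// Runs both variants with an `unknown` that writes through an alias.
fn demo_read_reordering() -> (i32, i32) {
    let mut a = 1;
    let pa: *mut i32 = &mut a;
    let before = read_before_call(pa, || unsafe { *pa = 2 });

    let mut b = 1;
    let pb: *mut i32 = &mut b;
    let after = read_after_call(pb, || unsafe { *pb = 2 });

    (before, after)
}

fn main() {
    let (before, after) = demo_read_reordering();
    // The two variants disagree, so the reordering is only valid if the
    // aliasing write is ruled out (which is what protectors do for `&i32`).
    println!("read before call: {before}, read after call: {after}");
}
```

The first variant returns 1 and the second returns 2, so a compiler may only substitute one for the other when it can assume `unknown` does not write to `x`.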

Furthermore, protectors are used to justify the dereferenceable attribute in LLVM, which indicates that the reference is dereferenceable for the entire duration of foo. LLVM has a long-standing plan of also adding support for an attribute which means that x is only dereferenceable when foo starts running, but no such attribute has landed yet -- I guess they are struggling with keeping the code quality up under that weaker assumption, but @nikic might know more. It definitely becomes a lot harder to analyze foo if unknown were allowed to deallocate x.

So as of today, it seems we are faced with a hard choice: either we have some super subtle UB, or we lose the dereferenceable attribute and make it a lot harder for the compiler to reason about references after an unknown function has been called. On the one hand, I'd like to make unsafe code authors' lives easier by not putting the burden of such subtle rules on them; on the other hand, I don't want to pessimize optimizations in all code (safe and unsafe) just to enable some really obscure, barely needed patterns.

I'm wondering what others think here, and also what evidence we might have that could help us decide one way or the other.

@RalfJung
Member Author

RalfJung commented Nov 2, 2022

@pcwalton you did a bunch of work on the LLVM side recently to improve the quality of the generated code -- do you have a feeling how much we are benefiting from the dereferenceable attribute, and how much we'd lose if we

  • removed the attribute entirely everywhere
  • replaced the attribute by a (currently still hypothetical) dereferenceable_on_entry?

@RalfJung RalfJung added C-open-question Category: An open question that we should revisit A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) labels Nov 2, 2022
@pcwalton

pcwalton commented Nov 2, 2022

We're benefiting from dereferenceable a lot. I would expect maybe an across-the-board double digit percentage slowdown if dereferenceable were removed. I don't know off the top of my head what dereferenceable_on_entry would look like; when would the argument move from dereferenceable to not dereferenceable? After a function call?

@RalfJung
Member Author

RalfJung commented Nov 2, 2022

@pcwalton thanks! When we come to actually gathering the rationale here, it'd be nice to do some actual benchmarks.

> I don't know off the top of my head what dereferenceable_on_entry would look like; when would the argument move from dereferenceable to not dereferenceable? After a function call?

The compiler could assume that the argument was dereferenceable when the function starts, but if it wants to still treat it as dereferenceable later, it needs to prove that the memory has not been deallocated since then. So any time it calls a function, it has to argue why that call cannot have freed the memory behind this reference.
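The kind of pattern such a weaker attribute would have to admit can be sketched with raw pointers, where no "dereferenceable for the whole call" guarantee exists today. This is an illustrative assumption about the shape of such code, with hypothetical names (`read_then_release`, `release`); it is not a claim about how `dereferenceable_on_entry` would actually be specified.

```rust
/// `x` is valid when the function starts, but the callee may free it.
fn read_then_release(x: *mut i32, release: impl FnOnce(*mut i32)) -> i32 {
    // Valid on entry, so this read is fine.
    let val = unsafe { *x };
    // After this call `x` may dangle. A compiler that only knows
    // "dereferenceable on entry" must not introduce loads of `*x` below
    // this point unless it can prove `release` did not deallocate.
    release(x);
    val
}

/// Allocates on the heap and lets the callee free the allocation mid-call.
fn demo_release() -> i32 {
    let p = Box::into_raw(Box::new(42));
    read_then_release(p, |q| unsafe { drop(Box::from_raw(q)) })
}

fn main() {
    println!("value read before release: {}", demo_release());
}
```

If `x` were a `&i32` carrying today's dereferenceable attribute, the compiler could move the load below `release(x)`, which would read freed memory in this scenario.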

That said, I think there might be another reason we need protectors... LLVM noalias on function arguments says that these pointers may not have aliasing accesses in this scope. However, under Stacked Borrows without protectors, the following would be legal even if both arguments alias:

fn foo(x: &mut i32, x_alias: *mut i32) {
  *x = 5;
  *x_alias = 0;
}

We'd basically say that the dynamic lifetime of x ends before the assignment to *x_alias.

So at least for the noalias part of the aliasing model, we need some sort of protector -- we do want to be able to reorder these two writes, after all. If we wanted to relax the requirement for dereferenceable we'd have to do something else... we could say we allow deallocation through x itself, i.e. "pointers that can be used for writing may also be used for deallocation" or so. I don't see any reasonable way to do this for shared references.
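As a rough illustration of why aliasing blocks this reordering, the sketch below uses two raw pointers, which may legally alias, and compares both write orders. The function names are hypothetical. With `x: &mut i32` and noalias, the compiler wants to treat the two orders as interchangeable, which is only sound if the aliasing call is ruled out.

```rust
/// Same write order as `foo` in the example above.
fn write_x_then_alias(x: *mut i32, x_alias: *mut i32) {
    unsafe {
        *x = 5;
        *x_alias = 0;
    }
}

/// The two writes swapped, as a reordering optimization would do.
fn write_alias_then_x(x: *mut i32, x_alias: *mut i32) {
    unsafe {
        *x_alias = 0;
        *x = 5;
    }
}

/// Calls both variants with both arguments pointing to the same location.
fn demo_write_reordering() -> (i32, i32) {
    let mut a = 1;
    let pa: *mut i32 = &mut a;
    write_x_then_alias(pa, pa);

    let mut b = 1;
    let pb: *mut i32 = &mut b;
    write_alias_then_x(pb, pb);

    (a, b)
}

fn main() {
    let (original_order, swapped_order) = demo_write_reordering();
    println!("original order: {original_order}, swapped order: {swapped_order}");
}
```

When the pointers alias, the original order leaves 0 in memory and the swapped order leaves 5, so the reordering is observable unless the aliasing call is declared UB.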

@RalfJung RalfJung added the A-dereferenceable Topic: when exactly does a reference need to point to regular dereferenceable memory? label Jun 6, 2023
@RalfJung
Member Author

RalfJung commented Aug 2, 2023

Also see this Zulip thread that discusses the need for protectors.
