-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Single-entry / multiple-exit regions for borrows #396
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,307 @@ | ||
- Start Date: 2014-10-12 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
|
||
The regions used for lifetime inference and borrow checking should be general | ||
single-entry / multiple-exit regions, instead of being limited to the current | ||
situation of lexical scopes, or even more general single-entry / single-exit | ||
regions. | ||
|
||
# Motivation | ||
|
||
There are three classes of program fragments that are currently rejected by the | ||
borrow checker that really seem like they should be accepted. | ||
|
||
1. The borrow checker ignores liveness of borrows, so that a borrow is valid for | ||
the entirety of a lexical scope, even after its last use. This even occurs in | ||
straight-line code For example, a program like this is rejected: | ||
|
||
```rust | ||
let mut i = 0i; | ||
let p = &mut i; | ||
*p = 1; | ||
let q = &mut i; | ||
*q = 2; | ||
``` | ||
|
||
The borrow checker should be able to realize that each of these borrows is no | ||
longer valid at the point of the next borrow. The workaround for this is to give | ||
each borrow its own lexical scope, e.g. | ||
|
||
```rust | ||
let mut i = 0i; | ||
{ | ||
let p = &mut i; | ||
*p = 1; | ||
} | ||
{ | ||
let q = &mut i; | ||
*q = 2; | ||
} | ||
``` | ||
|
||
This is unnecesarily verbose and difficult to explain to new programmers. It | ||
also doesn't handle all cases involving mutable borrows of multiple locations. | ||
|
||
2. The borrow checker can't refine borrow information based on the branches of | ||
a pattern match. The canonical example is a 'find-or-insert' pattern: | ||
|
||
```rust | ||
match map.find(&key) { | ||
Some(...) => { ... } | ||
None => { | ||
map.insert(key, new_value); | ||
} | ||
} | ||
``` | ||
|
||
In an `Option<&T>` the `&T` only appears in the the `Some` constructor, so the | ||
borrow shouldn't need to be considered in the `None` branch. The workaround for | ||
this is usually to do a clone/copy and two separate lookups'. This is more | ||
verbose, doesn't work in all cases, and is potentially a large performance | ||
penalty. | ||
|
||
3. Borrows in rvalues of the same overall expression often overlap when it | ||
doesn't seem that they have to. For example, if `a` has methods `foo`, `bar` | ||
and `baz` that all take `&mut self` parameters, but the latter two have no | ||
lifetimes in output position, then an expression such as | ||
|
||
```rust | ||
a.foo(a.bar(), a.baz()) | ||
``` | ||
|
||
seems like it should be valid. The workaround is to instead write | ||
|
||
```rust | ||
let bar = a.bar(); | ||
let baz = a.baz(); | ||
a.foo(bar, baz); | ||
``` | ||
|
||
All this is doing is making the evaluation of temporaries explicit, so it | ||
shouldn't affect whether the program is accepted. | ||
|
||
While all of these errors arise in the borrow checker, they are actually caused | ||
by a deficiency in Rust's lifetime system. Currently, Rust models lifetimes as | ||
lexical scopes, and borrows must last for a specific lifetime. | ||
|
||
The way to fix these problems is to refine the notion of lifetime. The obvious | ||
first step would be to generalize from lexical scopes to arbitrary single-entry | ||
/ single-exit regions, but this would only fix the first and third of the | ||
problems mentioned above. To fix the second problem, which is arguably the most | ||
frustrating in actual use because workarounds are difficult or nonexistent, the | ||
notion of lifetimes in Rust needs to be refined to include multiple exit points. | ||
|
||
# Background | ||
|
||
Unfortunately, there is no obvious candidate for a notion of a single-entry / | ||
multiple-exit region to use from the literature. We assume that a Rust function | ||
is viewed as a control-flow graph, with a single entry point and a single return | ||
point (multiple return points can be handled with additional edges). This | ||
includes the expansion of temporary rvalue evaluation. This is already done for | ||
other reasons in `rustc`. | ||
|
||
It's easier to first think about the single-entry / single-exit case first. If R | ||
is a set of vertices in a CFG, the most obvious definition is that R is a | ||
single-entry / single-exit region if the following conditions hold: | ||
|
||
1. There is a vertex Entry in R such that for all vertices V in R, every path | ||
from the function entry to V contains Entry. | ||
|
||
2. There is a vertex Exit in R such that for all vertices V in R, every path | ||
from V to the function exit contains Exit. | ||
|
||
If we were to incorporate this definition of regions into Rust lifetimes, then | ||
the following program fragment would be accepted: | ||
|
||
```rust | ||
let mut a = Some(0u); | ||
let p = a.as_ref().unwrap(); | ||
loop { | ||
println!("{}", *p); | ||
a = None; | ||
if ... { | ||
break; | ||
} | ||
} | ||
``` | ||
|
||
The lifetime of the borrow associated with `p` would be a single-entry / | ||
single-exit region with Entry at the call to `as_ref` and Exit at the last | ||
use of `p`. Therefore the borrow checker would find nothing wrong with the | ||
mutation right below the last use, even though this leads to memory unsafety | ||
at runtime. | ||
|
||
The problem here is that the region's Entry and Exit aren't always matched, i.e. | ||
there are paths from the function entry to the function exit that go through | ||
Entry a different number of times than they go through Exit. If we add this | ||
condition, then we arrive at the standard definition of a SESE region in the | ||
compiler literature. This is equivalent to saying that Entry and Exit belong | ||
to the same cycles in the CFG. | ||
|
||
Adopting SESE regions for Rust lifetimes would be sound. Moreover, SESE regions | ||
form a lattice, where meet is intersection and join is the least region | ||
containing the union of two regions. Since a single vertex forms its own region, | ||
this also covers extending a region to contain a point. Since these are the | ||
operations that the Rust type checker needs to infer lifetimes, it would be | ||
possible to compute lifetimes in this manner without much difficulty. | ||
|
||
It is possible to extend this notion of a region to have multiple exit points | ||
instead of just one, but the details of the definition are a bit subtle (e.g. | ||
we would have to track exit edges rather than exit vertices), and unfortunately | ||
it becomes very difficult to compute the lattice operations given a | ||
representation of two regions in terms of entry and exit points. The problem is | ||
that when shrinking or extending a region, there is no easy way to compute the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is shrinking/extending part of the lifetime analysis implementation in rustc? if it was instead implemented to gather entry/exit points then solve them all at once, would this make the trivial extension (defining exits as graph edges, rather than dominated vertices) feasible? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, the lifetime inference algorithm requires extending regions to include a new point, taking meet and join of regions, etc. Any lifetime inference algorithm based on any of the standard approaches to inference with subtyping will require this. |
||
new exit edges. | ||
|
||
We don't actually need the exit edges of regions in the Rust compiler until | ||
after type checking, when the dataflow analysis used for borrow checking inserts | ||
borrow kill flags on exits from regions. In the type checker itself, we could | ||
use any representation we want. Because of the difficulty of finding an | ||
efficient sparse representation in terms of entry and exit points, this RFC | ||
proposes a definition of single-entry / multiple-exit region that doesn't use | ||
exit points at all. | ||
|
||
# Detailed Design | ||
|
||
This definition is inspired somewhat by the work done on liveness analysis and | ||
register allocation of programs in [static single assignment form](http://en.wikipedia.org/wiki/Static_single_assignment_form). | ||
|
||
If R is a set of vertices in the CFG, we say that R is a *single-entry / | ||
multiple-exit region* if the following conditions hold: | ||
|
||
1. There is a vertex Entry in R such that for all vertices V in R, every path | ||
from the function entry to V contains Entry. | ||
|
||
2. If V is in R and P is a path from Entry to V that does not leave R and | ||
reenter through Entry, then every vertex of P is in R. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This sounds tautological. "If [...] P is a path [...] that does not leave R [...], then every vertex of P is in R". Surely the condition "P is a path [...] that does not leave R" is identical to "P is a path where every vertex of P is in R", which makes this "If P is a path where every vertex of P is in R, then every vertex of P is in R". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I first came up with the more technical definition below and then tried to make a plain English counterpart. What I was attempting to capture was the part about reentering through Entry. For example, in a program fragment like loop {
[Entry]
[Exit]
} there is allowed to be a path from Entry to Exit, back around the outer loop that reenters Entry, without all of the vertices on that path being in the region. On second thought, it seems that this would be fine:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The idea of counting uses of Entry seems a little odd. Could you not use something like the following:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That seems equivalent and a bit simpler, so I'll go with that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Or the more general definition, which subsumes the first condition:
Instead of evaluating paths only from the CFG entry to V, evaluating from all nodes not in R allows you to avoid doing any counting. |
||
|
||
The second condition is to ensure that in the case of a program fragment like | ||
|
||
```rust | ||
let mut i = 0i; | ||
let p = &mut i; | ||
if ... { | ||
(use of p) | ||
} else { | ||
(no use of p) | ||
} | ||
drop(p); | ||
``` | ||
|
||
that the region corresponding to the borrow in `p` includes both branches of the | ||
`if`. | ||
|
||
This definition is difficult to work with directly, but it is possible to | ||
reformulate it to make it more palatable. We say that a vertex V *dominates* a | ||
vertex W if every path from the CFG entry to W goes through V, and that it | ||
strictly dominates W if V is distinct from W. By induction, every vertex V other | ||
than the CFG entry has an *immediate dominator* that does not dominate any other | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you mean "does not strictly dominate", right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's sort of implied, since I say 'does not dominate any other...', but maybe it would be more clear the way that you suggest. |
||
dominator of V. The immediate dominance relation forms a tree, and it is | ||
possible to efficiently compute the dominator tree from the CFG. For more | ||
details, see [the Wikipedia page](http://en.wikipedia.org/wiki/Dominator_%28graph_theory%29) | ||
or any textbook on compilers. | ||
|
||
We can then reformulate the definition above to say that R is a *single-entry / | ||
multiple-exit region* if the following conditions hold: | ||
|
||
1. R is a subtree (i.e. a connected subgraph) of the dominator tree. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is a very powerful idea. treating regions as subgraphs of the dominator graph could open up other optimisations and analyses - i think it would be worthwhile as a representation even without this specific motivation. |
||
|
||
2. If W is a nonroot vertex of R, and V -> W is an edge in the CFG such that V | ||
doesn't strictly dominate W, then V is in R. | ||
|
||
Since the intersection of subtrees is a subtree, it is clear from this | ||
definition that the intersection of two SEME regions is a SEME region, which | ||
provides the meet lattice operation. Given set S of vertices in the CFG, it is | ||
possible to define the least SEME region containing S by taking the intersection | ||
of all SEME regions containing S, or by an iterative worklist algorithm that | ||
tries to satisfy the two conditions above. By taking S to be the union of two | ||
SEME regions, this provides the join lattice operation. | ||
|
||
It is instructive to see that this definition implicitly handles the problems | ||
with loops discussed for SESE regions. Consider the program fragment | ||
|
||
```rust | ||
let mut a = Some(0u); | ||
let p = a.as_ref().unwrap(); | ||
loop { | ||
println!("{}", *p); | ||
a = None; | ||
if ... { | ||
break; | ||
} | ||
} | ||
``` | ||
|
||
If R is a SEME region rooted at the borrow used in the definition of `p` that | ||
contains the entry point of the loop, then it must contain the entire loop. This | ||
is because in the loop backedge V -> W, V doesn't strictly dominate W. You can | ||
use similar reasoning to inductively show that the entire loop is contained in R. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this would be clearer if it noted that this is for nodes W where the node is part of the other three lines in the loop |
||
|
||
Using SEME regions defined in this manner requires in the type checker storing | ||
regions as sets of vertices. While these sets may be optimized, e.g. by using | ||
sparse bitvectors or interval representations, they still have the potential for | ||
poor worst-case behavior. After the regionck phase of the compiler, we can then | ||
compute the exit edges of the region by just looking for edges whose origin is | ||
in the region but whose target is not, and use these exit edges for later | ||
dataflow analyses. | ||
|
||
Since SEME regions still have a single entry point, and error messages generally | ||
refer to "the region beginning at...", error messages won't need to change much | ||
or at all. Since SEME regions allow strictly more programs than the current | ||
behavior, this should never be confusing. | ||
|
||
# Drawbacks | ||
|
||
This is guaranteed to be an increase in complexity in the compiler, and it will | ||
probably slightly impact compile times. However, it does seem like there is a | ||
lot of room for constant-factor implementation improvements, and compilers | ||
already use similar data structures for register allocation on much larger | ||
functions after inlining. | ||
|
||
This extended notion of scope also doesn't fit into LLVM's [scope-based aliasing | ||
rules](http://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata). This | ||
will make informing LLVM about Rust-specific aliasing information more | ||
difficult, but it does seem possible to extend the LLVM scheme to handle regions | ||
like those defined here. | ||
|
||
# Alternatives | ||
|
||
1. We could just not do this, and keep borrows being lexical scopes. | ||
|
||
2. We could implement SESE regions first, since they are easier to compute and | ||
reason about. | ||
|
||
3. If it is possible to implement efficiently, we could use an alternative | ||
definition of a SEME region that is in terms of explicit entry and exit points. | ||
If one is possible, it likely involves the [tree decomposition](http://en.wikipedia.org/wiki/Tree_decomposition) | ||
of a directed graph. Such an algorithm would probably scale better than the one | ||
described here, but would be considerably more complex (and would require a tree | ||
decomposition as a prerequisite). | ||
|
||
4. We could go further and try to generalize this to multiple-entry / | ||
multiple-exit regions. This would enable variables to be defined with borrows | ||
on multiple branches of a conditional, e.g. | ||
|
||
```rust | ||
let mut i = 0i; | ||
let p = &mut i; | ||
|
||
let q: &mut int; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it's unclear to me what this fragment is demonstrating. q is an immutable binding, so it has to be initialised once-and-only-once subsequently, right? is the idea that MEME would allow the q= lines in each branch to borrowcheck because they're each an entry to the same region (and each of the drop(p) is an exit from the previous region?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that's the scenario I'm describing. It is initialized in two different program locations, but only once on every path through the program. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, that makes sense |
||
if ... { | ||
drop(p); | ||
q = &mut i; | ||
} else { | ||
drop(p) | ||
q = &mut i; | ||
} | ||
``` | ||
|
||
This doesn't really seem useful enough to warrant the complexity. It would also | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if so, i agree that it's a niche usage compared to what SEME enables. |
||
make creating good error messages more difficult. | ||
|
||
# Unresolved questions | ||
|
||
Does there exist an efficient sparse representation for SEME regions? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. half-formed representation ideas:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possibly changing the order of evaluation, e.g. changing
a.foo().bar(a.baz())
tolet tmp = a.baz(); a.foo().bar(tmp)
runs things in different orders.(It may be necessary to change it in this manner, since
foo
could return a mutable borrow ofa
.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the actual problem is presented properly, maybe @nikomatsakis could confirm this, but my intuition is that the auto-borrow of an lvalue receiver can be delayed to after the evaluation of the arguments, without breaking backwards compatibility or creating unsoundness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have to call
deref_mut
in order to borrow the receiver, then changing it from its current order will technically be backwards incompatible. Whether this ordering matters in practice is anyone's guess. Like destructor ordering of temporaries, it may be something that can be slightly tweaked in the future without breaking anyone's code.You could do something crazy like only pick a different order in the case where the program would've failed to compile under the current rules, but then you're adding an unpredictable inconsistency for the sake of strict backwards compatibility.
I realized this after I posted this RFC, and was already planning on updating the RFC soon to add a section that accounts for this. There isn't a great description of the current behavior besides the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling
deref_mut
would be auto-deref, not auto-borrow (which produces a&T
or&mut T
rvalue from aT
value), and given proper desugaring, analysis passes would seex.deref_mut().foo(x.deref().bar())
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The program
currently prints
with the receiver being deref'd first. If we want to accept as many programs as possible then the order will have to be changed to evaluate the arguments first, changing the order in which
deref_mut
gets called on each of the structs.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am in favor of having parameters evaluated before the receiver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you want to go with this specific evaluation order? A Google search did not produce any documentation on this.
Your workaround would work for simple cases. It will break apart unpredictably at more complex cases. Such unpredictably would really frustrate the users.
What about debug mode with optimizations disabled? Would it still reorder stuff?
What to do if you need a mutable borrow to evaluate parameters:
Or a mutable borrow to get to the function to call:
To generalize self is just another parameter. If a parameter needs to borrow obj after a mutable borrow is done, there is no way out.
We can either define an order that would work for most cases or leave it undefined and let compiler to figure out if there is a evaluation order that complies with borrow restrictions.
It's much more likely that self will need a mutable borrow of obj than the other parameters.
Thus I'm for evaluating self last to allow more code to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My ordering would allow strictly more programs to compile, it doesn't affect side effects, and it's quite deterministic.
Evaluating the receiver and the auto-borrow on it are not coupled. The evaluation includes side effects and reodering it would be disastrous, while an lvalue auto-borrow (and even an explicit borrow) can be placed anywhere between the evaluation of the lvalue and when it is actually needed (i.e. the argument is required for the call).
Moving the borrows later effectively shrinks the duration of the borrow, potentially removing overlaps with other borrows, but never adding new overlaps.
On top of that, we can also reorder just the evaluation of the autoderef on the receiver, not the entire receiver, spec'ing it as being evaluated as part of the second pass over the arguments. Even if restricting
Deref
impls to have no side-effects (using anunsafe trait
) to get this behavior is doable, I don't think it's necessary.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is getting out of my comfort zone. Possibly you are right.
Let me elaborate on the last example a bit:
This can be rewritten as
And if getSomethingMutablyBorrowsObj is something like this:
I understand that obj is mutably borrowed for lifetime of foo. The receiver in this case is foo and borrowing of obj has already happened while evaluating getSomething.