RFC: Single-entry / multiple-exit regions for borrows #396

active/0000-seme-regions.md (new file, 307 lines)
- Start Date: 2014-10-12
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

The regions used for lifetime inference and borrow checking should be general
single-entry / multiple-exit (SEME) regions, rather than being limited to the
lexical scopes used today, or even to the more general single-entry /
single-exit (SESE) regions.

# Motivation

There are three classes of program fragments that are currently rejected by the
borrow checker but that arguably should be accepted.

1. The borrow checker ignores the liveness of borrows, so a borrow remains in
force for the entirety of its lexical scope, even after its last use. This
occurs even in straight-line code. For example, a program like this is rejected:

```rust
let mut i = 0i;
let p = &mut i;
*p = 1;
let q = &mut i;
*q = 2;
```

The borrow checker should be able to realize that each of these borrows is no
longer valid at the point of the next borrow. The workaround for this is to give
each borrow its own lexical scope, e.g.

```rust
let mut i = 0i;
{
    let p = &mut i;
    *p = 1;
}
{
    let q = &mut i;
    *q = 2;
}
```

This is unnecessarily verbose and difficult to explain to new programmers. It
also doesn't handle all cases involving mutable borrows of multiple locations.

2. The borrow checker can't refine borrow information based on the branches of
a pattern match. The canonical example is a 'find-or-insert' pattern:

```rust
match map.find(&key) {
    Some(...) => { ... }
    None => {
        map.insert(key, new_value);
    }
}
```

In an `Option<&T>` the `&T` only appears in the the `Some` constructor, so the
borrow shouldn't need to be considered in the `None` branch. The workaround for
this is usually to do a clone/copy and two separate lookups'. This is more
verbose, doesn't work in all cases, and is potentially a large performance
penalty.
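As a rough illustration only, here is a minimal sketch of that workaround,
written against today's `std::collections::HashMap` API (`get`/`insert`) rather
than the `find` method used above; the function name and types are invented for
the example:

```rust
use std::collections::HashMap;

fn find_or_insert(map: &mut HashMap<String, u32>, key: &str, new_value: u32) -> u32 {
    // First lookup: copy the value out, so the shared borrow of `map`
    // ends before `&mut map` is ever needed.
    if let Some(v) = map.get(key).copied() {
        return v;
    }
    // Second lookup (the insert): only accepted because the borrow taken
    // by `get` has already ended.
    map.insert(key.to_string(), new_value);
    new_value
}
```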

3. Borrows in rvalues of the same overall expression often overlap even when
they don't need to. For example, if `a` has methods `foo`, `bar`
and `baz` that all take `&mut self` parameters, but the latter two have no
lifetimes in output position, then an expression such as

```rust
a.foo(a.bar(), a.baz())
```

seems like it should be valid. The workaround is to instead write

```rust
let bar = a.bar();
let baz = a.baz();
a.foo(bar, baz);
```

All this is doing is making the evaluation of temporaries explicit, so it
Comment (Member):

It's possibly changing the order of evaluation, e.g. changing `a.foo().bar(a.baz())` to `let tmp = a.baz(); a.foo().bar(tmp)` runs things in different orders.

(It may be necessary to change it in this manner, since foo could return a mutable borrow of a.)

Comment (Member):

I'm not sure the actual problem is presented properly, maybe @nikomatsakis could confirm this, but my intuition is that the auto-borrow of an lvalue receiver can be delayed to after the evaluation of the arguments, without breaking backwards compatibility or creating unsoundness.

Comment (Author):

If you have to call deref_mut in order to borrow the receiver, then changing it from its current order will technically be backwards incompatible. Whether this ordering matters in practice is anyone's guess. Like destructor ordering of temporaries, it may be something that can be slightly tweaked in the future without breaking anyone's code.

You could do something crazy like only pick a different order in the case where the program would've failed to compile under the current rules, but then you're adding an unpredictable inconsistency for the sake of strict backwards compatibility.

I realized this after I posted this RFC, and was already planning on updating the RFC soon to add a section that accounts for this. There isn't a great description of the current behavior besides the implementation.

Comment (Member):

Calling deref_mut would be auto-deref, not auto-borrow (which produces a &T or &mut T rvalue from a T value), and given proper desugaring, analysis passes would see x.deref_mut().foo(x.deref().bar()).

Comment (Author):

The program

struct A;
struct B {
    label: uint,
    a: A,
}

impl A {
    fn foo(&self, other1: (), other2: ()) { }
    fn bar(&self) { }
}

impl Deref<A> for B {
    fn deref(&self) -> &A {
        println!("deref: {}", self.label);
        &self.a
    }
}

fn main() {
    let b1 = B {
        label: 1,
        a: A,
    };
    let b2 = B {
        label: 2,
        a: A,
    };
    let b3 = B {
        label: 3,
        a: A,
    };

    b1.foo(b2.bar(), b3.bar());
}

currently prints

deref: 1
deref: 2
deref: 3

with the receiver being deref'd first. If we want to accept as many programs as possible then the order will have to be changed to evaluate the arguments first, changing the order in which deref_mut gets called on each of the structs.

Comment (Member):

I am in favor of having parameters evaluated before the receiver.

Comment (Member):

 // doesn't work today
vec.push(vec[5])
 // making the receiver mut borrow explicit
Vec::push(&mut vec, *Index::index(&vec, &5))
// now let's use imaginary syntax for what happens in the compiler
lvalue vec = ...;
rvalue a = &mut vec;
rvalue b = &vec; // error, vec is mutably borrowed by a
rvalue c = &5;
rvalue d = Index::index(b, c);
lvalue e = *d;
rvalue f = e; // copies the value from the vector
Vec::push(a, f);
// we can observe that evaluating `a` has no side-effect, thus
// it can be reordered after normal evaluation.
lvalue vec = ...;
lvalue a = vec;
rvalue b = &vec; // immutable borrow of vec starts
rvalue c = &5;
rvalue d = Index::index(b, c);
lvalue e = *d;
rvalue f = e; // copies the value from the vector
// the borrow of vec through b ends here
rvalue a' = &mut a;
rvalue f' = f;
Vec::push(a', f');

Comment:

Is there a reason you want to go with this specific evaluation order? A Google search did not produce any documentation on this.

Your workaround would work for simple cases, but it will break down unpredictably in more complex ones, and such unpredictability would really frustrate users.
What about debug mode with optimizations disabled? Would it still reorder things?

What do you do if you need a mutable borrow to evaluate parameters:

obj.paint(obj.x, obj.y, obj.getCachedOrDownloadImage())

Or a mutable borrow to get to the function to call:

obj.getSomethingMutablyBorrowsObj().someFunction(obj.name)

To generalize, self is just another parameter. If a parameter needs to borrow obj after a mutable borrow has been taken, there is no way out.

We can either define an order that works for most cases, or leave it undefined and let the compiler figure out whether there is an evaluation order that complies with the borrow restrictions.

It's much more likely that self will need a mutable borrow of obj than the other parameters do.
Thus I'm for evaluating self last, to allow more code to work.

Comment (Member):

My ordering would allow strictly more programs to compile, it doesn't affect side effects, and it's quite deterministic.
Evaluating the receiver and the auto-borrow on it are not coupled. The evaluation includes side effects and reordering it would be disastrous, while an lvalue auto-borrow (and even an explicit borrow) can be placed anywhere between the evaluation of the lvalue and when it is actually needed (i.e. when the argument is required for the call).
Moving the borrows later effectively shrinks the duration of the borrow, potentially removing overlaps with other borrows, but never adding new overlaps.

On top of that, we can also reorder just the evaluation of the autoderef on the receiver, not the entire receiver, spec'ing it as being evaluated as part of the second pass over the arguments. Even if restricting Deref impls to have no side-effects (using an unsafe trait) to get this behavior is doable, I don't think it's necessary.

Comment:

Okay, this is getting out of my comfort zone. Possibly you are right.

Let me elaborate on the last example a bit:

obj.getSomethingMutablyBorrowsObj().someFunction(obj.name)  

This can be rewritten as

let foo = obj.getSomethingMutablyBorrowsObj();
foo.someFunction(obj.name);

And if getSomethingMutablyBorrowsObj is something like this:

fn getSomethingMutablyBorrowsObj(&'a mut self) -> &'a mut Something {
    if self.a { &mut self.obj1 } else { &mut self.obj2 }
}

I understand that obj is mutably borrowed for the lifetime of foo. The receiver in this case is foo, and the borrow of obj has already happened while evaluating getSomething.

shouldn't affect whether the program is accepted.

While all of these errors arise in the borrow checker, they are actually caused
by a deficiency in Rust's lifetime system. Currently, Rust models lifetimes as
lexical scopes, and borrows must last for a specific lifetime.

The way to fix these problems is to refine the notion of lifetime. The obvious
first step would be to generalize from lexical scopes to arbitrary single-entry
/ single-exit regions, but this would only fix the first and third of the
problems mentioned above. To fix the second problem, which is arguably the most
frustrating in actual use because workarounds are difficult or nonexistent, the
notion of lifetimes in Rust needs to be refined to include multiple exit points.

# Background

Unfortunately, the literature offers no obvious candidate for a notion of
single-entry / multiple-exit region. We assume that a Rust function is viewed
as a control-flow graph (CFG) with a single entry point and a single return
point (multiple return points can be handled with additional edges), and that
the CFG includes the expanded evaluation of temporary rvalues. `rustc` already
constructs such a graph for other reasons.

It's easier to think about the single-entry / single-exit case first. If R is
a set of vertices in a CFG, the most obvious definition is that R is a
single-entry / single-exit region if the following conditions hold:

1. There is a vertex Entry in R such that for all vertices V in R, every path
from the function entry to V contains Entry.

2. There is a vertex Exit in R such that for all vertices V in R, every path
from V to the function exit contains Exit.
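
In standard compiler terminology (dominators are defined more carefully in the
Detailed Design section below), condition 1 says that Entry dominates every
vertex of R, and condition 2 says that Exit postdominates every vertex of R.
As a hedged sketch only, assuming dominator and postdominator sets have already
been computed and using an illustrative node/set representation:

```rust
use std::collections::{HashMap, HashSet};

type Node = usize;

/// `dom[&v]` is the set of vertices dominating `v` (including `v` itself);
/// `postdom[&v]` is the set of vertices postdominating `v`.
fn satisfies_naive_sese_conditions(
    region: &HashSet<Node>,
    entry: Node,
    exit: Node,
    dom: &HashMap<Node, HashSet<Node>>,
    postdom: &HashMap<Node, HashSet<Node>>,
) -> bool {
    region.contains(&entry)
        && region.contains(&exit)
        // Condition 1: every path from the function entry to V contains Entry.
        && region.iter().all(|v| dom[v].contains(&entry))
        // Condition 2: every path from V to the function exit contains Exit.
        && region.iter().all(|v| postdom[v].contains(&exit))
}
```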

If we were to incorporate this definition of regions into Rust lifetimes, then
the following program fragment would be accepted:

```rust
let mut a = Some(0u);
let p = a.as_ref().unwrap();
loop {
    println!("{}", *p);
    a = None;
    if ... {
        break;
    }
}
```

The lifetime of the borrow associated with `p` would be a single-entry /
single-exit region with Entry at the call to `as_ref` and Exit at the last
use of `p`. Therefore the borrow checker would find nothing wrong with the
mutation right below the last use, even though this leads to memory unsafety
at runtime.

The problem here is that the region's Entry and Exit aren't always matched, i.e.
there are paths from the function entry to the function exit that go through
Entry a different number of times than they go through Exit. If we add the
requirement that Entry and Exit be matched, then we arrive at the standard
definition of a SESE region in the compiler literature. This is equivalent to
saying that Entry and Exit belong to the same cycles in the CFG.

Adopting SESE regions for Rust lifetimes would be sound. Moreover, SESE regions
form a lattice, where meet is intersection and join is the least region
containing the union of two regions. Since a single vertex forms its own region,
this also covers extending a region to contain a point. Since these are the
operations that the Rust type checker needs to infer lifetimes, it would be
possible to compute lifetimes in this manner without much difficulty.

It is possible to extend this notion of a region to have multiple exit points
instead of just one, but the details of the definition are a bit subtle (e.g.
we would have to track exit edges rather than exit vertices), and unfortunately
it becomes very difficult to compute the lattice operations given a
representation of two regions in terms of entry and exit points. The problem is
that when shrinking or extending a region, there is no easy way to compute the

Comment:

Is shrinking/extending part of the lifetime analysis implementation in rustc? If it were instead implemented to gather entry/exit points and then solve them all at once, would that make the trivial extension (defining exits as graph edges rather than dominated vertices) feasible?

Comment (Author):

Yes, the lifetime inference algorithm requires extending regions to include a new point, taking meet and join of regions, etc. Any lifetime inference algorithm based on any of the standard approaches to inference with subtyping will require this.

new exit edges.

We don't actually need the exit edges of regions in the Rust compiler until
after type checking, when the dataflow analysis used for borrow checking inserts
borrow kill flags on exits from regions. In the type checker itself, we could
use any representation we want. Because of the difficulty of finding an
efficient sparse representation in terms of entry and exit points, this RFC
proposes a definition of single-entry / multiple-exit region that doesn't use
exit points at all.

# Detailed Design

This definition is inspired somewhat by the work done on liveness analysis and
register allocation of programs in [static single assignment form](http://en.wikipedia.org/wiki/Static_single_assignment_form).

If R is a set of vertices in the CFG, we say that R is a *single-entry /
multiple-exit region* if the following conditions hold:

1. There is a vertex Entry in R such that for all vertices V in R, every path
from the function entry to V contains Entry.

2. If V is in R and P is a path from Entry to V that does not leave R and
reenter through Entry, then every vertex of P is in R.
Comment (Contributor):

This sounds tautological. "If [...] P is a path [...] that does not leave R [...], then every vertex of P is in R". Surely the condition "P is a path [...] that does not leave R" is identical to "P is a path where every vertex of P is in R", which makes this "If P is a path where every vertex of P is in R, then every vertex of P is in R".

Comment (Author):

I first came up with the more technical definition below and then tried to make a plain English counterpart.

What I was attempting to capture was the part about reentering through Entry. For example, in a program fragment like

loop {
    [Entry]
    [Exit]
}

there is allowed to be a path from Entry to Exit, back around the outer loop that reenters Entry, without all of the vertices on that path being in the region. On second thought, it seems that this would be fine:

If V is in R and P is a path from Entry to V that does not contain Entry multiple times, then every vertex of P is in R.

Comment (Contributor):

The idea of counting uses of Entry seems a little odd. Could you not use something like the following:

For every pair of vertices V and U in R, every path from V to U that contains vertices not in R also contains Entry.

Comment (Author):

That seems equivalent and a bit simpler, so I'll go with that.

Comment (Contributor):

Or the more general definition, which subsumes the first condition:

For all vertices V in R, for all vertices U not in R, every path from U to V contains Entry.

Instead of evaluating paths only from the CFG entry to V, evaluating from all nodes not in R allows you to avoid doing any counting.


The second condition ensures that, in a program fragment like

```rust
let mut i = 0i;
let p = &mut i;
if ... {
    (use of p)
} else {
    (no use of p)
}
drop(p);
```

the region corresponding to the borrow in `p` includes both branches of the
`if`.

This definition is difficult to work with directly, but it is possible to
reformulate it in a more palatable way. We say that a vertex V *dominates* a
vertex W if every path from the CFG entry to W goes through V, and that V
*strictly dominates* W if, in addition, V is distinct from W. By induction,
every vertex V other

Comment:

You mean "does not strictly dominate", right?

Comment (Author):

It's sort of implied, since I say 'does not dominate any other...', but maybe it would be more clear the way that you suggest.

dominator of V. The immediate dominance relation forms a tree, and it is
possible to efficiently compute the dominator tree from the CFG. For more
details, see [the Wikipedia page](http://en.wikipedia.org/wiki/Dominator_%28graph_theory%29)
or any textbook on compilers.
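
As an illustration only (not something this RFC prescribes), here is a naive
sketch of dominator computation by iterating set intersections to a fixed
point; real compilers typically use the Lengauer-Tarjan or Cooper-Harvey-Kennedy
algorithms instead. The immediate dominator of a vertex is then the element of
its dominator set, other than the vertex itself, that is dominated by all the
others. The `Node` indices and predecessor map are assumptions of the sketch:

```rust
use std::collections::{HashMap, HashSet};

type Node = usize;

/// Naive iterative dominator computation: dom(entry) = {entry}, and
/// dom(v) = {v} union (intersection of dom(p) over all predecessors p of v),
/// repeated until a fixed point. `preds` must contain a (possibly empty)
/// predecessor list for every node.
fn dominator_sets(
    nodes: &[Node],
    entry: Node,
    preds: &HashMap<Node, Vec<Node>>,
) -> HashMap<Node, HashSet<Node>> {
    let all: HashSet<Node> = nodes.iter().copied().collect();
    let mut dom: HashMap<Node, HashSet<Node>> =
        nodes.iter().map(|&v| (v, all.clone())).collect();
    dom.insert(entry, std::iter::once(entry).collect());

    let mut changed = true;
    while changed {
        changed = false;
        for &v in nodes.iter().filter(|&&v| v != entry) {
            // Intersect the dominator sets of all predecessors of `v`...
            let mut new_set = preds[&v]
                .iter()
                .map(|p| dom[p].clone())
                .reduce(|a, b| a.intersection(&b).copied().collect())
                .unwrap_or_default();
            // ...and add `v` itself, since every vertex dominates itself.
            new_set.insert(v);
            if new_set != dom[&v] {
                dom.insert(v, new_set);
                changed = true;
            }
        }
    }
    dom
}
```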

We can then reformulate the definition above to say that R is a *single-entry /
multiple-exit region* if the following conditions hold:

1. R is a subtree (i.e. a connected subgraph) of the dominator tree.

Comment:

This is a very powerful idea. Treating regions as subgraphs of the dominator graph could open up other optimisations and analyses; I think it would be worthwhile as a representation even without this specific motivation.


2. If W is a nonroot vertex of R, and V -> W is an edge in the CFG such that V
doesn't strictly dominate W, then V is in R.

Since the intersection of subtrees is a subtree, it is clear from this
definition that the intersection of two SEME regions is a SEME region, which
provides the meet lattice operation. Given a set S of vertices in the CFG, it is
possible to define the least SEME region containing S by taking the intersection
of all SEME regions containing S, or by an iterative worklist algorithm that
tries to satisfy the two conditions above (sketched below). By taking S to be
the union of two SEME regions, this provides the join lattice operation.
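
As a sketch of that worklist algorithm, under assumptions the RFC doesn't fix
(integer node IDs, an immediate-dominator map, a CFG predecessor map, and a
caller-supplied `entry` taken to be the nearest common dominator of the seed
set):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

type Node = usize;

/// Walks the immediate-dominator chain upward from `w` to decide whether
/// `v` strictly dominates `w`.
fn strictly_dominates(idom: &HashMap<Node, Node>, v: Node, w: Node) -> bool {
    let mut cur = w;
    while let Some(&d) = idom.get(&cur) {
        if d == v {
            return true;
        }
        cur = d;
    }
    false
}

/// Least SEME region containing `seed`: the fixed point of
/// (1) R is a connected subtree of the dominator tree rooted at `entry`, and
/// (2) every CFG predecessor of a non-root member that does not strictly
///     dominate it is also a member.
fn least_seme_region(
    seed: &HashSet<Node>,
    entry: Node,                          // assumed: nearest common dominator of `seed`
    idom: &HashMap<Node, Node>,           // immediate dominator of every non-entry vertex
    cfg_preds: &HashMap<Node, Vec<Node>>,
) -> HashSet<Node> {
    let mut region: HashSet<Node> = seed.clone();
    region.insert(entry);
    let mut worklist: VecDeque<Node> = region.iter().copied().collect();

    while let Some(w) = worklist.pop_front() {
        // Condition 1: connect `w` to the root by walking up the dominator
        // tree, adding any ancestors not yet in the region.
        let mut cur = w;
        while cur != entry {
            let parent = idom[&cur];
            if region.insert(parent) {
                worklist.push_back(parent);
            }
            cur = parent;
        }
        // Condition 2: pull in CFG predecessors of `w` that do not strictly
        // dominate it (e.g. sources of backedges into `w`).
        if w != entry {
            for &v in cfg_preds.get(&w).into_iter().flatten() {
                if !strictly_dominates(idom, v, w) && region.insert(v) {
                    worklist.push_back(v);
                }
            }
        }
    }
    region
}
```

Each popped vertex is connected up the dominator tree (condition 1) and its
non-dominating CFG predecessors are pulled in (condition 2); since vertices are
only ever added, the loop terminates.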

It is instructive to see that this definition implicitly handles the problems
with loops discussed for SESE regions. Consider the program fragment

```rust
let mut a = Some(0u);
let p = a.as_ref().unwrap();
loop {
    println!("{}", *p);
    a = None;
    if ... {
        break;
    }
}
```

If R is a SEME region rooted at the borrow used in the definition of `p` and R
contains the entry point of the loop, then it must contain the entire loop: for
the loop backedge V -> W, V doesn't strictly dominate W, so condition 2 forces
V into R, and similar reasoning inductively pulls in the rest of the loop.

Comment:

This would be clearer if it noted that this is for nodes W that are part of the other three lines in the loop.


Using SEME regions defined in this manner requires the type checker to store
regions as sets of vertices. While these sets can be optimized, e.g. by using
sparse bitvectors or interval representations, they still have the potential for
poor worst-case behavior. After the regionck phase of the compiler, we can then
compute the exit edges of a region by simply looking for edges whose origin is
in the region but whose target is not, and use these exit edges for later
dataflow analyses.
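
A sketch of that exit-edge computation, reusing the illustrative node
representation from the earlier sketches:

```rust
use std::collections::HashSet;

type Node = usize;

/// Exit edges of a region: CFG edges whose source is inside the region and
/// whose target is outside. These are where the borrow-check dataflow would
/// insert its kill flags.
fn exit_edges(region: &HashSet<Node>, cfg_edges: &[(Node, Node)]) -> Vec<(Node, Node)> {
    cfg_edges
        .iter()
        .copied()
        .filter(|&(src, dst)| region.contains(&src) && !region.contains(&dst))
        .collect()
}
```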

Since SEME regions still have a single entry point, and error messages generally
refer to "the region beginning at...", error messages won't need to change much
or at all. Since SEME regions allow strictly more programs than the current
behavior, this should never be confusing.

# Drawbacks

This is guaranteed to be an increase in complexity in the compiler, and it will
probably slightly impact compile times. However, it does seem like there is a
lot of room for constant-factor implementation improvements, and compilers
already use similar data structures for register allocation on much larger
functions after inlining.

This extended notion of scope also doesn't fit into LLVM's [scope-based aliasing
rules](http://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata). This
will make informing LLVM about Rust-specific aliasing information more
difficult, but it does seem possible to extend the LLVM scheme to handle regions
like those defined here.

# Alternatives

1. We could simply not do this, and keep borrow lifetimes tied to lexical scopes.

2. We could implement SESE regions first, since they are easier to compute and
reason about.

3. If it is possible to implement efficiently, we could use an alternative
definition of a SEME region that is in terms of explicit entry and exit points.
If one is possible, it likely involves the [tree decomposition](http://en.wikipedia.org/wiki/Tree_decomposition)
of a directed graph. Such an algorithm would probably scale better than the one
described here, but would be considerably more complex (and would require a tree
decomposition as a prerequisite).

4. We could go further and try to generalize this to multiple-entry /
multiple-exit regions. This would enable variables to be defined with borrows
on multiple branches of a conditional, e.g.

```rust
let mut i = 0i;
let p = &mut i;

let q: &mut int;

```

Comment:

It's unclear to me what this fragment is demonstrating. `q` is an immutable binding, so it has to be initialised once-and-only-once subsequently, right? Is the idea that MEME would allow the `q =` lines in each branch to borrow-check because they're each an entry to the same region (and each of the `drop(p)` calls is an exit from the previous region)?

Comment (Author):

Yes, that's the scenario I'm describing. It is initialized in two different program locations, but only once on every path through the program.

Comment:

OK, that makes sense.

```rust
if ... {
    drop(p);
    q = &mut i;
} else {
    drop(p);
    q = &mut i;
}
```

This doesn't really seem useful enough to warrant the complexity. It would also

Comment:

If so, I agree that it's a niche usage compared to what SEME enables.

make creating good error messages more difficult.

# Unresolved questions

Does there exist an efficient sparse representation for SEME regions?

Comment:

Half-formed representation ideas:

- tree branch-choice lists, à la Huffman coding
- pointers back to the control-flow graph for each exit, tagged with their common "up-tree" nodes (the greatest common of which would be the entry)
- take advantage of the fact that only certain CFG node types can be region entry/exit points and thin out the dominator graph itself!