Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert redundant_clone to an analysis pass #11364

Closed
wants to merge 1 commit into from

Conversation

Jarcho
Copy link
Contributor

@Jarcho Jarcho commented Aug 20, 2023

fixes #10940
fixes #10893
fixes #10870
fixes #10577
fixes #10545
fixes #10517

This is mostly finished, but blocked until rust-lang/rust#114900 is merged (or something similar). This isn't worth reviewing until then.

changelog:

@rustbot
Copy link
Collaborator

rustbot commented Aug 20, 2023

r? @Alexendoo

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Aug 20, 2023
Copy link
Contributor

@cjgillot cjgillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments and questions, to gain some perspective on the use case of rust-lang/rust#114900.
My impression is that you are trying to bend the dataflow API to do too much at once.
This analysis could be much simpler, and not require any change to the dataflow API.

Comment on lines 115 to 137
let mut ref_target_results = RefTargetAnalysis::new(cx.tcx, cx.param_env, body, &raw_ptr_taken)
.into_engine(cx.tcx, body)
.iterate_to_fixpoint()
.into_results_cursor(body);
let clone_sources = clone_calls
.iter()
.filter_map(|block| {
let data = &body.basic_blocks[block];
ref_target_results.seek_before_primary_effect(Location {
block,
statement_index: data.statements.len(),
});
if let TerminatorKind::Call { args, .. } = &data.terminator().kind
&& let [Operand::Copy(arg) | Operand::Move(arg), ..] = &**args
&& let Some(Some(targets)) = ref_target_results.get().targets.get(&arg.local)
{
Some((block, (arg.local, targets.clone())))
} else {
None
}
})
.collect::<FxHashMap<_, _>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend using the SSA analysis to do this. The self-argument to the clone call will typically be a temporary created by autoref desugaring, so will be SSA. It will be much faster than a dataflow analysis for typical MIR bodies.
See rustc_mir_transform::ssa::SsaLocals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically yes. If there ends up being perf problem this would be worth not linting the few extra cases this could catch. Limiting the analysis to just types which are cloned could also get a decent speed up

struct CloneDomain {
links: FxHashSet<(Local, Local, BasicBlock)>,
values: FxHashMap<Local, Value>,
candidates: Candidates,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field should not be part of the analysis domain, but computed in post-processing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would work.

#[derive(Debug, Clone)]
struct Value {
last_modified: Option<BasicBlock>,
lint_unread: HybridBitSet<BasicBlock>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field looks like it attempts to model a liveness analysis. Can it be replaced by MayleLiveLocals that we already have in store?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tracking if a read has occurred since the paired value was first modified.

Comment on lines 241 to 201
links: FxHashSet<(Local, Local, BasicBlock)>,
values: FxHashMap<Local, Value>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, you are trying to compute which locals store the same value.
Can this be achieved by using an IndexVec<Local, FlatSet<Location>>?
FlatSet::Bottom means uninitialized / moved-out.
FlatSet::Elem(location) means "assigned some unique value created at this location".
FlatSet::Top means "could store anything".
And in case of an assignment, propagate the FlatSet value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just which locals are duplicates of each other. This also needs to track how the duplicate values are used in relation to each other. We also need to know where the duplicate was made in order to lint that span.

@Jarcho
Copy link
Contributor Author

Jarcho commented Aug 27, 2023

Thanks for taking a look. Assuming pulling Candidates out of the analysis works that does unblock this.

It would still be nice to be able to change the links stored in the domain to a bitset and have the analysis convert between the (Local, Local, Block) triplet and the index. Doing so would require the join to have access to the analysis as it needs the triplet in order to work.


Just as further elaboration the basic idea for the lint it starts from the clone call and tracks both values. When one of the two values is modified, it then switches to checking if the non-modified version is read. If it is, then the clone is considered necessary. e.g.

// `x` is assigned a duplicate value of `y`
// a new value in `links` keeps track of the relation and the clone location
let x = y.clone(); 

// all values linked to `y` have `lint_unread` modified here 
y.modify();

// since `y` doesn't have the original value, the clone is necessary
// any clone call in `x`'s `lint_unread` field is a necessary clone
x.read();

If the order of the read and write are swapped the clone call isn't necessary since the read could have occurred on the original value.

Copy link
Contributor Author

@Jarcho Jarcho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjgillot I've made some changes to the analysis. Though not strictly required, the changes to the dataflow api would still be useful.

#[derive(Default, Debug, Clone, PartialEq, Eq)]
struct CloneDomain {
// Values which are clones of each other and may still contain the same value.
links: FxHashSet<(Local, Local, BasicBlock)>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The links would be better as a BitSet (one without a fixed domain size) by storing an IndexSet in the analysis and using each value's index as the BitSet index. This causes a problem with the join as it needs access to the value.

Comment on lines +150 to +168
for block in (0..body.basic_blocks.len()).map(BasicBlock::from_usize) {
let state = results.entry_set_for_block(block);
let loc = block.start_location();
for (&local, _) in state.values.iter().filter(|(_, v)| v.last_modified.is_none()) {
for &(l1, l2, block) in &state.links {
let local = if l1 == local {
l2
} else if l2 == local {
l1
} else {
continue;
};

if terminator.source_info.span.from_expansion() {
continue;
if !borrowers.bounded_borrowers(&[], &[], local, loc) {
required_clones.insert(block);
}
}
}
}
Copy link
Contributor Author

@Jarcho Jarcho Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The required_clones.insert(..) part should occur inside the join, but the PossibleBorrowerMap can't be accessed from there. This would allow the whole loop to be removed.

clone_sources: &'a FxHashMap<BasicBlock, (Local, HybridBitSet<Local>)>,
link_buf: Vec<(Local, Local, BasicBlock)>,
borrowers: PossibleBorrowerMap<'mir, 'tcx>,
required_clones: FxHashSet<BasicBlock>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be calculated in a ResultsVisitor thereby resolving the earlier issue with PossibleBorrowerMap, but that requires the visitor to detect reads and moves the same way the analysis does.

Given that handling each read is a small amount of work, and using a visitor requires a whole extra pass over every block it's probably better to do it this way. It's also just less work to implement/maintain.

struct Value {
// Two values which were last modified at different locations may contain different values.
// Joining two such values needs to be treated as a modification.
last_modified: Option<Location>,
Copy link
Contributor Author

@Jarcho Jarcho Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really should be something like:

enum Index {
    Statement(usize),
    Merge(Local),
}
struct Location {
    block: BasicBlock,
    index: Index,
}

This would allow distinguishing different merged values from each other, but would require join to know which block it's working on.


To illustrate the problem this causes:

// `x` is merged value (i.e. location = None)
let mut x = if _ { foo() } else { bar() };
// `y` is merged value (i.e. location = None)
let y = if _ { foo2() } else { bar2() };

let z = x.clone();

// assuming the match doesn't modify `x`, `y`, and `z`, then the join when merging
// the arms shouldn't detect a modification
match _ {
    _ => { ... },
    _ => { ... },
    _ => { ... },
};

// here the clone isn't needed.
do_something(&z);

// but this needs to be detected as a modification of `x` when joining the states
if _ {
    // note `x` is not considered modified here as the value is overwritten.
    // it's only considered modified when joining the state after the `if`.
    x = y;
}

// here the clone is needed.
do_something(&z);

We're currently in a state where merging None with None needs to be considered both a modification and not a modification. The only way out of this is to know both where the join occurred, and which value was joined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would represent a value with a simple index. A first pass collects value roots. Those would be the value of a variable where it is borrowed before cloning.

The state would be something like IndexVec<Local, FlatSet<Value>>.

let mut x = if _ { foo() } else { bar() };
let y = if _ { foo2() } else { bar2() };

// `x` is borrowed here. `x` is `FlatSet::Elem(Value(0))` here.
let z = x.clone();
// `z` is `FlatSet::Elem(Value(0))`.

// assuming the match doesn't modify `x`, `y`, and `z`, then the join when merging
// the arms shouldn't detect a modification
match _ {
    _ => { ... },
    _ => { ... },
    _ => { ... },
};
// `x` and `z` are still both `FlatSet::Elem(Value(0))`

// here the clone isn't needed: `x` and `y` are both `FlatSet::Elem(Value(0))`.
do_something(&z);

if _ {
    x = y;
    // `x` would be `FlatSet::Top` if `y` is not tracked.
}
// By joining, `x` is `FlatSet::Top`.

// here the clone is needed: `x` is `FlatSet::Top`, while `z` is `FlatSet::Elem(Value(0))`
do_something(&z);

Copy link
Contributor Author

@Jarcho Jarcho Oct 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That wouldn't handle cloning a value twice, nor would it handle a chain of clones. e.g.

let x = foo();
let y = if bar {
    x.clone()
} else {
    x.clone()
};
let z = y.clone();
// `x`, `y` and `z` should all be seen as the same value here.

Edit: The clone chain isn't a problem. If the source is already assigned a value, then that one can be used instead of giving a new one. This doesn't solve the problem with clones on different branches though.

None
#[derive(Default, Debug, Clone)]
struct RefTargetDomain {
targets: FxHashMap<Local, Option<HybridBitSet<Local>>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that this represents, for each local, the locals it may point to. Is that so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct.

};
state.targets.insert(dst, targets);
} else if let Rvalue::Use(Operand::Move(src))
| Rvalue::Ref(_, BorrowKind::Mut { .. }, src) = a.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you treat &mut as a move.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an error. It should be setting the value to None, not removing it.

Some(targets)
} else {
None
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about reborrows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not currently handled.

None
}
})
.collect::<FxHashMap<_, _>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this represents, for each block, "this local is a clone one of those locals".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping is block -> possible clone sources. The argument local doesn't strictly need to be there.

targets: FxHashMap<Local, Option<HybridBitSet<Local>>>,
}
impl JoinSemiLattice for RefTargetDomain {
fn join(&mut self, other: &Self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is the structure of this lattice. Each Local is independent, so we want a lattice structure for Option<Option<HybridBitSet>>.
Is it None <= Some(Some(lattice)) <= Some(None)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct. Basically uninit < local < maybe non-local

struct Value {
// Two values which were last modified at different locations may contain different values.
// Joining two such values needs to be treated as a modification.
last_modified: Option<Location>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would represent a value with a simple index. A first pass collects value roots. Those would be the value of a variable where it is borrowed before cloning.

The state would be something like IndexVec<Local, FlatSet<Value>>.

let mut x = if _ { foo() } else { bar() };
let y = if _ { foo2() } else { bar2() };

// `x` is borrowed here. `x` is `FlatSet::Elem(Value(0))` here.
let z = x.clone();
// `z` is `FlatSet::Elem(Value(0))`.

// assuming the match doesn't modify `x`, `y`, and `z`, then the join when merging
// the arms shouldn't detect a modification
match _ {
    _ => { ... },
    _ => { ... },
    _ => { ... },
};
// `x` and `z` are still both `FlatSet::Elem(Value(0))`

// here the clone isn't needed: `x` and `y` are both `FlatSet::Elem(Value(0))`.
do_something(&z);

if _ {
    x = y;
    // `x` would be `FlatSet::Top` if `y` is not tracked.
}
// By joining, `x` is `FlatSet::Top`.

// here the clone is needed: `x` is `FlatSet::Top`, while `z` is `FlatSet::Elem(Value(0))`
do_something(&z);

@bors
Copy link
Contributor

bors commented Oct 21, 2023

☔ The latest upstream changes (presumably #11694) made this pull request unmergeable. Please resolve the merge conflicts.

@xFrednet
Copy link
Member

xFrednet commented Mar 29, 2024

Hey, this is a ping from triage, since there hasn't been any activity in some time. Are you still planning to continue this implementation?

If you have any questions, you're always welcome to ask them in this PR or on Zulip.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels Mar 29, 2024
@xFrednet
Copy link
Member

Hey this is triage, I'm closing this due to inactivity. If you want to continue this implementation, you're welcome to create a new PR. Thank you for the time, you already put into this!

Interested parties are welcome to pick this implementation up as well :)

@rustbot label +S-inactive-closed -S-waiting-on-author -S-waiting-on-review

@xFrednet xFrednet closed this Jun 20, 2024
@rustbot rustbot added S-inactive-closed Status: Closed due to inactivity and removed S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) labels Jun 20, 2024
@apiraino
Copy link

(t-compiler triage 👋 )

I just noticed this PR was closed for inactivity. IIUC this was part of the work on rust#114900 (which admittedly was opened long ago). I'll try to get eyes on that PR to clarify its status and - by extension - this one, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment