Add dataflow analysis of enum variants #96991

Closed
wants to merge 2 commits into from
25 changes: 25 additions & 0 deletions compiler/rustc_mir_dataflow/src/framework/fmt.rs
@@ -1,6 +1,7 @@
//! Custom formatting traits used when outputting Graphviz diagrams with the results of a dataflow
//! analysis.

use crate::lattice::{FactArray, FactCache};
use rustc_index::bit_set::{BitSet, ChunkedBitSet, HybridBitSet};
use rustc_index::vec::Idx;
use std::fmt;
@@ -124,6 +125,30 @@ where
    }
}

impl<T: Clone + Eq + fmt::Debug, C, const N: usize> DebugWithContext<C> for FactArray<T, N>
where
    T: DebugWithContext<C>,
{
    fn fmt_with(&self, ctxt: &C, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_map()
            .entries(
                self.arr
                    .iter()
                    .enumerate()
                    .map(|(i, ref v)| (i, DebugWithAdapter { this: *v, ctxt })),
            )
            .finish()
    }
}

impl<I: Eq + fmt::Debug, L: Eq + fmt::Debug, F: Eq + fmt::Debug, C, const N: usize>
    DebugWithContext<C> for FactCache<I, L, F, N>
{
    fn fmt_with(&self, _: &C, _: &mut fmt::Formatter<'_>) -> fmt::Result {
        todo!();
    }
}

fn fmt_diff<T, C>(
    inserted: &HybridBitSet<T>,
    removed: &HybridBitSet<T>,
193 changes: 193 additions & 0 deletions compiler/rustc_mir_dataflow/src/framework/lattice.rs
@@ -250,3 +250,196 @@ impl<T: Clone + Eq> MeetSemiLattice for FlatSet<T> {
        true
    }
}

macro_rules! packed_int_join_semi_lattice {
    ($name: ident, $base: ty) => {
        #[derive(Debug, PartialEq, Eq, Copy, Clone, PartialOrd, Ord)]
        pub struct $name($base);
        impl $name {
            pub const TOP: Self = Self(<$base>::MAX);
            #[inline]
            pub const fn new(v: $base) -> Self {
                Self(v)
            }

            /// `saturating_new` will convert an arbitrary value (e.g. u32) into a Fact which
            /// may have a smaller internal representation (e.g. u8). If the value is too large,
            /// it will be converted to `TOP`, which is safe because `TOP` is the most
            /// conservative estimate, assuming no information. Note, it is _not_ safe to
            /// assume `BOT`, since this assumes information about the value.
            #[inline]
            pub fn saturating_new(v: impl TryInto<$base>) -> Self {
                v.try_into().map(|v| Self(v)).unwrap_or(Self::TOP)
            }

            pub const fn inner(self) -> $base {
                self.0
            }
        }

        impl JoinSemiLattice for $name {
            #[inline]
            fn join(&mut self, other: &Self) -> bool {
                match (*self, *other) {
                    (Self::TOP, _) => false,
                    (a, b) if a == b => false,
                    _ => {
                        *self = Self::TOP;
                        true
                    }
                }
            }
        }

        impl<C> crate::fmt::DebugWithContext<C> for $name {
            fn fmt_with(&self, _: &C, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
                if *self == Self::TOP { write!(f, "TOP") } else { write!(f, "{}", self.inner()) }
            }
        }
    };
}

packed_int_join_semi_lattice!(PackedU8JoinSemiLattice, u8);

#[derive(Eq, PartialEq, Copy, Clone, Debug)]
pub struct FactArray<T, const N: usize> {
    // FIXME(julianknodt): maybe map Idxs to each N element?
    pub arr: [T; N],
}

impl<T, const N: usize> FactArray<T, N> {
    #[inline]
    pub fn insert(&mut self, i: impl Idx, fact: T) {
        let Some(v) = self.arr.get_mut(i.index()) else { return };
        *v = fact;
    }
    #[inline]
    pub fn get(&self, i: &impl Idx) -> Option<&T> {
        self.arr.get(i.index())
    }
}

impl<T: JoinSemiLattice, const N: usize> JoinSemiLattice for FactArray<T, N> {
    fn join(&mut self, other: &Self) -> bool {
        let mut changed = false;
        for (a, b) in self.arr.iter_mut().zip(other.arr.iter()) {
            changed |= a.join(b);
        }
        changed
    }
}

impl<T: MeetSemiLattice, const N: usize> MeetSemiLattice for FactArray<T, N> {
    fn meet(&mut self, other: &Self) -> bool {
        let mut changed = false;
        for (a, b) in self.arr.iter_mut().zip(other.arr.iter()) {
            changed |= a.meet(b);
        }
        changed
    }
}

/// FactCache is a struct that contains `N` recent facts (of type F) from dataflow analysis,
/// where a fact is information about some component of a program, such as the possible values a
/// variable can take. Variables are indexed by `I: Idx` (e.g. mir::Local), and `L` represents
/// location/recency, so that when merging two fact caches, the more recent information takes
/// precedence.
/// This representation is used because it takes constant memory, and assumes that recent facts
/// will have temporal locality (i.e. will be used close to where they are generated). Thus, it
/// is more conservative than a complete analysis, but should be fast.
#[derive(Eq, PartialEq, Copy, Clone, Debug)]
pub struct FactCache<I, L, F, const N: usize> {
    facts: [F; N],
    ord: [(I, L); N],
    len: usize,
}
JulianKnodt marked this conversation as resolved.
Comment on lines +351 to +355
Contributor

Thanks for the documentation update. What is the partial order under which this is a lattice? If this is a well-known result from theory (I'm not terribly familiar), feel free to just provide a link to a reference.

JulianKnodt (Contributor Author), Jul 16, 2022

There's no theory; it's just that each fact gets inserted at a specific location (body + stmt idx), so I attempt to keep later facts rather than earlier ones, like an LRU cache.

What I mean is that we could represent the entire set of facts using a HashMap<Idx, Value>, but this will grow linearly with the size of the program (or at least whatever is being analyzed). Instead, to use constant space, we keep a set of facts that evicts older ones, which we assume are no longer relevant, and replaces them with the most recent updates. This is a strict subset of the HashMap<Idx, Value>, so it should also be a lattice.
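
To make the space trade-off concrete, here is a minimal sketch of a fixed-capacity store that overwrites the entry with the smallest location once it is full. It is not the PR's `FactCache`: the name `BoundedFacts`, the `usize` local indices, the `u32` locations and the `u8` facts are all placeholders assumed for illustration.

```rust
/// Hypothetical fixed-capacity fact store; names and types are illustrative only.
#[derive(Debug)]
struct BoundedFacts<const N: usize> {
    // Each occupied slot holds (local index, location, fact).
    slots: [Option<(usize, u32, u8)>; N],
}

impl<const N: usize> BoundedFacts<N> {
    fn new() -> Self {
        Self { slots: [None; N] }
    }

    /// Insert or update the fact for `local`. When every slot is occupied,
    /// the entry with the smallest location is overwritten.
    fn insert(&mut self, local: usize, loc: u32, fact: u8) {
        let pos = self
            .slots
            .iter()
            // 1. Reuse the slot that already tracks this local, if any.
            .position(|s| matches!(s, Some((l, _, _)) if *l == local))
            // 2. Otherwise take the first free slot.
            .or_else(|| self.slots.iter().position(|s| s.is_none()))
            // 3. Otherwise evict the entry with the oldest location.
            .or_else(|| {
                self.slots
                    .iter()
                    .enumerate()
                    .min_by_key(|(_, s)| s.as_ref().map(|(_, l, _)| *l))
                    .map(|(i, _)| i)
            });
        if let Some(i) = pos {
            self.slots[i] = Some((local, loc, fact));
        }
    }

    /// Look up the fact recorded for `local`, if it is still cached.
    fn get(&self, local: usize) -> Option<u8> {
        self.slots.iter().flatten().find(|(l, _, _)| *l == local).map(|(_, _, f)| *f)
    }
}
```

With `N = 2`, inserting facts for locals 0, 1 and 2 at increasing locations leaves only locals 1 and 2 in the store. Memory use stays constant no matter how many locals the analyzed body has, at the cost of forgetting older facts.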

This comment was marked as outdated.

JakobDegen (Contributor), Jul 16, 2022

Ok, yeah, sorry, what I said before was silly. Yes, the subset partial order is an option, but it's not clear to me that that results in a lattice. Under the subset partial order the least upper bound of two states each storing N facts is a state that would have to store 2*N facts, which you can't represent.

I should start thinking before I say things. Yeah, ok, this does seem like a lattice - let me go and re-read some stuff then

Contributor

Yeah, ok I thought about this a bunch more and I think it's ok. We can imagine this fact cache as implicitly storing

enum LocalResult {
    Top,
    Fact(F, L),
    Bot(L), 
}

for each I. Locals which do not have an entry in the fact cache are implicitly Top. The partial order within each LocalResult is then Bot < Fact(F, min) < ... < Fact(F, max) < Top for each F, and the partial order for the fact cache as a whole is then the product partial order. This does indeed form a lattice. This should be documented though.
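
The per-local ordering described above can be sketched as a join function. This is only an illustration of the comment, not code from the PR: facts are fixed to `u8` and locations to `u32` for brevity, and the handling of two `Bot` values (keeping the later location) is an assumption, since the comment does not say how `Bot(L)` values compare with each other.

```rust
/// Illustrative per-local state, following the enum in the review comment.
/// `u8` facts and `u32` locations are placeholder types for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LocalResult {
    Top,
    Fact(u8, u32),
    Bot(u32),
}

impl LocalResult {
    /// Least upper bound under `Bot < Fact(f, min) < ... < Fact(f, max) < Top`.
    /// Returns `true` if `self` changed.
    fn join(&mut self, other: &Self) -> bool {
        use LocalResult::*;
        let joined = match (*self, *other) {
            // `Top` absorbs everything.
            (Top, _) | (_, Top) => Top,
            // Assumption: two `Bot`s keep the later location.
            (Bot(a), Bot(b)) => Bot(a.max(b)),
            // `Bot` is below every fact, so the fact wins.
            (Bot(_), fact) | (fact, Bot(_)) => fact,
            // Same fact: keep it with the more recent location.
            (Fact(f, a), Fact(g, b)) if f == g => Fact(f, a.max(b)),
            // Conflicting facts have no upper bound other than `Top`.
            (Fact(..), Fact(..)) => Top,
        };
        let changed = joined != *self;
        *self = joined;
        changed
    }
}
```

The state of a whole cache is then the product of one such value per local, joined pointwise. A local with no entry in the cache is read as `Top`, which is why dropping an entry only ever moves the state upward.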


impl<I: Idx, L: Ord + Eq + Copy, F, const N: usize> FactCache<I, L, F, N> {
    pub fn new(empty_i: I, empty_l: L, empty_f: F) -> Self
    where
        F: Copy,
    {
        Self { facts: [empty_f; N], ord: [(empty_i, empty_l); N], len: 0 }
    }
    /// Inserts a fact into the cache, evicting the oldest one,
    /// or updating it if there is information on one already. If the new fact being
    /// inserted is older than the previous fact, it will not be inserted.
    pub fn insert(&mut self, i: I, l: L, fact: F) {
        let mut idx = None;
        for (j, (ci, _cl)) in self.ord[..self.len].iter_mut().enumerate() {
            if *ci == i {
                // if an older fact is inserted, still update the cache: i.e. cl <= l usually
                // but this is broken during apply switch int edge effects, because the engine
                // may choose an arbitrary order for basic blocks to apply it to.
                idx = Some(j);
                break;
            }
        }
        if idx.is_none() && self.len < N {
            let new_len = self.len + 1;
            idx = Some(std::mem::replace(&mut self.len, new_len));
        };
        if let Some(idx) = idx {
            self.facts[idx] = fact;
            self.ord[idx] = (i, l);
            return;
        };
        let (p, (_, old_l)) = self.ord.iter().enumerate().min_by_key(|k| k.1.1).unwrap();
        // FIXME(julianknodt) maybe don't make this an assert but just don't update?
        assert!(*old_l <= l);
        self.ord[p] = (i, l);
        self.facts[p] = fact;
    }
    pub fn get(&self, i: I) -> Option<(&L, &F)> {
        let (p, (_, loc)) =
            self.ord[..self.len].iter().enumerate().find(|(_, iloc)| iloc.0 == i)?;
        Some((loc, &self.facts[p]))
    }
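    pub fn remove(&mut self, i: I) -> bool {
        let Some(pos) = self.ord[..self.len].iter().position(|(ci, _)| *ci == i)
            else { return false };

        self.remove_idx(pos);
        return true;
    }
    #[inline]
    fn remove_idx(&mut self, i: usize) {
        assert!(i < self.len);
        // Shrink first so that `self.len` indexes the last live slot, then move that
        // slot into `i`. Swapping with the old `self.len` would read one past the live
        // entries and go out of bounds when the cache is full.
        self.len -= 1;
        self.ord.swap(i, self.len);
        self.facts.swap(i, self.len);
    }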

    fn drain_filter(&mut self, mut should_rm: impl FnMut(&I, &mut L, &mut F) -> bool) {
        let mut i = 0;
        while i < self.len {
            let (idx, l) = &mut self.ord[i];
            let f = &mut self.facts[i];
            if should_rm(idx, l, f) {
                self.remove_idx(i);
                continue;
            }
            i += 1;
        }
    }
}

impl<I: Idx, L: Ord + Eq + Copy, F: Eq, const N: usize> JoinSemiLattice for FactCache<I, L, F, N> {
    fn join(&mut self, other: &Self) -> bool {
        let mut changed = false;
        self.drain_filter(|i, l, f| {
            let Some((other_loc, other_fact)) = other.get(*i) else {
                changed = true;
                return true;
            };
            if other_fact == f {
                *l = (*l).max(*other_loc);
                return false;
            }
            changed = true;
            return true;
        });

        changed
    }
}
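
The `join` above keeps a fact only when both caches hold the same fact for the same index, and records the later of the two locations. A rough standalone sketch of that behaviour, using a `HashMap` in place of the fixed-size arrays, might look like the following. The `Facts` alias, the `join_facts` function and the concrete `usize`/`u32`/`u8` types are assumptions made for this example, not part of the PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the cache: local index -> (location, fact).
type Facts = HashMap<usize, (u32, u8)>;

/// Keep only the facts on which both sides agree, remembering the later location.
/// Returns `true` if any entry was dropped from `this`.
fn join_facts(this: &mut Facts, other: &Facts) -> bool {
    let before = this.len();
    this.retain(|local, (loc, fact)| match other.get(local) {
        // Both sides agree on the fact: keep it with the more recent location.
        Some((other_loc, other_fact)) if *other_fact == *fact => {
            *loc = (*loc).max(*other_loc);
            true
        }
        // Missing on the other side, or the facts disagree: forget the entry.
        _ => false,
    });
    this.len() != before
}
```

As in the diff, a changed location alone does not count as a change; only a dropped entry does. Joining the result with the same `other` a second time therefore reports no change.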
2 changes: 2 additions & 0 deletions compiler/rustc_mir_dataflow/src/impls/mod.rs
@@ -21,13 +21,15 @@ use crate::{lattice, AnalysisDomain, GenKill, GenKillAnalysis};
mod borrowed_locals;
mod init_locals;
mod liveness;
mod single_enum_variant;
mod storage_liveness;

pub use self::borrowed_locals::borrowed_locals;
pub use self::borrowed_locals::MaybeBorrowedLocals;
pub use self::init_locals::MaybeInitializedLocals;
pub use self::liveness::MaybeLiveLocals;
pub use self::liveness::MaybeTransitiveLiveLocals;
pub use self::single_enum_variant::SingleEnumVariant;
pub use self::storage_liveness::{MaybeRequiresStorage, MaybeStorageLive};

/// `MaybeInitializedPlaces` tracks all places that might be