-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge_batcher: use Rust iterators and VecDeque
in place of VecQueue
#380
merge_batcher: use Rust iterators and VecDeque
in place of VecQueue
#380
Conversation
The standard library offers `VecDeque` which also has cheap (i.e non allocating) conversions to and from `Vec`s so we can use that directly and reduce the number of unsafe calls in the project. Note that while all uses of the API of `VecQueue` were correct its methods were marked as safe but you could accidentally cause undefined behavior: ```rust let queue = VecQueue::new(); queue.pop(); // UB, length isn't checked ``` Signed-off-by: Petros Angelatos <[email protected]>
99bfc15
to
032a4e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I'm inclined to follow it with some work to remove push_unchecked
. It seems entirely plausible that push_unchecked
isn't needed at the moment. I suspect it existed to support alternate inner loops that don't recheck result.capacity()
v result.len()
each iteration.
…imelyDataflow#380) The standard library offers `VecDeque` which also has cheap (i.e non allocating) conversions to and from `Vec`s so we can use that directly and reduce the number of unsafe calls in the project. Note that while all uses of the API of `VecQueue` were correct its methods were marked as safe but you could accidentally cause undefined behavior: ```rust let queue = VecQueue::new(); queue.pop(); // UB, length isn't checked ``` Signed-off-by: Petros Angelatos <[email protected]>
#[inline] | ||
pub fn peek(&self) -> &T { | ||
debug_assert!(self.head < self.tail); | ||
unsafe { self.list.get_unchecked(self.head) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This original code is UB since it is using get_unchecked
to index out of bounds of the vector's contents. The vector this is indexing into has length 0 due to the set_len
below.
Since rust-lang/rust#116915 (Rust 1.76+, nightly-2023-12-05+), this UB is going to cause pretty much anything calling this code in differential-dataflow 0.12.0 to miscompile. I found this by way of a miscompile of the cargo-tally crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the subtext here that it would help to ship 0.13.0? I think 0.12.0 is multiple years old at this point, and perhaps it's worth a new, utterly incompatible version that has less unsafe
in it? (I/we mostly use the repo head rather than anything crates.io; you might be one of the few who do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not blocked on an official 0.13.0 being cut since I grabbed the code from master yesterday and published it to https://crates.io/crates/differential-dataflow-master myself. Cargo-tally can use it like this: dtolnay/cargo-tally@24a1946. For my purposes, this is completely fine — it lets you folks keep developing against head and lets me use snapshots of head in my crate whenever I like.
If you do 0.13.0 it would mostly be for the benefit of anyone else using differential-dataflow from crates.io. Right now 0.12.0 is effectively unusable, I think, and that was worth calling your attention to (also because the PR description says "all uses of the API were correct" and did not identify that this code is really problematic). But it does not need to translate into urgency to publish 0.13.0.
I had originally looked into whether backporting this single fix onto 0.12.0 as a 0.12.1 release would fix the miscompiles. I first confirmed that the commit in which this PR merged (6ae61ad) does not have the miscompile whereas its parent commit (cc88469) still does, so this is definitely the relevant fix. I backported it onto 7bc5338 (the v0.12.0 tag) as dtolnay-contrib@46edd1f. However that still miscompiled. There must be some other invalid unsafe code elsewhere that got removed earlier than this PR — you mentioned overall there is less unsafe
now compared to 3 years ago. I figured finding all the rest of what would need to be backported was not worthwhile so this is when I switched to the approach of publishing the code from master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We merged #413 recently, which also removes some unsafe code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also because the PR description says "all uses of the API were correct" and did not identify that this code is really problematic
Erm, it seems less problematic to be totally honest; certainly at the time. Afaict it matches the PPYP pattern, and the main thing that has changed is what Rust views as UB (to be fair, afaiu little enough is defined in the first place). The Rust 1.76 change certainly seems to mess up the PPYP pattern; is there something else that Rust folks recommend instead now?
More generally, any recommendations for where to follow and perhaps litigate that certain "optimizations" around unsafe
are potentially harmful? It's a recurring anxiety for anyone who wants solid performance, that Rust might just eventually deem any use of unsafe
UB and misoptimize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There must be some other invalid unsafe code elsewhere that got removed earlier than this PR
Yeah, pretty much all use of unsafe
is invalid, so this checks out. :D I think there are two uses of unsafe
at the moment to work around LLVM not optimizing split_at_mut
very well (in consolidation.rs
). Everything else has been removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'm here because I was sifting through backlinks)
the main thing that has changed is what Rust views as UB
Out-of-bounds indexing via get_unchecked
has been documented to be UB since Rust 1.38.0, which was published in late November 2019. Not that I really expect anyone to be aware that documentation has been updated, but there was no recent change here.
The UB here has also been reliably detected by Miri for a long time, and if it is not possible to run Miri, the standard library has robust debug assertions that you can exercise today with cargo-careful or with -Zbuild-std
. Some of those checks should soon be on by default without such tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads up! Unfortunately the referenced code predates 1.38, back when the method had no documented UB. =/
Fwiw, we've run Miri a fair amount, in particular to find a miscompilation induced by another crate, and it did not detect anything. I don't know if that is surprising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it did not detect anything. I don't know if that is surprising.
Unfortunately, it's not. In the past, out-of-bounds slice::get_unchecked
was detected almost by accident by Stacked Borrows, so if you pass -Zmiri-disable-stacked-borrows
or -Zmiri-tree-borrows
, as I suspect some number of codebases need to in order to execute much of anything, it's wasn't checked at all. Miri understands the optimization hint that landed in 1.76 though. So this isn't a problem anymore.
The other problem is just the nature of dynamic checkers. If well-formed/non-malicious inputs never cause indexing out of bounds... well you'll never get a report from Miri. I've been meaning to write a helper to run Miri on a minimized fuzzing corpus to assist with this scenario. Still not a soundness proof though.
The standard library offers
VecDeque
which also has cheap (i.e non allocating) conversions to and fromVec
s so we can use that directly and reduce the number of unsafe calls in the project.Note that while all uses of the API of
VecQueue
were correct its methods were marked as safe but you could accidentally cause undefined behavior: