Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apply standard rustfmt style #223

Closed
wants to merge 3 commits into from
Closed

apply standard rustfmt style #223

wants to merge 3 commits into from

Conversation

robjtede
Copy link

@robjtede robjtede commented Jul 24, 2022

Motivation

Seems like a few of your personal preferences like single-line methods could be accomplished by requiring nightly rustfmt but I suspect you'd see lots of those get reverted in PRs by folks who automatically apply stable rustfmt in their editor.

@robjtede
Copy link
Author

robjtede commented Jul 24, 2022

The diff for #222 is about a quarter the size after the format:

~600 lines vs ~2.4k lines

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 65b6c53..ccbf423 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -51,6 +51,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
-      - name: Miri
-        run: ci/miri.sh
-
+      - name: Install Miri
+        run: |
+          rustup toolchain install nightly --component miri
+          rustup override set nightly
+          cargo miri setup
+      - name: Test with Miri
+        run: cargo miri test
diff --git a/ci/miri.sh b/ci/miri.sh
deleted file mode 100755
index 272995c..0000000
--- a/ci/miri.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/sh
-
-set -ex
-
-export CARGO_NET_RETRY=5
-export CARGO_NET_TIMEOUT=10
-
-MIRI_NIGHTLY=nightly-$(curl -s https://rust-lang.github.io/rustup-components-history/x86_64-unknown-linux-gnu/miri)
-echo "Installing latest nightly with Miri: $MIRI_NIGHTLY"
-rustup default "$MIRI_NIGHTLY"
-
-rustup component add miri
-cargo miri setup
-
-cargo miri test
diff --git a/src/array_string.rs b/src/array_string.rs
index f580d77..393a037 100644
--- a/src/array_string.rs
+++ b/src/array_string.rs
@@ -380,15 +380,16 @@ impl<const CAP: usize> ArrayString<CAP> {
             None => panic!("cannot remove a char from the end of a string"),
         };
 
-        let next = idx + ch.len_utf8();
         let len = self.len();
+        let removed_len = ch.len_utf8();
+        let next = idx + removed_len;
+        let tail_len = len - next;
         unsafe {
-            ptr::copy(
-                self.as_ptr().add(next),
-                self.as_mut_ptr().add(idx),
-                len - next,
-            );
-            self.set_len(len - (next - idx));
+            // SAFETY: `idx` is in bounds because we just checked that.
+            // `next` is in bounds because we cannot contain character partially.
+            let p = self.as_mut_ptr();
+            ptr::copy(p.add(next), p.add(idx), tail_len);
+            self.set_len(len - removed_len);
         }
         ch
     }
@@ -526,37 +527,58 @@ impl<const CAP: usize> Clone for ArrayString<CAP> {
     }
 }
 
 impl<const CAP: usize> Ord for ArrayString<CAP> {
diff --git a/src/arrayvec.rs b/src/arrayvec.rs
index 9dde79b..1fdab0b 100644
--- a/src/arrayvec.rs
+++ b/src/arrayvec.rs
@@ -1,3 +1,5 @@
+use core::convert::TryInto;
+use core::ptr::NonNull;
 use std::cmp;
 use std::iter;
 use std::mem;
@@ -52,12 +54,17 @@ impl<T, const CAP: usize> Drop for ArrayVec<T, CAP> {
     }
 }
 
 impl<T, const CAP: usize> ArrayVec<T, CAP> {
@@ -487,13 +494,12 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
 
         impl<T, const CAP: usize> Drop for BackshiftOnDrop<'_, T, CAP> {
             fn drop(&mut self) {
-                if self.deleted_cnt > 0 {
+                if self.deleted_cnt > 0 && self.original_len != self.processed_len {
                     unsafe {
+                        let p = self.v.as_mut_ptr();
                         ptr::copy(
-                            self.v.as_ptr().add(self.processed_len),
-                            self.v
-                                .as_mut_ptr()
-                                .add(self.processed_len - self.deleted_cnt),
+                            p.add(self.processed_len),
+                            p.add(self.processed_len - self.deleted_cnt),
                             self.original_len - self.processed_len,
                         );
                     }
@@ -511,39 +517,50 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
             original_len,
         };
 
-        #[inline(always)]
-        fn process_one<F: FnMut(&mut T) -> bool, T, const CAP: usize, const DELETED: bool>(
+        fn process_loop<F, T, const DELETED: bool, const CAP: usize>(
+            original_len: usize,
             f: &mut F,
             g: &mut BackshiftOnDrop<'_, T, CAP>,
-        ) -> bool {
-            let cur = unsafe { g.v.as_mut_ptr().add(g.processed_len) };
-            if !f(unsafe { &mut *cur }) {
-                g.processed_len += 1;
-                g.deleted_cnt += 1;
-                unsafe { ptr::drop_in_place(cur) };
-                return false;
-            }
-            if DELETED {
-                unsafe {
-                    let hole_slot = g.v.as_mut_ptr().add(g.processed_len - g.deleted_cnt);
-                    ptr::copy_nonoverlapping(cur, hole_slot, 1);
+        ) where
+            F: FnMut(&mut T) -> bool,
+        {
+            while g.processed_len != original_len {
+                // We differ from `std::vec::Vec` here because
+                // we need to borrow whole slice in `as_mut_ptr` call
+                // which violates Stacked Borrows if we already used `cur`.
+                let slice_ptr = g.v.as_mut_ptr();
+                // SAFETY: Unchecked element must be valid.
+                let cur = unsafe { &mut *slice_ptr.add(g.processed_len) };
+                if !f(cur) {
+                    // Advance early to avoid double drop if `drop_in_place` panicked.
+                    g.processed_len += 1;
+                    g.deleted_cnt += 1;
+                    // SAFETY: We never touch this element again after dropped.
+                    unsafe { ptr::drop_in_place(cur) };
+                    // We already advanced the counter.
+                    if DELETED {
+                        continue;
+                    } else {
+                        break;
+                    }
+                }
+                if DELETED {
+                    // SAFETY: `deleted_cnt` > 0, so the hole slot must not overlap with current element.
+                    // We use copy for move, and never touch this element again.
+                    unsafe {
+                        let hole_slot = slice_ptr.add(g.processed_len - g.deleted_cnt);
+                        ptr::copy_nonoverlapping(cur, hole_slot, 1);
+                    }
                 }
+                g.processed_len += 1;
             }
-            g.processed_len += 1;
-            true
         }
 
         // Stage 1: Nothing was deleted.
-        while g.processed_len != original_len {
-            if !process_one::<F, T, CAP, false>(&mut f, &mut g) {
-                break;
-            }
-        }
+        process_loop::<F, T, false, CAP>(original_len, &mut f, &mut g);
 
         // Stage 2: Some elements were deleted.
-        while g.processed_len != original_len {
-            process_one::<F, T, CAP, true>(&mut f, &mut g);
-        }
+        process_loop::<F, T, true, CAP>(original_len, &mut f, &mut g);
 
         drop(g);
     }
@@ -634,10 +651,14 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
         let start = match range.start_bound() {
             Bound::Unbounded => 0,
             Bound::Included(&i) => i,
+            // You cannot have array bigger than usize anyway
+            // so saturating add wouldn't break anything here.
             Bound::Excluded(&i) => i.saturating_add(1),
         };
         let end = match range.end_bound() {
             Bound::Excluded(&j) => j,
+            // You cannot have array bigger than usize anyway
+            // so saturating add wouldn't break anything here.
             Bound::Included(&j) => j.saturating_add(1),
             Bound::Unbounded => len,
         };
@@ -645,21 +666,35 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
     }
 
     fn drain_range(&mut self, start: usize, end: usize) -> Drain<T, CAP> {
-        let len = self.len();
-
         // bounds check happens here (before length is changed!)
-        let range_slice: *const _ = &self[start..end];
+        let _ = &self[start..end];
 
+        // Eagerly reduce len to prevent double frees in case of `drop::<T>` panics.
         // Calling `set_len` creates a fresh and thus unique mutable references, making all
         // older aliases we created invalid. So we cannot call that function.
+        let old_len: usize = self.len.try_into().unwrap();
         self.len = start as LenUint;
 
+        let start_ptr: *mut T = self.xs.as_mut_ptr().cast();
+
+        let tail_len = (old_len - end) as LenUint;
+        self.len = start.try_into().unwrap();
         unsafe {
+            // SAFETY: length is valid because we made bounds check earlier.
+            let to_yield = end.saturating_sub(start);
+            let taken_start = start_ptr.add(start);
+            let tail_start = taken_start.add(to_yield);
+
+            let taken_start = NonNull::new(taken_start).unwrap();
+            let tail_start = NonNull::new(tail_start).unwrap();
+
             Drain {
-                tail_start: end,
-                tail_len: len - end,
-                iter: (*range_slice).iter(),
-                vec: self as *mut _,
+                vec_len: &mut self.len,
+                taken_start: taken_start,
+                to_yield: to_yield as LenUint,
+                current_ptr: taken_start,
+                tail_start: tail_start,
+                tail_len,
             }
         }
     }
@@ -941,15 +976,31 @@ where
     }
 }
 
+// Note: We cannot implement this same way as `std::vec::Drain`
+// because keeping pointer to vec would violate Stacked Borrows
+// because this pointer alias with pointers to our inner slice
+// which get invalidated.
+// `std::vec::Vec` doesn't have same problem because its buffer
+// and object itself doesn't overlap.
+
 /// A draining iterator for `ArrayVec`.
 pub struct Drain<'a, T: 'a, const CAP: usize> {
-    /// Index of tail to preserve
-    tail_start: usize,
+    /// Reference to `len` field of vec
+    vec_len: &'a mut LenUint,
+
+    /// Pointer to position which we started to drain.
+    taken_start: ptr::NonNull<T>,
+    /// How much items we need yield.
+    /// Use integer instead of pointer to track this
+    /// because it simplifies working with ZSTs.
+    to_yield: LenUint,
+    /// Points to next item to yield.
+    current_ptr: ptr::NonNull<T>,
+
+    /// Points to item right after drained range.
+    tail_start: ptr::NonNull<T>,
     /// Length of tail
-    tail_len: usize,
-    /// Current remaining range to remove
-    iter: slice::Iter<'a, T>,
-    vec: *mut ArrayVec<T, CAP>,
+    tail_len: LenUint,
 }
 
 unsafe impl<'a, T: Sync, const CAP: usize> Sync for Drain<'a, T, CAP> {}
@@ -959,21 +1010,57 @@ impl<'a, T: 'a, const CAP: usize> Iterator for Drain<'a, T, CAP> {
     type Item = T;
 
     fn next(&mut self) -> Option<Self::Item> {
-        self.iter
-            .next()
-            .map(|elt| unsafe { ptr::read(elt as *const _) })
+        if self.to_yield == 0 {
+            None
+        } else {
+            let current = self.current_ptr.as_ptr();
+            self.to_yield -= 1;
+            self.current_ptr = NonNull::new(
+                // SAFETY: we just checked that we are in range.
+                // We just shortened range.
+                unsafe { current.add(1) },
+            )
+            // Note: rustc optimizes check here even with `-Copt-level=1`
+            // https://godbolt.org/z/br6eevnbx
+            .unwrap();
+
+            Some(
+                // SAFETY: we just checked that we are in range.
+                // Range must be valid by construction.
+                // We visit every item in drained range exactly once
+                // so they are read exactly once.
+                // Alignment is valid because it points to item in slice.
+                unsafe { current.read() },
+            )
+        }
     }
 
     fn size_hint(&self) -> (usize, Option<usize>) {
-        self.iter.size_hint()
+        let remaining: usize = self.to_yield.try_into().unwrap();
+        (remaining, Some(remaining))
     }
 }
 
 impl<'a, T: 'a, const CAP: usize> DoubleEndedIterator for Drain<'a, T, CAP> {
     fn next_back(&mut self) -> Option<Self::Item> {
-        self.iter
-            .next_back()
-            .map(|elt| unsafe { ptr::read(elt as *const _) })
+        if self.to_yield == 0 {
+            None
+        } else {
+            let to_yield: usize = self.to_yield.try_into().unwrap();
+            self.to_yield -= 1;
+
+            // SAFETY: We just checked our range.
+            // We just shortened range.
+            let src = unsafe { self.current_ptr.as_ptr().add(to_yield - 1) };
+            Some(
+                // SAFETY: we just checked that we are in range.
+                // Range must be valid by construction.
+                // We visit every item in drained range exactly once
+                // so they are read exactly once.
+                // Alignment is valid because it points to item in slice.
+                unsafe { src.read() },
+            )
+        }
     }
 }
 
@@ -983,19 +1070,30 @@ impl<'a, T: 'a, const CAP: usize> Drop for Drain<'a, T, CAP> {
     fn drop(&mut self) {
         // len is currently 0 so panicking while dropping will not cause a double drop.
 
-        // exhaust self first
-        while let Some(_) = self.next() {}
+        // Drop inner values first.
+        // Use slice `drop_in_place` because it is faster than iteration over self.
+        unsafe {
+            let remaining_start = self.current_ptr.as_ptr();
+            let remaining_len: usize = self.to_yield.try_into().unwrap();
+            self.to_yield = 0;
+            // SAFETY: It is safe because we dropping only unyielded items
+            // which must be initialized by `Drain` invariant.
+            ptr::drop_in_place(core::slice::from_raw_parts_mut(
+                remaining_start,
+                remaining_len,
+            ));
+        }
 
         if self.tail_len > 0 {
             unsafe {
-                let source_vec = &mut *self.vec;
+                let dst = self.taken_start.as_ptr();
+                let src = self.tail_start.as_ptr() as *const _;
+                let tail_len: usize = self.tail_len.try_into().unwrap();
                 // memmove back untouched tail, update to new length
-                let start = source_vec.len();
-                let tail = self.tail_start;
-                let src = source_vec.as_ptr().add(tail);
-                let dst = source_vec.as_mut_ptr().add(start);
-                ptr::copy(src, dst, self.tail_len);
-                source_vec.set_len(start + self.tail_len);
+                ptr::copy(src, dst, tail_len);
+                // Set len of vec.
+                // Safe because our tail contains exactly `tail_len` living items.
+                *self.vec_len += self.tail_len;
             }
         }
     }
@@ -1050,8 +1148,8 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
     {
         let take = self.capacity() - self.len();
         let len = self.len();
-        let mut ptr = raw_ptr_add(self.as_mut_ptr(), len);
-        let end_ptr = raw_ptr_add(ptr, take);
+        let mut ptr = self.as_mut_ptr().add(len);
+
         // Keep the length in a separate variable, write it back on scope
         // exit. To help the compiler with alias analysis and stuff.
         // We update the length to handle panic in the iteration of the
@@ -1063,20 +1161,22 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
                 **self_len = len as LenUint;
             },
         };
-        let mut iter = iterable.into_iter();
-        loop {
-            if let Some(elt) = iter.next() {
-                if ptr == end_ptr && CHECK {
-                    extend_panic();
-                }
-                debug_assert_ne!(ptr, end_ptr);
-                ptr.write(elt);
-                ptr = raw_ptr_add(ptr, 1);
-                guard.data += 1;
-            } else {
-                return; // success
+        let iter = iterable.into_iter();
+        let mut current_offset = 0;
+        // Use `for_each` because it is more optimal for some iterators
+        // than calling `Iterator::next` in a loop.
+        // For example, `Chain`s extra branch is `next()` isn't optimized away.
+        iter.for_each(|elt| {
+            if CHECK && current_offset == take {
+                extend_panic();
             }
-        }
+            debug_assert_ne!(current_offset, take);
+            current_offset += 1;
+
+            ptr.write(elt);
+            ptr = ptr.add(1);
+            guard.data += 1;
+        });
     }
 
     /// Extend the ArrayVec with clones of elements from the slice;
@@ -1098,16 +1198,6 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
     }
 }
 
-/// Rawptr add but uses arithmetic distance for ZST
-unsafe fn raw_ptr_add<T>(ptr: *mut T, offset: usize) -> *mut T {
-    if mem::size_of::<T>() == 0 {
-        // Special case for ZST
-        (ptr as usize).wrapping_add(offset) as _
-    } else {
-        ptr.add(offset)
-    }
-}
-
 /// Create an `ArrayVec` from an iterator.
 ///
 /// ***Panics*** if the number of elements in the iterator exceeds the arrayvec's capacity.
diff --git a/tests/tests.rs b/tests/tests.rs
index 9b51cd9..2aeba14 100644
--- a/tests/tests.rs
+++ b/tests/tests.rs
@@ -8,6 +8,7 @@ use arrayvec::CapacityError;
 use std::mem;
 
 use std::collections::HashMap;
+use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};
 
 #[test]
 fn test_simple() {
@@ -368,6 +369,31 @@ fn test_drain_range_inclusive_oob() {
     v.drain(0..=0);
 }
 
+#[test]
+fn test_drain_panic_in_the_middle() {
+    static DROP_COUNTER: AtomicUsize = AtomicUsize::new(0);
+    DROP_COUNTER.store(0, AtomicOrdering::Relaxed);
+
+    struct CountOnDrop;
+
+    impl Drop for CountOnDrop {
+        fn drop(&mut self) {
+            DROP_COUNTER.fetch_add(1, AtomicOrdering::Relaxed);
+        }
+    }
+
+    let mut v = ArrayVec::from([(); 6].map(|_| CountOnDrop));
+
+    std::panic::catch_unwind(move || {
+        let mut d = v.drain(1..4);
+        d.next();
+        panic!("We want to!");
+    })
+    .expect_err("We explicitly panic");
+    // No double drops and no leaks.
+    assert_eq!(DROP_COUNTER.load(AtomicOrdering::Relaxed), 6);
+}
+
 #[test]
 fn test_retain() {
     let mut v = ArrayVec::from([0; 8]);
@@ -385,6 +411,78 @@ fn test_retain() {
     assert_eq!(&v[..], &[]);
 }
 
+#[test]
+fn test_retain_on_panics() {
+    static DROP_COUNTER: AtomicUsize = AtomicUsize::new(0);
+    DROP_COUNTER.store(0, AtomicOrdering::Relaxed);
+
+    struct CountOnDrop;
+
+    impl Drop for CountOnDrop {
+        fn drop(&mut self) {
+            DROP_COUNTER.fetch_add(1, AtomicOrdering::Relaxed);
+        }
+    }
+
+    let max_i = AtomicUsize::new(0);
+    let mut v = ArrayVec::from([(); 20].map(|_| CountOnDrop));
+    // Check what happens if predicate panics.
+    std::panic::catch_unwind({
+        let max_i = &max_i;
+        move || {
+            let mut i = 0;
+            v.retain(|_| {
+                i += 1;
+                max_i.store(i, AtomicOrdering::Relaxed);
+                if i == 10 {
+                    panic!("We want to!");
+                }
+                i & 1 == 0
+            });
+        }
+    })
+    .expect_err("We explicitly panic");
+
+    assert_eq!(max_i.load(AtomicOrdering::Relaxed), 10);
+    // No leaks and no double frees.
+    assert_eq!(DROP_COUNTER.load(AtomicOrdering::Relaxed), 20);
+
+    DROP_COUNTER.store(0, AtomicOrdering::Relaxed);
+
+    struct PanicOnDrop(usize);
+
+    impl Drop for PanicOnDrop {
+        fn drop(&mut self) {
+            if self.0 == 10 {
+                panic!("We want to!");
+            }
+        }
+    }
+
+    let max_i = AtomicUsize::new(0);
+    let mut i = 0;
+    let mut v = ArrayVec::from([(); 20].map(|_| {
+        let j = i;
+        i += 1;
+        (CountOnDrop, PanicOnDrop(j))
+    }));
+    // Check what happens if drop panics.
+    std::panic::catch_unwind({
+        let max_i = &max_i;
+        move || {
+            v.retain(|v| {
+                max_i.store(v.1 .0, AtomicOrdering::Relaxed);
+                v.1 .0 & 1 != 0
+            });
+        }
+    })
+    .expect_err("We explicitly panic");
+
+    assert_eq!(max_i.load(AtomicOrdering::Relaxed), 10);
+    // No double frees and no leaks.
+    assert_eq!(DROP_COUNTER.load(AtomicOrdering::Relaxed), 20);
+}
+
 #[test]
 #[should_panic]
 fn test_drain_oob() {

@robjtede
Copy link
Author

@bluss rebased, please consider this PR :)

@kraktus
Copy link

kraktus commented Sep 17, 2023

Hello, thank you for your work on this crate bluss, much appreciated.

Some feedback from an external contributor coding a potential patch. Not following rustfmt was quite a barrier and resulted in significant overhead for me. Due to running rustfmt out of habit after coding the things, I came to realise the project didn't follow it, and had to selectively commit my changes. For example src/arrayvec.rs had 55 patches, which resulted in some errors from my side when choosing the right ones, and more generally time and energy lost.

As a maintainer of other libs without much time I want to say I understand your situation well :)

@robjtede robjtede closed this Sep 17, 2023
@robjtede
Copy link
Author

Closing this since it's pretty out of date. I'll leave the branch in place so anyone can cherry pick commits if they'd like.

This was referenced Sep 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants