-
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement the SortBy
stream adapter
#43
Conversation
The motivation behind this patch is to replace `arrayvec` by `smallvec`. Why? Because `arrayvec` cannot grow, it's an array-baked vector. Once the full capacity is reached, it cannot grow any more. That was fine up until now because it was only used by the `Limit` stream adapter which needs only 2 values in a buffer that is typed by an `ArrayVec`. But for other stream adapters, it's a limitation. What if a stream adapter needs a buffer of more than 2 values? Ideally, we want to preserve the stack-allocated property for obvious performance reasons (cache locality, reducing allocator traffic and so on). However, we also want to be able to fallback to the heap for larger allocations. `smallvec` provides this feature. Hence, it does what `arrayvec` does, with the addition of heap-allocated fallback when it's needed.
This type alias is renamed to `VectorDiffContainerStreamBuffer`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few general comments before I dive into the implementation:
- I don't like that
Limit
now usesSmallVec
. I'd prefer to rename the associatedBuffer
type back toLimitBuf
and introduce a separateSortBuf
that's either aSmallVec
or just a regularVec
for the non-batching subscriber. If you consider theSmallVec
usage important I don't mind the extra dependency, butLimit
should continue usingArrayVec
IMO. In that case I have a slight preference for usingSmallVec<[T; 4]>
overSmallVec<[T; 2]>
but that's just my personal feeling about what might be a reasonable cut-off for stack vs. heap, feel free to ignore it. - The name
sort
is used instd
and other libraries to sort based on the element type'sOrd
implementation. The closure-taking form is calledsort_by
, so the same name should be used here. We can keepsort
for the module name though and reuse most of the implementation for aSort
adapter that does not take a sorting function (andSortByKey
that takes a key function) later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rough first review
VectorDiffContainerStreamElement, | ||
}; | ||
|
||
type UnsortedIndex = usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Filter
adapter, the "index of the element in the input vector" is called the original index. Would you mind using the same name here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe the naming should be identical in all stream adapters. The semantics take the priority over the consistency in some cases, and I reckon this is one of the cases. What matters —to me— is that we have facing an index that represented the unsorted position of the value. Whether it is the original or not the original index isn't really important. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the semantics are what's important definitely. I still think it's a clearer name. The input / original list could already be sorted ;)
I'd also be okay with input / output index naming (for all adapters that re-order or filter items), how do you feel about that option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want all stream adapters to use the same terminology?
`Sort` is `VectorDiff` stream adapter that presents a sorted view of the underlying `ObservableVector` items. ```rust use eyeball_im::{ObservableVector, VectorDiff}; use eyeball_im_util::vector::VectorObserverExt; use imbl::vector; use std::cmp::Ordering; use stream_assert::{assert_closed, assert_next_eq, assert_pending}; // A comparison function that is used to sort our // `ObservableVector` values. fn cmp<T>(left: &T, right: &T) -> Ordering where T: Ord, { left.cmp(right) } // Our vector. let mut ob = ObservableVector::<char>::new(); let (values, mut sub) = ob.subscribe().sort(cmp); // ^^^ // | our comparison function assert!(values.is_empty()); assert_pending!(sub); // Append multiple unsorted values. ob.append(vector!['d', 'b', 'e']); // We get a `VectorDiff::Append` with sorted values! assert_next_eq!(sub, VectorDiff::Append { values: vector!['b', 'd', 'e'] }); // Let's recap what we have. `ob` is our `ObservableVector`, // `sub` is the “sorted view”/“sorted stream” of `ob`: // | `ob` | d b e | // | `sub` | b d e | // Append other multiple values. ob.append(vector!['f', 'g', 'a', 'c']); // We get three `VectorDiff`s! assert_next_eq!(sub, VectorDiff::PushFront { value: 'a' }); assert_next_eq!(sub, VectorDiff::Insert { index: 2, value: 'c' }); assert_next_eq!(sub, VectorDiff::Append { values: vector!['f', 'g'] }); // Let's recap what we have: // | `ob` | d b e f g a c | // | `sub` | a b c d e f g | // ^ ^ ^^^ // | | | // | | with `VectorDiff::Append { .. }` // | with `VectorDiff::Insert { index: 2, .. }` // with `VectorDiff::PushFront { .. }` // Technically, `Sort` emits `VectorDiff` that mimics a sorted `Vector`. drop(ob); assert_closed!(sub); ```
Sort
stream adapterSortBy
stream adapter
I've written a long comment but it's been deleted… Damn… I'll try to summarize 😛 .
So. Inside Inside
We can make To conclude, I don't see the benefit of using |
Just a note that I’m improving the main “find the position” algorithm to use a binary search. It’s coming soon. |
I know. I like the simplicity of a fixed capacity where nothing more is necessary. I still prefer keeping |
This patch uses a binary search when the position of a value must be found. Finding a new value based on the `compare` function can be costly, hence the use of a binary search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take care of the ArrayVec
/ SmallVec
difference separately from this PR.
Released as part of eyeball-im-util 0.5.3. |
Sort
isVectorDiff
stream adapter that presents a sorted view of the underlyingObservableVector
items. A demonstration:The code above demonstrates how a
VectorDiff::Append
is handled, but all otherVectorDiff
's variants produce variousVectorDiff
. I've tried to generateVectorDiff
that are as much correct as possible regarding the semantics the user might expect. For example, if aVectorDiff::Insert { index: 42, … }
is received but the value will be inserted at the (sorted) index 0, aVectorDiff::PushFront
will be emitted instead of aVectorDiff::Insert { index: 0, … }
.The code is extensively documented. I hope it helps to understand how this is implemented. It's pretty basic actually. The only difficulty is that it breaks your brain because you've to constantly remind: “Is this index for the sorted or the unsorted position”? I've tried to use consistent and clear naming for variables.