-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] A new stack-based vector #2990
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StackVec
would be an incorrect name for such a data structure, since there is no reason it couldn't be constructed in a static
, or on the heap via Box<StackVec<...>>
.
The RFC seems fairly light on details that really should be addressed in the RFC format, for example:
- How does the initial API of this type look? Why should it look that way?
- Where should the type be placed, and why is that the best place?
The implementation details are really not that important, as long as the RFC is reasonably implementable at all.
text/2978-stack_vec.md
Outdated
# Motivation | ||
[motivation]: #motivation | ||
|
||
`core::collections::StackVec<T>` has several use-cases and should be conveniently added into the standard library due to its importance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
core::collections
does not currently exist, so that should probably be called out explicitly (it makes sense to put this structure there, but it would be a first)
text/2978-stack_vec.md
Outdated
// core::collections | ||
|
||
pub struct StackVec<T, const N: usize> { | ||
data: MaybeUninit<[T, N]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data: MaybeUninit<[T, N]>, | |
data: MaybeUninit<[T; N]>, |
text/2978-stack_vec.md
Outdated
} | ||
|
||
impl<T, const N: usize> StackVec<T, N> { | ||
// Much of the `Vec` API goes here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mirroring the Vec
API doesn't really work that well for a structure with bounded capacity, since push
can fail. What should happen then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My reasoning is: There must be two methods for everything that tries to add elements. One method panics at run-time and other one returns a Result
.
fn push(&mut self, element: T) { ... } // Panics at run-time
fn try_push(&mut self, element: T) -> Result<(), Error> { ... } // Ok if successful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mirroring the
Vec
API doesn't really work that well for a structure with bounded capacity
That's not really true at all, frankly. Many of the functions can in fact be exactly the same in every way, and for the ones where regular Vec
would have simply increased in capacity upon becoming full, you have the option of either simply panicking or returning something like Err
or None
. Or doing both at different times, if it's the case that you simultaneously provide push
and try_push
or insert
and try_insert
, for example.
text/2978-stack_vec.md
Outdated
|
||
### Compiler internals | ||
|
||
Unstable features can be used internally to make operations more efficient, e.g., `specialization`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this is relevant. The use of const generics is probably more important, since that is directly part of the API, therefore potentially blocking stabilization of StackVec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hehehehe. It was originally intended for a SmallVec
-like structure tied with the unstable untagged_unions
feature.
Removed.
text/2978-stack_vec.md
Outdated
--------- | ||
``` | ||
|
||
`StackVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front and these bytes form a buffer where elements can be included dynamically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the stack needs to be explained in an RFC or in the standard library documentation. The implementation of StackVec
also would not really care about how or where it is allocated, so this section doesn't seem very relevant either.
text/2978-stack_vec.md
Outdated
|
||
### The current ecosystem is fine | ||
|
||
Even with all the fragmentation, different types of memory usage is an overkill in certain situations. If someone wants to use stack memory in an embedded application, then it is just a matter of grabbing an appropriated crate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that fragmentation is only really relevant when the type in question frequently is part of the API of other libraries. If it isn't, at worst you have a bit of code bloat when multiple similar crates are used by your dependencies.
@jonas-schievink Thanks for the review Hum... The discussion of a whole API is something that will probably raise some personal concerns and I was expecting to discuss it further in a possible subsequent PR. Hope my new suggestions on this subject make sense.
Changed temporally to |
It seems fairly clear to me (the However, Rust is ultimately all about paying attention to the details of a situation and selecting the best option for the current moment, so honestly it's fine to have four different crates that each offer a slightly different variant. |
text/2978-stack_based_vec.md
Outdated
pub fn pop(&mut self) -> T; // Panics at run-time | ||
|
||
pub fn push(&mut self, element: T); // Panics at run-time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite strongly feel that the default behavior for inserts returns something akin Result<(), InsertionError>
.
So push
and insert
should both return Result
.
pop
should also return Option<T>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quite strongly feel that the default behavior for inserts returns something akin Result<(), InsertionError>.
The compiler will probably eliminate the conditional bounding checking so yeah, let's do this.
text/2978-stack_based_vec.md
Outdated
impl<T, const N: usize> ArrayVec<T, N> { | ||
// Constructors | ||
|
||
pub unsafe fn from_raw_parts(_ptr: *mut T, _len: usize) -> Self; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel that we should add this method here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed
text/2978-stack_based_vec.md
Outdated
|
||
#[derive(Debug, Eq, Ord, PartialEq, PartialOrd)] | ||
pub enum ArrayVecError { | ||
CapacityOverflow, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CapacityOverflow, | |
CapacityOverflow(T), |
Otherwise element
is lost. Considering that afaict there is no method returning both CapacityOverflow
and NoElement
it also makes more sense to split this error into two distinct types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ArrayVecError
was intended to be a "catch-all" error container but doesn't seem to be necessary at all.
For a better overall coherence, pub fn push(&mut self, element: T) -> Result<(), ArrayVecError>
is now pub fn push(&mut self, element: T) -> Result<(), T>
text/2978-stack_based_vec.md
Outdated
@@ -176,49 +181,13 @@ impl<T, const N: usize> ArrayVec<T, N> { | |||
|
|||
pub fn split_off(&mut self, at: usize) -> Self; | |||
|
|||
pub fn swap_remove(&mut self, idx: usize) -> T; // Panics at run-time | |||
pub fn swap_remove(&mut self, idx: usize) -> Option<T>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swap_remove
uses a specific index, in this case I feel like panicking is actually better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for removal, I think just copying Vec
1 by 1 is the correct choice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "safe" swap_remove
version is similar to get
in the sense that both return an optional value when out-of-bounds. On the opposite direction, pub fn swap_remove(&mut self, idx: usize) { ... }
is similar to indexing, e.g., let value = slice[10];
.
In certain cases, users that don't want to panic at run-time will have to manually check if the index is within bounds before issuing a removing command. If a "panicking" swap_remove
is indeed desirable, please also consider the inclusion of try_swap_remove
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could check this at compile time to guarantee idx < N
but probably not.
text/2978-stack_based_vec.md
Outdated
pub fn from_array(array: [T; N]) -> Self; | ||
|
||
#[inline] | ||
pub fn from_array_and_len(array: [T; N], len: usize) -> Self; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn from_array_and_len(array: [T; N], len: usize) -> Self; | |
pub fn from_array_and_len(array: [MaybeUninit<T>; N], len: usize) -> Self; |
I am not sure if that method is valuable, as all cases where [MaybeUninit<T>; N]
and a len
is used we should "just" use ArrayVec
directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beyond the semantic value, storage re-utilization and a possible future function "constification", from_array
and from_array_and_len
are right now being used to write unit-tests :)
I am not sure if that method is valuable, as all cases where [MaybeUninit; N] and a len is used we should "just" use ArrayVec directly
Something like pub const fn from_parts(array: [MaybeUninit<T>; N], len: usize) -> Self
?
text/2978-stack_based_vec.md
Outdated
} | ||
|
||
impl<T, const N: usize> ArrayVec<T, N> { | ||
// Constructors | ||
|
||
pub unsafe fn from_raw_parts(_ptr: *mut T, _len: usize) -> Self; | ||
#[inline] | ||
pub fn from_array(array: [T; N]) -> Self; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this just ArrayVec::from(arr)
, I don't feel like a method is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do or will eventually const generics allow something like impl<T, const N: usize, const M: usize> impl From<[T; M]> for ArrayVec<T, N> where {M <= N}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, using feature(const_evaluatable_checked)
. This does require a N <= N
where bound when using it in a generic context though We probably will be able to detect that Param <= Param
always holds in the future, but for now that is not a priority.
Note that this from impl will be horrible for type inference, so we probably want to make it a separate method anyways imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lcnr, when you said:
Isn't this just
ArrayVec::from(arr)
, I don't feel like a method is needed here.
I personally found with staticvec that having three inherent methods, (new_from_slice
, new_from_array
, and new_from_const_array
) allowed me to really easily (with the specialization feature flag enabled) write no less than eight slice / array related From
impls in direct terms of just calling the appropriate one of those three:
impl<T, const N: usize> From<[T; N]> for StaticVec<T, N>
impl<T, const N1: usize, const N2: usize> From<[T; N1]> for StaticVec<T, N2>
impl<T: Copy, const N: usize> From<&'_ [T; N]> for StaticVec<T, N>
impl<T: Copy, const N1: usize, const N2: usize> From<&'_ [T; N1]> for StaticVec<T, N2>
impl<T: Copy, const N: usize> From<&'_ [T]> for StaticVec<T, N>
impl<T: Copy, const N: usize> From<&'_ mut [T; N]> for StaticVec<T, N>
impl<T: Copy, const N1: usize, const N2: usize> From<&'_ mut [T; N1]> for StaticVec<T, N2>
impl<T: Copy, const N: usize> From<&'_ mut [T]> for StaticVec<T, N>
I was additionally able to implement my staticvec!
macro in direct terms of new_from_const_array
, so there was even more code re-use benefits there.
You can see how the three inherent methods work starting here if you like.
In case you wind up wondering, also, the call to StaticVec:from
inside new_from_array
in turn defers to the From
impl that calls new_from_const_array
. Doing it like that is necessary to have the types inferred properly by rustc
, as if you try to just call new_from_const_array
itself there rustc
complains about N2
not being the same as N
.
70890df
to
cbabf74
Compare
I'd suggest std focus on I suspect |
5082adc
to
22b0227
Compare
As for stack based lists, may I turn your attention to the crate "seq" implementiing a sequence of items being managed via the function scopes https://crates.io/crates/seq |
|
||
pub struct ArrayVec<T, const N: usize> { | ||
data: MaybeUninit<[T; N]>, | ||
len: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be usize
when we probably can't even hit u16
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Probably" is not a great fit for the standard library.
Ideally we’d use specialization of some private trait with an associated type to choose the smallest integer type that can represent 0..=N
. However it looks like boolean conditions based on const
parameters can not be used in where
clauses today, and I don’t know if supporting that is planned at all. And this would only help where align_of::<T>()
is less than size_of::<usize>()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential hack until specializations works: Store the size in a [u8; M]
which gets converted to a usize on demand. Calculate the M
based on N
.
text/2978-stack_based_vec.md
Outdated
pub fn drain<R>(&mut self, range: R) | ||
where | ||
R: RangeBounds<usize>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this return a Drain
struct like Vec
did? Maybe we have Drain<T, const N: usize>
?
text/2978-stack_based_vec.md
Outdated
pub fn splice<R, I>(&mut self, range: R, replace_with: I) | ||
where | ||
I: IntoIterator<Item = T>, | ||
R: RangeBounds<usize>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe splice
should return a Splice
like vec did, same for Drain
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Thanks for pointing this out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the new one is still different from what Vec
did.
text/2978-stack_based_vec.md
Outdated
where | ||
F: FnMut(&mut T) -> K, | ||
K: PartialEq<K>; | ||
|
||
pub fn drain<R>(&mut self, range: R) | ||
pub fn drain<R>(&mut self, range: R) -> Option<Drain<'_, T, N>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this Option
? The current one isn't Option
IIRC.
text/2978-stack_based_vec.md
Outdated
|
||
pub fn split_off(&mut self, at: usize) -> Self; | ||
pub fn split_off(&mut self, at: usize) -> Option<Self>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this Option
?
Difference from |
Methods are checked due to out-of-bound inputs or invalid capacity, which are different from most of the Everything you see are suggestions. If people or the lib team think that a stack-based vector is worth the std addition, then I also think it is better to discuss the API design in another PR, forum thread or RFC. |
37ccc54
to
1947104
Compare
With a successful crater run and no last minute blocker, More benchmarks for extend, pop, push and truncate -> https://imgur.com/a/zdGmnW2 |
Should this specify that the ArrayVec implementation can almost all be In my try at implementing this, I was able to make |
@mbartlett21 Thanks for the heads-up. Some features like |
The implementation being able to be Thinking about it, a compile-time hashmap could be useful... |
Is the function |
When verifiable methods were proposed, many people were reluctant and this situation didn't move the RFC forward. Perhaps, the same could be said to this approach or perhaps not. Perhaps, this code arrangement could improve compile times and ergo-metrics or perhaps Perhaps? Maybe? Who knows? This RFC didn't receive an official answer so it is like walking in a dark room without proper directions. Personally, I just want this feature implemented and I will make any changes that the lib team wants |
I prefer |
A priori, anytime |
I’m not necessarily disputing that, but why? |
You often want broader functionality like |
I think rust-lang/wg-allocators#79 should be considered instead of creating vector type for each storage. |
After more than one year of waiting (...), this PR is being closed due to the lack of official answers from any person or any related or unrelated team. If desired, anyone is free to copy or reference any inner content to write another similar RFC. |
there's at least two crates with array based vector types in the ecosystem. Given that, I'm not sure adding this to stdlib is particularly necessary. It's a nifty optimization, but not a huge gain in capability, so it's fine to just be a crate. |
I think we still need something like it in std to be able to collect iterators into an array. Perhaps only a subset. |
A few comments have mentioned adding some way to allow |
In my book that actually sounds a lot like my crate iter_fixed. It has a type |
There's the desire to populate an array from regular iterators, see rust-lang/rust#81615 |
Broadly speaking, in my experience, the compiler already perfectly avoids unnecessary monomorphization of const generics, with compile times not being an issue really at all, given a baseline type along the lines of anything like this: pub struct ArrayVec<T, const N: usize> {
data: MaybeUninit<[T; N]>,
length: usize,
} I do not personally think the |
There's no reason this would be closed for good, right? (i.e., no hard "no"s from the team) The reason I ask is that I am actually a bit surprised I don't see C FFI mentioned anywhere on this thread as a good use case, since the pattern of "non-heap buffer with capacity and length" is fairly prevelant in C. Not having any user friendly representations for this in Rust means that a lot of common FFI cases tend to go through some indirection on the heap - not the end of the world in OS, but not possible in Also, I know it was already mentioned here, but adding So, use cases for ArrayVec are:
Which is a pretty serious list. |
You don't need to use CString if you put the |
That is correct, but unfortunately that precludes using any Rust functions that may return a |
Aside from const generics noted above, I think It's imho worth sorting out the VLA story for both |
Are you referring to the
Err.... Do they improve calling CFFI? I feel like they're not the best solution because:
So I might be missing something, but aren't these independent issues? |
I agree that a sized At this point I think what this RFC needs is someone to "champion" it. c410-f3r said they didn’t want to keep doing that, so if someone wants to take this up comment here to coordinate with anyone else potentially interested. Next steps are probably to discuss on Zulip to see interest from @rust-lang/libs-api members, then make a new RFC PR (possibly based on this one). |
I can if the lib team is willing to accept such feature. To give you a better idea about the situation, a very trivial and useful auxiliary function that creates custom arrays took TWO YEARS to be stabilized (rust-lang/rust#75644). There is still no answer for the lack of official feedback. Maybe because many ppl that contribute to Rustc are volunteers? Maybe because the ppl involved have other priorities? Maybe because there aren't enough human resources? The things is, it is probably best to not have high hopes for this feature. |
### Optimization | ||
|
||
Stack-based allocation is generally faster than heap-based allocation and can be used as an optimization in places that otherwise would have to call an allocator. Some resource-constrained embedded devices can also benefit from it. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest adding a new section, something like:
### C FFI
An extremely common pattern in C is to pass a buffer with a maximum size and a
current length, e.g., `myfunc(int buf[BUF_SIZE], int* current_len)`, and expect a
callee function to be able to modify the `buf` and `len` to store its return data.
Currently, the best ways to interoperate with this FFI pattern in Rust without relying
on external libraries is to create a slice and manually maintain the length as changes
are made.
This is an _ideal_ use case for `ArrayVec` in core, as it is a representation of a fixed-
capacity buffer with a current length. This greatly simplifies common patterns
required when working with existing C APIs, especially in embedded development.
Further, it would provide the backbone for an easier `&str` to `CStr` transition that
is not directly possible without going through heap-based `CString`, which is not
available in `no_std` environments.
Interoperability between C and Rust is currently "good", but providing a better
representation for common C patterns is an excellent way to grow Rust as a language
and ease adoption in C-dominant environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't know where best it would go, but some section here should really drive home the usability in no_std
environments (embedded, bare metal, kernel, safety-critical, etc). Rust is gaining traction in kernel of course, but seems like it has yet to really break into any of the other environments.
In many of these places, dependencies are frowned upon.
I would shakily raise my hand or be willing to help somebody else, but c410 just popped into the comments so maybe it won't be necessary
Two years is nothing in language world, have faith :) I'll bring something up on Zulip as @SimonSapin mentioned, or maybe with the embedded WG on Matrix, and see what they have to say |
If you are willing to push forward this feature, then feel free to take up the responsibility. My motivation to contribute to the standard library in general was severely diminished due to how things were handled in my contributions. |
I definitely feel that pain, it's far from a rust-only problem. Any big OSS project just has super slow design cycles, and usually enough pressing things that "nice to have but not directly needed" stuff can feel swept under the rug. I've had some discussion on zulip the past couple of days, and it seems like the response is more that some questions need answering than a hard no. Since I suppose a RFC is the place for that, I will go ahead and fork this and do some updates. Will update with a new PR link after that is done |
Ok all, the rough draft PR is available: #3316 |
This RFC, which depends and takes advantage of the upcoming stabilization of constant generics (min_const_generics), tries to propose the creation of a new "growable" vector named
ArrayVec
that manages stack memory and can be seen as an alternative for the built-in structure that handles heap-allocated memory, akaalloc::vec::Vec<T>
.Rendered