Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for clone_from_slice stabilization #27750

Closed
aturon opened this issue Aug 12, 2015 · 66 comments
Closed

Tracking issue for clone_from_slice stabilization #27750

aturon opened this issue Aug 12, 2015 · 66 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@aturon
Copy link
Member

aturon commented Aug 12, 2015

The clone_from_slice method clones a slice into another, using the clone_from method, which in principle can be more efficient than using clone.

It's not clear whether this method, or clone_from, are seeing any use at all.

@aturon aturon added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Aug 12, 2015
@bluss
Copy link
Member

bluss commented Aug 13, 2015

This method is our one API that offers "copy from x to y", i.e. a kind of copy_nonoverlapping, although so general it handles everything that's Clone. This is requested semi-often. The need is relatively larger since your .zip() codegen is nonoptimal, so writing the loop manually ends up worse.

Using clone_from() vs clone() is a paranthesis.

@gsingh93
Copy link
Contributor

I use this method. I have a large static array (a file header) and a slice (the bytes of the file), and it seems like this is the only way to copy the bytes from the slice to the array without looping.

@bluss
Copy link
Member

bluss commented Sep 18, 2015

@gsingh93 Yeah, there is one more way, see How to “memcpy” bytes in stable rust

@gsingh93
Copy link
Contributor

The subteam report this week said this feature was in the stabilization phase, so I'm nominating it for the next beta.

@aturon
Copy link
Member Author

aturon commented Nov 4, 2015

Nominating for discussion for 1.6. Discuss post.

cc @briansmith

@alexcrichton
Copy link
Member

🔔 This issue is now entering its cycle-long final comment period for stabilization 🔔

The libs team was a little up in the air about this function, but we're willing to consider stabilizing it. Some thoughts we had were:

  • The name is a little odd and there may perhaps be better alternatives. @sfackler notes that this is "the memcpy" but the name doesn't quite imply that.
  • The style of the API is somewhat unique and not always found everywhere else, so it may be good to ensure it's nice and principled as well.

@alexcrichton alexcrichton added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed I-nominated labels Nov 5, 2015
@briansmith
Copy link
Contributor

I noticed that the rand::Rng uses the word fill to refer to changing the value of every existing member of a slice. Also note that #27744 is related to this in that it would make sense to have the function that replaces all the values in a collection with a slice's value to have a similar name to the function that appends the values of a slice to a collection. Note that there's already a function named extend that is a more general form of the function that appends a slice to a vector.

So, I propose that:

  • clone_from_slice be renamed to fill_from_slice.
  • clone_from_value (the new function I proposed in the discussion to replace set_memory) be named fill.
  • push_all be renamed extend_from_slice.

@alexcrichton
Copy link
Member

That seems like sound reasoning to me, I may go even further to see if we could drop the _from part of the method names and just have fill_slice plus extend_slice, a little less wordy to type!

@briansmith
Copy link
Contributor

I may go even further to see if we could drop the _from part of the method names and just have fill_slice plus extend_slice, a little less wordy to type!

Without _from, it would be ambiguous which slice gets filled. I think the _from is useful for making that more clear. Also, seems consistent with the naming of clone_from.

@briansmith
Copy link
Contributor

FWIW, the proposed fill function is also named fill in C++: http://www.cplusplus.com/reference/algorithm/fill/.

@briansmith
Copy link
Contributor

I identified some problems when trying to implement this:

  1. The name fill_from_slice implies that in x.fill_from_slice(y), every element of x will be modified. However, the definition of clone_from_slice says that clone_from_slice only modifies the first N elements if y has N elements and N < x.len(). Thus, if the current semantics are intended to be carried forward, fill_from_slice is not a good name. However, I also feel like clone_from_slice is not a good name either, because it doesn't convey this idea that only the prefix will be modified.
  2. The return value is usize but it's not marked #[must_use]. This makes the non-obvious semantics described in Thread a session or semantic context through IL #1 dangerous.
  3. clone_from_slice is part of the SliceExt trait. If it is renamed in the trait also, it would affect any other implementations of SliceExt. According to the code, SliceExt is being tracked in Tracking issue for libcore + no_std stabilization #27701. It seems it is possible to rename the slice method without renaming the trait method, but that would be weird.

@alexcrichton
Copy link
Member

That's a good point about fill not always filling, although I'm not entirely sure what to do about that other than ponder a better name. In terms of renaming that's fine, the SliceExt trait is highly unlikely to be implemented for anything else in the ecosystem (like other I/O extension traits), so it's fine to rename.

@ghost
Copy link

ghost commented Nov 24, 2015

I like the name clone_from_slice for the method clone_from_slice(&mut self, src: &[T]) -> usize as is.

On other naming suggestions:

  • fill_from_slice and fill_slice - I don't like these because fill() generally means assigning multiple elements to a single value. cf. C++'s fill() algorithm, Java's Arrays.fill(), PHP's array_fill().
  • clone_slice - I don't like omitting "from" because I agree with @briansmith that it's ambiguous. a.clone_slice(b) does not clearly indicate whether a or b is modified.
  • clone_from - Potentially confused with Clone::clone_from.
  • copy_from_slice, assign_from_slice, set_from_slice, set_to_slice, and set_slice - I don't like these because I think that it is better for the method to have the word "clone" in it so that people immediately know that Clone is involved.

I agree with @briansmith that there is potential for this API to be used incorrectly. If the destination slice has more elements than the source slice, then some elements of the destination slice will not be overwritten. For example, imagine clone_from_slice() being used to erase a sensitive area of memory, but the source slice had fewer elements. In that case, there would be self.len() - src.len() elements that would not be overwritten!

I propose the following two changes:

  1. The method panics if the source slice has fewer elements.
  2. The method does not return a value (just like Clone::clone_from).

@alexcrichton
Copy link
Member

The libs team discussed this at triage today and unfortunately we were unable to reach a conclusion. We think that the name clone_from_slice is good enough to settle on at this point, but we couldn't figure out what to do about the panicking semantics.

We're going to punt this for another cycle to discuss the panic semantics in more detail, although if they're resolved quickly we can likely also backport a stabilization to beta.

@bluss
Copy link
Member

bluss commented Dec 8, 2015

Some good news, in simple cases, clone_from_slice is replaced by memcpy. (playground link)

@dtrebbien It's not the right tool to erase sensitive memory: llvm may choose to remove the whole copy if it can infer that the destination is never read again; there's a volatile intrinsic for this instead.

I like this name & signature as a free function best: std::slice::copy(source, destination), the only problem is not mentioning clone. A free function is natural to give both arguments equal importance.

I don't object to the current behavior of using the minimum length. Is there any way we can use Result to make it harder to misuse, without introducing panics?

@alexcrichton
Copy link
Member

🔔 This issue is now entering its final comment period for stabilization 🔔

The primary point to discuss now is the panic semantics of the functions in question.

@DanielKeep
Copy link
Contributor

A thought occurs—consider the following:

let mut a = [0; 4];
let     b = [1; 5];
let mut c = [0; 6];

a = b; // what should this do?
c = b; // what should this do?

If your answer is "not compile", then I think clone_from_slice should also panic. It's effectively doing the above, except that it can't check the length part of the type until runtime.

@glaebhoerl
Copy link
Contributor

This also seems potentially related to the question of whether slicing should panic or just truncate -- quite similarly to clone_from_slice. (And what about syntax like aslice[..] = bslice[..]; which I've seen floated before? That might even overlap functionalitywise...)

@SimonSapin
Copy link
Contributor

aslice[..] = x is effectively *IndexMut::index_mut(aslice, RangeFull) = x which is *aslice = x, but = assignment doesn’t work for dynamically sized types.

Even IndexAssign probably wouldn’t support dynamically sized types.

@bluss
Copy link
Member

bluss commented Dec 18, 2015

It should work, it requires a mutable rvalue and aslice[..] can be that. It's mentioned to be valid in the PR (but not a proposed trait impl in libstd).

@bluss
Copy link
Member

bluss commented Dec 20, 2015

See rust-lang/rfcs/pull/1419, which requires some coordination for consistency due to very similar shape APIs. I do believe both that and this should exist!

@DemiMarie
Copy link
Contributor

I like the idea of a free function. Both arguments are of equal importance. std::slice::copy would be Rust's version of C's memcpy, .NET's Array.Copy, or Java's System.arrayCopy.

@alexcrichton
Copy link
Member

Closing as this was intended to be closed by #30943

@briansmith as @ubsan pointed out I think that rust-lang/rfcs#1419 will be useful for a guaranteed memcpy/memset

@bluss we talked about this at the libs team triage yesterday and the conclusion was that we're a little wary about specializing for T: Copy to use memcpy for the reason that a Clone impl can diverge from the Copy one (because Clone is manually done). We may want to discuss this further, however, (as that divergence seems crazy), but here's probably not the right place to do so. One saving grace is that we could specialize for T = u8 and some other primitives as that's likely what this is called with 99% of the time. That is likely also best discussed on a new issue or PR, however.

@ubsan the clone_from method was avoided to not conflict with Clone::clone_from specifically, and we also can't impl Clone for [T] for the Sized reasons out outlined as well. If we can one day impl Clone for slices then we can perhaps just deprecate this method.

@bluss
Copy link
Member

bluss commented Jan 21, 2016

Yeah. As the example in #27750 (comment) shows, even when a composite type like Composite is Copy, its clone implementation may be less efficient than its Copy impl, due to copying fields individually instead of as a block.

@SimonSapin
Copy link
Contributor

@alexcrichton @bluss I’ve filed #31085 and #31086.

@bluss
Copy link
Member

bluss commented Feb 3, 2016

For the record, I think clone_from is a good name. Shorter is better.

@strega-nil
Copy link
Contributor

@alexcrichton I disagree with the stabilization under the current name. clone_from is what it should be called, as it is the exact same operation as Clone::clone_from; x = y.clone() vs. x[..] = y[..].clone().

@reem
Copy link
Contributor

reem commented Feb 3, 2016

I have a weak preference for clone_from over clone_from_slice, especially given @ubsan's note about the similarity with Clone::clone_from.

I also think it would be fine to address this via a pub fn copy(src: &[u8], dst: &mut [u8]) -> usize { dst.write(src).unwrap() } in std::mem, and not have a method for it.

@reem
Copy link
Contributor

reem commented Feb 3, 2016

Another note: it would probably be better if this API returned the number of bytes copied instead of panicking if the src slice is not the same length as the dst slice.

@SimonSapin
Copy link
Contributor

@reem That’s what impl<'a> std::io::Write for &'a mut [u8] does, isn’t it?

@reem
Copy link
Contributor

reem commented Feb 3, 2016

@SimonSapin yes, except it unnecessarily wraps it in an io::Result. (I used that impl in my example implementation of mem::copy)

@briansmith
Copy link
Contributor

@SimonSapin yes, except it unnecessarily wraps it in an io::Result. (I used that impl in my example implementation of mem::copy)

And it can't be used in #![no_std].

People keep saying that this is the same as Clone::clone_from, but it's not, because the semantics around panicing are different from Clone::clone_from. That's why I avoided the name clone_from originally.

@strega-nil
Copy link
Contributor

@briansmith the issue is that the size of a slice isn't strongly typed. The only way to inform the user that you tried to clone one slice into another and it didn't work is to use panicking.

The issue that I see is that we are using a very substandard name for a fairly common operation, just because we already used that name elsewhere for a very, very uncommon operation. I don't think I've ever even seen anyone use clone_from. Yet, this rarely used method will kill usability of a function which is used all the time in low level code, memcpy (copy_from).

@reem
Copy link
Contributor

reem commented Feb 3, 2016

@ubsan @briansmith It's not true that we can only panic! We can return a value indicating the number of bytes/entries copied, which would be much nicer for the user.

@briansmith
Copy link
Contributor

@ubsan @briansmith It's not true that we can only panic! We can return a value indicating the number of bytes/entries copied, which would be much nicer for the user.

Either way, it doesn't match Clone::clone_from.

@briansmith
Copy link
Contributor

The issue that I see is that we are using a very substandard name for a fairly common operation, just because we already used that name elsewhere for a very, very uncommon operation. I don't think I've ever even seen anyone use clone_from. Yet, this rarely used method will kill usability of a function which is used all the time in low level code, memcpy (copy_from).

The fact that these discussions go on forever is what is killing the usability of these things, not an extra 6 characters to type. Somebody just needs to make decisions so they can ship.

@strega-nil
Copy link
Contributor

@briansmith This is also a rarely used function. The issue is that it's going to impact a very useful function a lot, because we're either going to have a very substandard name copy_from_slice, or an inconsistent API. Personally, I would rather have a consistent API across everything, and call it clone_from.

@briansmith
Copy link
Contributor

Honestly, I don't care that much. I've noticed, though, that even after these months of discussions, we don't have even a concrete plan for a safe copy_nonoverlapping, which is what my goal was. So, whatever makes that happen, with whatever name, is fine by me.

@bluss
Copy link
Member

bluss commented Feb 3, 2016

@briansmith rust-lang/rfcs/pull/1419 is that plan. Sure, it's not completely in safe harbour yet, but it's pretty likely to get there.

@alexcrichton
Copy link
Member

The libs team discussed this is again during triage today, and the conclusion was to stick with the name clone_from_slice. The worries about conflicting with Clone::clone_from existed and @sfackler also raised a point about calling this method on a Vec<T> for example. Disambiguating between <Vec<T> as Clone>::clone_from would be pretty difficult from <[T]>::clone_from which is inherited from the Deref impl.

@strega-nil
Copy link
Contributor

@alexcrichton okay, I can see the argument there.

@bluss
Copy link
Member

bluss commented Feb 5, 2016

@alexcrichton #27750 (comment)

@bluss we talked about this at the libs team triage yesterday and the conclusion was that we're a little wary about specializing for T: Copy to use memcpy for the reason that a Clone impl can diverge from the Copy one (because Clone is manually done)

What I tried to say in #27750 (comment) was that we add an explicit documentation item for this function, "reserving the right" to specialize it to use memcpy.

We need to document this reservation precisely because of the issues you mentioned, and if we make this clear in the docs, it shouldn't be a problem. I want that we add this guarantee now, so that it's present from the moment the API becomes stable.

@strega-nil
Copy link
Contributor

@bluss I don't think we should reserve that right. I think if someone wants to implement clone in a different way from copy, it should be their prerogative, and that they should be able to get their clone function called.

Also, I think it is super obvious that this function is not ready to be stabilized.

@bluss
Copy link
Member

bluss commented Feb 5, 2016

I don't think it is reasonable to expect Clone to be sideffectful for a type that is also Copy. Such a type is clearly broken, buggy. We should document this so that we have a clear and transparent intention to special case clone_from_slice for T: Copy types.

@briansmith
Copy link
Contributor

I started a thread on the internals message board about the guarantees of Clone and Copy: https://internals.rust-lang.org/t/assuming-clone-clone-is-equivalent-to-copy-semantics-for-all-instances-of-copy/3148/6. I agree with @bluss that Clone should be guaranteed/assumed to be equivalent to Copy for any type that implements Copy.

@bluss
Copy link
Member

bluss commented Feb 5, 2016

I'm just talking about this function and a specific documentation for it.

@strega-nil
Copy link
Contributor

It feels wrong to me. However, I don't feel too strongly about it.

If, however, we are going to use this as a memcpy instead of copy_from, I will have to come back in and talk about the name. This current name is absolutely awful.

@RandomInsano
Copy link
Contributor

Hey all, just a bump from someone who started using it: I really didn’t care what it was called. I was just happy it existed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests