Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for Iterator::intersperse #79524

Open
4 of 6 tasks
Tracked by #16
camelid opened this issue Nov 28, 2020 · 58 comments · Fixed by #88548
Open
4 of 6 tasks
Tracked by #16

Tracking Issue for Iterator::intersperse #79524

camelid opened this issue Nov 28, 2020 · 58 comments · Fixed by #88548
Labels
A-iterators Area: Iterators C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@camelid
Copy link
Member

camelid commented Nov 28, 2020

This is a tracking issue for the Iterator::intersperse and intersperse_with methods. It is based on itertools::intersperse and is the iterator equivalent of Vec::join.

The feature gate for the issue is #![feature(iter_intersperse)].

Public API

// core::iter

pub trait Iterator {
    fn intersperse(self, separator: Self::Item) -> Intersperse<Self>
    where
        Self: Sized,
        Self::Item: Clone;

    fn intersperse_with<G>(self, separator: G) -> IntersperseWith<Self, G>
    where
        Self: Sized,
        G: FnMut() -> Self::Item;
}

#[derive(Debug, Clone)]
pub struct Intersperse<I: Iterator> where I::Item: Clone {}

impl<I> Iterator for Intersperse<I> where I: Iterator, I::Item: Clone {
    type Item = I::Item;
}

pub struct IntersperseWith<I, G> where I: Iterator {}

impl<I, G> Debug for IntersperseWith<I, G> where I: Iterator + Debug, I::Item: Debug, G: Debug {}

impl<I, G> Clone for IntersperseWith<I, G> where I: Iterator + Clone, I::Item: Clone, G: Clone {}

impl<I, G> Iterator for IntersperseWith<I, G> where I: Iterator, G: FnMut() -> I::Item {
    type Item = I::Item;
}

Steps / History

Unresolved Questions

None.

@camelid camelid added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC A-iterators Area: Iterators labels Nov 28, 2020
@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Dec 16, 2020
@m-ou-se
Copy link
Member

m-ou-se commented Jan 13, 2021

#80567 is adding intersperse_with to this feature.

m-ou-se added a commit to m-ou-se/rust that referenced this issue Jan 14, 2021
Add Iterator::intersperse_with

This is a follow-up to rust-lang#79479, tracking in rust-lang#79524, as discussed rust-lang#79479 (comment).

~~Note that I had to manually implement `Clone` and `Debug` because `derive` insists on placing a `Clone`-bound on the struct-definition, which is too narrow. There is a long-standing issue # for this somewhere around here :-)~~

Also, note that I refactored the guts of `Intersperse` into private functions and re-used them in `IntersperseWith`, so I also went light on duplicating all the tests.

If this is suitable to be merged, the tracking issue should be updated, since it only mentions `intersperse`.

Happy New Year!

r? `@m-ou-se`
m-ou-se added a commit to m-ou-se/rust that referenced this issue Jan 14, 2021
Add Iterator::intersperse_with

This is a follow-up to rust-lang#79479, tracking in rust-lang#79524, as discussed rust-lang#79479 (comment).

~~Note that I had to manually implement `Clone` and `Debug` because `derive` insists on placing a `Clone`-bound on the struct-definition, which is too narrow. There is a long-standing issue # for this somewhere around here :-)~~

Also, note that I refactored the guts of `Intersperse` into private functions and re-used them in `IntersperseWith`, so I also went light on duplicating all the tests.

If this is suitable to be merged, the tracking issue should be updated, since it only mentions `intersperse`.

Happy New Year!

r? ``@m-ou-se``
m-ou-se added a commit to m-ou-se/rust that referenced this issue Jan 22, 2021
Expand docs on Iterator::intersperse

Unstable feature in rust-lang#79524. This expands on the docs to bring them more in line with how other methods of `Iterator` are demonstrated.
JohnTitor added a commit to JohnTitor/rust that referenced this issue Mar 14, 2021
…=joshtriplett

Implement Extend and FromIterator for OsString

Add the following trait impls:

- `impl Extend<OsString> for OsString`
- `impl<'a> Extend<&'a OsStr> for OsString`
- `impl FromIterator<OsString> for OsString`
- `impl<'a> FromIterator<&'a OsStr> for OsString`

Because `OsString` is a platform string with no particular semantics, concatenating them together seems acceptable.

I came across a use case for these trait impls in artichoke/artichoke#1089:

Artichoke is a Ruby interpreter. Its CLI accepts multiple `-e` switches for executing inline Ruby code, like:

```console
$ cargo -q run --bin artichoke -- -e '2.times {' -e 'puts "foo: #{__LINE__}"' -e '}'
foo: 2
foo: 2
```

I use `clap` for command line argument parsing, which collects these `-e` commands into a `Vec<OsString>`. To pass these commands to the interpreter for `Eval`, I need to join them together. Combining these impls with `Iterator::intersperse` rust-lang#79524 would enable me to build a single bit of Ruby code.

Currently, I'm doing something like:

```rust
let mut commands = commands.into_iter();
let mut buf = if let Some(command) = commands.next() {
    command
} else {
    return Ok(Ok(()));
};
for command in commands {
    buf.push("\n");
    buf.push(command);
}
```

If there's interest, I'd also like to add impls for `Cow<'a, OsStr>`, which would avoid allocating the `"\n"` `OsString` in the concatenate + intersperse use case.
@cyqsimon
Copy link
Contributor

cyqsimon commented Mar 28, 2021

I would like to suggest a simpler, alternative name for this method - weave. At the time of comment, the word weave is not yet used in Rust std.

Intersperse is quite verbose (especially when compared to Vec::join), but more importantly rarely used in common English. weave is much more common and concise in comparison.

@ibraheemdev
Copy link
Member

ibraheemdev commented Mar 29, 2021

Scala and haskell both use intersperse. Scalaz calls it intercalate.

@tmplt
Copy link

tmplt commented May 3, 2021

I would like to suggest a simpler, alternative name for this method - weave.

weave is not on the list of terms I would search for when looking for the expected behavior. I'd even consider it vanity.

Intersperse is quite verbose (especially when compared to Vec::join), but more importantly rarely used in common English. weave is much more common and concise in comparison.

A verbose function name describes its effect better. More so if the term is uncommon; there is then less ambiguity. intersperse does exactly what is says: intersperse elements with a common separator.

@arnabanimesh
Copy link

why not use join and join_with for iterators too?

@fosskers
Copy link

Scala and haskell both use intersperse. Scalaz calls it intercalate...

intercalate is slightly different.

intersperse :: a -> [a] -> [a] 

intercalate :: [a] -> [[a]] -> [a] 

@redgoldlace
Copy link

redgoldlace commented Jun 20, 2021

why not use join and join_with for iterators too?

I feel like join implies the creation of something monolithic from a sequence and a separator, i.e

let items = vec!["foo", "bar", "blah"];
let result = items.join(" ");

will create the string foo bar blah, and I don't have any issue with that.

But in my mind intersperse better communicates the idea of placing separators in between a sequence, without creating something monolithic using the results of that.

I suppose I mean that if I intersperse an iterator of &strs with " ", it makes sense to me that I get an iterator back.
But if I have an iterator of &strs, and the method is named join, I feel like it's a little less clear. Without looking further into it, it might seem like that I'd get a concatenated String back, or similar.
For other types it may be less unclear but I'm still unsure on whether calling the method join is a good fit.

@arnabanimesh
Copy link

arnabanimesh commented Jun 20, 2021

I feel like join implies the creation of something monolithic from a sequence and a separator

That's true, but we are using vec::join or slice::join and that respects the pattern matching and it is not monolithic. Then vec::join should be changed too. If Rust already acknowledges that join be non monolithic, then I suggest we stick with the same ideology.

But if I have an iterator of &strs, and the method is named join, I feel like it's a little less clear. Without looking further into it, it might seem like that I'd get a concatenated String back, or similar.

I'm always collecting to String after intersperse. 😅

My personal preference aside, vec::join creates a vec if the initial data type is not &str. So it is just intersperse which happens to create a String in a specific scenario.

Personally, if it were not as verbose I would have gone for intersperse. If join was not used in vec or slice, I would not root for it. But weave doesn't sound good IMO.

My main point is that let's not have two different terms do the same thing based on datatype.

@pickfire
Copy link
Contributor

pickfire commented Jun 30, 2021

Personally, if it were not as verbose I would have gone for intersperse. If join was not used in vec or slice, I would not root for it. But weave doesn't sound good IMO.

I did join for Iterator in the past #75738 which accepts the current Join that does both but it wasn't accepted because it requires allocation or Iterator is in std so it couldn't be implemented, splitting it as two functions works but may not be that discoverable.

In a sample code in helix,

        let res = sel
            .ranges
            .into_iter()
            .map(|range| format!("{}/{}", range.anchor, range.head))
            .collect::<Vec<String>>()
            .join(",");

How to join String or &String easily with &str and such? I think cases more than that we need some intervention which the API will not be ergonomic. I think maybe it only works well with &'static str.

In the current example, I believe it's too basic and limited, requires &'static str and copied.

let hello = ["Hello", "World", "!"].iter().copied().intersperse(" ").collect::<String>();

I think we need to think more about the ergonomics of the API before stabilization.

@m-ou-se
Copy link
Member

m-ou-se commented Aug 17, 2021

@rfcbot merge

@rfcbot
Copy link

rfcbot commented Aug 17, 2021

Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Aug 17, 2021
@rfcbot
Copy link

rfcbot commented Aug 21, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Aug 21, 2021
@SoniEx2
Copy link
Contributor

SoniEx2 commented Aug 22, 2021

could call it "sep" for separator?

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Aug 31, 2021
@c410-f3r
Copy link
Contributor

What is the status of this feature?

@Fishrock123
Copy link
Contributor

@c410-f3r as I mentioned above it seems to be blocked on requiring #89151

@c410-f3r
Copy link
Contributor

Thank you @Fishrock123

The plan we arrived upon is to first revert the stabilization and then plan on stabilizing the feature in a future release, @Mark-Simulacrum suggested 1.65.

Leaving aside the above initial plan described in #88967 (comment) and taking into consideration that #89151 appears to be stalled, this feature will likely take some years to be stabilized, if ever.

@Fishrock123
Copy link
Contributor

Yes, it looks like someone interested would need to take up working on #89151

@kornelski
Copy link
Contributor

This has been open for years. The nice solution of changing trait resolution had no progress for years either, and may be blocked by other refactorings too.

Could we reconsider renaming the method?

  • Rust has the doc(alias) feature, so it can still be findable by this term.

  • "intersperse" seems to be an uncommon word. English is not my native language, and I don't recall ever hearing that word in a conversation. It's not a word I'd use to find this function, and the name to me doesn't explain what the function does, so personally I don't care for that term.

How about separate_by or insert_between? I think these are simpler names that accurately explain this function.

Or lace, interlace, weave, interweave, intertwine.

@lukaslueg
Copy link
Contributor

My personal bikeshed-report: I'd suggest interweave. Unlike most alternatives, the verb addresses an action done to the iterator (as opposed to an action done to the elements, e.g. seperated_by), and doesn't sound unnecessarily mathematical (e.g. interpolate) or skeuomorphic (e.g. sprinkle, interlace, amalgamate, commingle)

@jplatte
Copy link
Contributor

jplatte commented Mar 18, 2024

I think separate(d)_by is a really good name. Unlike intersperse, interlace and interweave it's actually obvious that it takes a separator, not a second iterator like Itertools::interleave.

@camelid
Copy link
Member Author

camelid commented Mar 21, 2024

We could also consider calling it join, like the similar method Vec::join. It's straightforward and matches names in other languages (like Python). The only downside could be confusion with Vec::join, but there are already a couple of existing methods that have the same name between Iterator and Vec (e.g., first and last).

@lukaslueg
Copy link
Contributor

We could also consider calling it join, like the similar method Vec::join. It's straightforward and matches names in other languages (like Python). The only downside could be confusion with Vec::join, but there are already a couple of existing methods that have the same name between Iterator and Vec (e.g., first and last).

.join() is a misnomer on Vec imho, as the term alludes to an action done to the elements ("the elements are joined using the argument"), as opposed to the Vec ("the Vec is joined with a second Vec; which join() doesn't actually do).

If someone can make a decision, I'm willing to change the Stone Of Shame around my neck for the Stone Of Triumph and submit a PR.

@camelid
Copy link
Member Author

camelid commented Mar 26, 2024

I agree that Vec::join has a confusing name... however, I'm unclear if that means you agree that Iterator::join is a good name. It's obviously too late to change Vec::join, so if Iterator::join would match it and be no worse a name, it seems like a potentially good option. Do you mind clarifying?

@kornelski
Copy link
Contributor

kornelski commented Mar 26, 2024

It can't be join, because that's already taken by a slightly different operation that also collects. Vec::join("") returns a single joined String. So does Itertools::join.

Both join functions call their argument a separator. Therefore I think Iter::separate(_by) makes the most sense for the iterator operation that happens under the hood of join.

@camelid
Copy link
Member Author

camelid commented Mar 26, 2024

I see, that makes sense. It does seem like Iter::separate or separate_by are best then. Perhaps we do intersperse -> separate and intersperse_with -> separate_with?

@senekor
Copy link
Contributor

senekor commented Mar 26, 2024

My personal favorite would be separate_with and nothing else. The version accepting a closure is more powerful and those couple characters won't hurt anybody. I look at it like this:

.separate_with(|| ",")
.separate(",")

The only purpose of having a separate version seems to me to save a couple (8) characters. Not worth it to clutter the standard library documentation in my opinion. But it doesn't matter that much either. Having both is fine too.

@lukaslueg
Copy link
Contributor

@camelid I was arguing for not using .join(), my bad.

fn separate() -> Separate and fn separate_with() -> SeparateWith it is, then?

@workingjubilee
Copy link
Member

workingjubilee commented Mar 27, 2024

I think fn insert_between works best, particularly if we only have the closure form, but I don't mind separate_with. I agree that we only really need one form.

@camelid
Copy link
Member Author

camelid commented Mar 27, 2024

@workingjubilee my disagreement with wording like insert_between is that it sounds very imperative, whereas Iterator methods (almost) always have declarative wording in line with their behavior.

@senekor
Copy link
Contributor

senekor commented Mar 27, 2024

I think separate_with fits better than insert_between with the grammatical pattern of existing iterator methods. Maybe other people turn the method names into sentences differently, but here's what my brain does:

method mental expansion
verb_adverb verb "items" adverb (argument)
step_by step (over) items by
take_while take items while
collect_into collect the items into
separate_with separate the items with ✅
insert_between insert (separator) between the items 🤔

@lukaslueg
Copy link
Contributor

I love a good bikeshedding just as everyone, yet before this Tracking Issue is overwhelmed with it, I suggest to wait and see on the merit of #79524 (comment) in it's own right. My guess is that the PR to simply evade blockers now being on I-libs-api-nominated will yield some insights on that.

@jadebenn
Copy link

I just encountered a situation where this method would have been an excellent solution to a problem I was facing, but I was restricted to stable Rust and couldn't use another crate. It's a shame that it's been marked as unstable for so long given that it only seems to be blocked via a naming conflict with itertools.

Renaming seems like a reasonable solution to me, but I would also note that the corresponding depreciation request on the itertools github seemed to point to there not being a full understanding on their part that this was not being marked as stable because of the naming conflict. They wanted to wait for stability to deprecate their own method, leading to the current state of things.

@farazshaikh
Copy link

better to have intersperse(seprator, interval) where the marker needs to spaced every iterval.
Only problem i see is reconciling the meaning of iterval == 0. It could be

  • idx becomes NonZero(usize)
  • return the array unchanged
  • returns a new array just wit markers (this is the worst , )

Examples: [1,1,1,1].iter().intersperse(0,2) -> [1, 1, 0, 1,1, 0]
Also would need a to capture the meaning for some output like intersperse(??) -> [0, 1, 1, 0 , 1, 1]

I specifically ran into this while looking for solution for KZG encoding that requrires a tranfroming a byte stream into 32 byte symbol with the first (big edian) byte to be zero.

https://docs.eigenlayer.xyz/eigenda/rollup-guides/blob-encoding

As it stands , this API looks like a "alternate" implementation of zip() with infinite iterator over the second iterator.

@farazshaikh
Copy link

:( looks like a direct transation of this
https://hackage.haskell.org/package/base-4.19.1.0/docs/Data-List.html#v:intersperse

Would be gereat to know how to solve this usecase
[elt elt elt sep elt elt elt sep]
[sep elt elt elt sep elt elt elt]

@Amanieu
Copy link
Member

Amanieu commented Apr 29, 2024

To give an update, we're currently trying to work around the naming conflict with a language-level feature: #89151 (comment)

@bew
Copy link

bew commented Sep 7, 2024

Will there be a way to access the current separator index when using intersperse_with ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-iterators Area: Iterators C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.