Do not emit separator before elements for Intersperse on non-fused iterators#152855
Do not emit separator before elements for Intersperse on non-fused iterators#152855zakarumych wants to merge 1 commit intorust-lang:mainfrom
Intersperse on non-fused iterators#152855Conversation
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
|
My understanding of this part here:
Is that you have a Non-Fused iterator that could iterate like this: An example of an iterator I'm imagining that you're talking about is this: #[derive(Debug)]
struct TestCounter {
counter: usize,
}
impl Iterator for TestCounter {
type Item = usize;
fn next(&mut self) -> Option<Self::Item> {
if self.counter > 5 {
None
} else if self.counter % 2 == 0 {
self.counter += 1;
None
} else {
self.counter += 1;
Some(self.counter)
}
}
}Where this produces an iterator: if self.started {
if let Some(v) = self.next_item.take() {
Some(v)
} else {
let next_item = self.iter.next();
if next_item.is_some() {
self.next_item = next_item;
Some(self.separator.clone())
} else {
None
}
}
} else {
self.started = true;
self.iter.next()
}which I'm assuming you thought it might allow it to produce an iterator like: However, I want to point out that in impl<I: Iterator> Intersperse<I>
where
I::Item: Clone,
{
pub(in crate::iter) fn new(iter: I, separator: I::Item) -> Self {
Self { started: false, separator, next_item: None, iter: iter.fuse() }
}
}
impl<I, G> IntersperseWith<I, G>
where
I: Iterator,
G: FnMut() -> I::Item,
{
pub(in crate::iter) fn new(iter: I, separator: G) -> Self {
Self { started: false, separator, next_item: None, iter: iter.fuse() }
}
}So the iter inside Unless I'm misunderstanding something, I don't think this is something to be concerned about since |
|
@asder8215 thank you for pointing out that Either Do you know why It's not like I care a lot about non-fused iterators. I haven't used them for 10 years. But I love Rust for attention to details and this detail needs addressing IMHO ^_^ |
|
For reference, this change was introduced in #111379.
My assumption is that /// Creates a new iterator which places a copy of `separator` between adjacent
/// items of the original iterator.
Usually, On second look with the code change you made as well, I realized that doing: if self.started {
if let Some(v) = self.next_item.take() {
Some(v)
} else {
let next_item = self.iter.next();
if next_item.is_some() {
self.next_item = next_item;
Some(self.separator.clone())
} else {
None
}
}
} else {
let item = self.iter.next();
self.started = item.is_some();
item
}Will return a I think using a |
|
Oh another reason why they do |
|
@asder8215 thank you. However, In case of iterator that produces 1:
I like 1st variant more. If fusing is desired, it should be clearly stated. |
I agree with this.
I would recommend bringing this up to the discussion in the issue (or if there's a Zulip thread for this, then that would be an appropriate place to talk as well). I'd also like to hear Mark's thoughts on this. |
|
Can we update the documentation on intersperse, so it's easier to read what the guarantee change is? This behavior should be documented either way, I think. It would also be good to add some tests (perhaps after we land on the desired behavior with team input, depending on how easy it is to shift their behavior to match what's agreed on). I think my intuition is that we should document intersperse as not having any guaranteed behavior for non-fused iterators (i.e., we are free to do anything sound in terms of where separators are introduced). It's possible the methods should be I'll nominate for libs-api to discuss as well. |
|
Should this affect the size_hint if fewer items are emitted? |
|
We talked about this in today's @rust-lang/libs-api meeting. We're fine with making this change for now, but we'd like to see tests added to validate the behavior. Separately, we'd like to see an unresolved question added to the checklist in the tracking issue for intersperse, about deciding if we want to unconditionally |
Just for clarification about the change, are you saying fine with the current behavior or with one of the following suggestions made by @zakarumych here:
|
…hpratt Clarified doc comments + added tests confirming current behavior for intersperse/intersperse_with This PR builds on top of rust-lang#152855. I just added clarifying comments to `intersperse`/`intersperse_with` about its guarantees for fused iterators (and how behavior for non-fused iterators are subject to change). I also added in tests for non-fused iterators demonstrating its current behavior; fused iterators are already tested for in existing tests for `intersperse`/`intersperse_with`.
…hpratt Clarified doc comments + added tests confirming current behavior for intersperse/intersperse_with This PR builds on top of rust-lang#152855. I just added clarifying comments to `intersperse`/`intersperse_with` about its guarantees for fused iterators (and how behavior for non-fused iterators are subject to change). I also added in tests for non-fused iterators demonstrating its current behavior; fused iterators are already tested for in existing tests for `intersperse`/`intersperse_with`.
Rollup merge of #153265 - asder8215:intersperse_changes, r=jhpratt Clarified doc comments + added tests confirming current behavior for intersperse/intersperse_with This PR builds on top of #152855. I just added clarifying comments to `intersperse`/`intersperse_with` about its guarantees for fused iterators (and how behavior for non-fused iterators are subject to change). I also added in tests for non-fused iterators demonstrating its current behavior; fused iterators are already tested for in existing tests for `intersperse`/`intersperse_with`.
|
In the @rust-lang/libs-api meeting today we concluded that there might be possible usecases for both but we lack examples of unfused iterators being used in the wild. cc @asder8215 @zakarumych - which behaviour would be more appropriate? |
|
@nia-e My only use case was collecting non-fused iterator repeatedly into single collection, where "emitting separator before every element except the first one" would be more appropriate. But I ended up not using I can also imagine equally probable use case of collecting iterator into separate collections, where emission of separator only between elements in a series would make more sense. Implementation for either use case is trivial. Explaining first case in documentation is probably easier. |
Intersperse on non-fused iterators
Note that I don't personally use non-fused iterators, but I do know that it was mentioned by @cuviper in this discussion that Taking from the example mentioned in the #![feature(iter_intersperse)]
fn main() {
use std::cell::Cell;
use std::sync::mpsc::channel;
use std::thread;
use std::time::Duration;
let (sender, receiver) = channel();
// Nothing is in the buffer yet
assert!(receiver.try_iter().next().is_none());
println!("Nothing in the buffer...");
thread::spawn(move || {
sender.send("This").unwrap();
sender.send("is").unwrap();
sender.send("a").unwrap();
sender.send("sentence").unwrap();
sender.send("DONE").unwrap();
});
// Now imagine this without sleeping the thread here
// and the sender thread taking a while in between
// to send data, causing `try_recv` to hang and
// return `None`
// thread::sleep(Duration::from_secs(2));
let prev_word: Cell<Option<&str>> = Cell::new(None);
// Makes me realize that it would be useful if `intersperse_with` had an
// immutable borrow to current `Some(value)` it saw from its inner iter
// and decide what separator to add
let mut recv_intersperse_iter = receiver.try_iter().intersperse_with(|| {
if let Some(_) = prev_word.get() {
" "
} else {
// this is an impossible case to reach with the current
// implementation of `intersperse` turning the iter into
// a fused iterator
""
}
});
loop {
prev_word.set(recv_intersperse_iter.next());
// I would prefer checking to see if the `Receiver`
// is disconnected here tbh, otherwise I need to use
// smth else to signal I'm done.
if let Some(word) = prev_word.get() {
if word == "DONE" {
println!();
break;
} else {
print!("{}", word);
}
}
}
}This would print "This is a sentence " in the terminal (extra space after sentence is not ideal, but eh). It's a silly example, but there is the unfused iterator in std library and I'm sure someone may find their own niche use for this. I just think it'd be nice to have that flexibility in allowing
The first behavior seems like the more intuitive thing |
This change is related to unstable feature
#![feature(iter_intersperse)]#79524
What this does is changes behavior of
Intersperseon nonFusediterators.Particularly in case of iterator that returns
Noneonce fromnextmethod and then proceeds to yield elements normally.Without this change the
Interspersewill also returnNone, but then it will yield separator before first actual element from inner iterator.This change fixes that, so even if underlying iterator starts with
Nones,Interspersewaits for one item to be returned before inserting separators.