tokio_stream::StreamMap allows duplicate keys #4774
Comments
PR that introduced this: #4272 |
For reference, here's an example using StreamMap::from_iter with a duplicate key:

    use tokio_stream;

    fn main() {
        let streams_from_iter = tokio_stream::StreamMap::from_iter(vec![
            ("key", tokio_stream::pending::<u8>()),
            ("key", tokio_stream::pending::<u8>()),
        ]);
        for (key, value) in streams_from_iter.iter() {
            println!("{}: {:?}", key, value);
        }
    }

Results in:
|
If this is indeed undesired behavior, then I think the simplest solution is to just iterate over the |
I agree that this is a bug. It should not allow duplicate keys. |
How concerned with the runtime performance impact are we? Given

    // O(n*m) or worse if the underlying `Vec` needs to be resized, potentially multiple times:
    for (key, value) in iter {
        self.insert(key, value);
    }

... the runtime goes from [1] or worse if the underlying

Alternatively, we could still use

    self.entries.extend(iter.into_iter()); // O(n+m)
    // O((n+m) * log(n+m)):
    self.entries.sort_by(|a, b|
        if a.0.eq(&b.0) {
            std::cmp::Ordering::Equal
        } else {
            std::cmp::Ordering::Less
        }
    );
    self.entries.dedup_by(|a, b| a.0.eq(&b.0)); // O(n+m)

This removes duplicates (while preserving the original keys but not their order) with a runtime of

I'd prefer the former solution unless we have reason to believe this will lead to a noticeable performance impact. |
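One caveat on the sort-then-dedup idea above: the comparator returns Less for every unequal pair, so it is not a total order, and sort_by makes no guarantee that equal keys end up adjacent, which dedup_by relies on. As a rough sketch only, here is a linear-time alternative that assumes the key is Hash + Eq (a stricter bound than Eq alone). The function extend_dedup and its signature are hypothetical and operate on a plain Vec, not on StreamMap itself.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical helper (not part of tokio-stream): merge `incoming` into
/// `entries` so that each key appears once and the last value for a key wins.
/// Runs in expected O(n + m) time, at the cost of requiring `K: Hash`.
fn extend_dedup<K, V>(entries: &mut Vec<(K, V)>, incoming: impl IntoIterator<Item = (K, V)>)
where
    K: Hash + Eq,
{
    // Existing entries first, then the new ones, in arrival order.
    let all: Vec<(K, V)> = entries.drain(..).chain(incoming).collect();

    // For each position, record whether it is the last occurrence of its key.
    let keep: Vec<bool> = {
        let mut last: HashMap<&K, usize> = HashMap::new();
        for (i, (k, _)) in all.iter().enumerate() {
            last.insert(k, i);
        }
        all.iter()
            .enumerate()
            .map(|(i, (k, _))| last.get(k) == Some(&i))
            .collect()
    };

    // Keep only the last occurrence of every key, preserving relative order.
    entries.extend(
        all.into_iter()
            .zip(keep)
            .filter(|(_, keep)| *keep)
            .map(|(entry, _)| entry),
    );
}
```

A surviving entry sits where its last occurrence was, so an overwritten key moves behind older keys that were not touched.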
I think that going to polynomial time is fundamentally unworkable here. |
Should I'm not sure what expected values of |
Well, on one hand, the |
Also, the incoming vector in |
One more thing, the duplicates are removed using tokio/tokio-stream/src/stream_map.rs Line 447 in 199878e
|
Where is that specified?
Which, as I understand it, applies to the order of values received within a stream rather than the order of the streams themselves. All the
Am I misunderstanding something? |
@nashley I interpreted the quote you highlighted as the |
I have a proposal. If the proposal is acceptable then I'll do the code change. The proposal is to maintain a HashMap (or BTreeMap?) with key-value pairs of the form <key, index in vector>. Before inserting, we can check the HashMap for the key and, if it's present, get its position in the vector and perform the deletion. This approach would remove the O(n) complexity for every insert and replace it with the complexity to fetch and insert into the HashMap. I am not sure of the complexity of Rust's HashMap. We can use some other map implementation if anyone has suggestions. |
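To make the proposal concrete, here is a minimal sketch of a Vec of entries with a HashMap from key to position. IndexedEntries is a made-up stand-in, not StreamMap's actual layout, and it assumes the key is Hash + Eq + Clone, which is stricter than the existing bounds. Removal uses swap_remove, so the entry that gets moved into the freed slot needs its stored index updated.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Toy stand-in for the proposal (not tokio-stream's actual `StreamMap`):
/// a `Vec` of entries plus a `HashMap` from key to position in that `Vec`.
struct IndexedEntries<K, V> {
    entries: Vec<(K, V)>,
    index: HashMap<K, usize>,
}

impl<K: Hash + Eq + Clone, V> IndexedEntries<K, V> {
    fn new() -> Self {
        Self { entries: Vec::new(), index: HashMap::new() }
    }

    /// Inserts `value` under `key`, returning the value it replaced, if any.
    fn insert(&mut self, key: K, value: V) -> Option<V> {
        let previous = self.remove(&key);
        self.index.insert(key.clone(), self.entries.len());
        self.entries.push((key, value));
        previous
    }

    /// Removes `key` in expected O(1) by swapping the last entry into its slot.
    fn remove(&mut self, key: &K) -> Option<V> {
        let pos = self.index.remove(key)?;
        let (_, value) = self.entries.swap_remove(pos);
        // The former last entry (if any) now lives at `pos`; fix its index.
        if let Some((moved_key, _)) = self.entries.get(pos) {
            self.index.insert(moved_key.clone(), pos);
        }
        Some(value)
    }
}
```

Std's HashMap gives expected O(1) lookups and inserts; a BTreeMap would give O(log n) and require Ord instead of Hash.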
I would be ok with using a |
@DevSabb let me know if you want assistance implementing, documenting, or testing this. |
@nashley The change is now more complex than the one I proposed. I'll take a look at the code before confirming whether I want to pick this up. Is it okay if I confirm in a couple of days? |
Of course! I can also work on implementing this if you change your mind. |
@nashley I started working on this ticket. Will let you know if I have questions. |
I have a question. I had to include the traits
|
Ah that sucks. It would require a breaking release to make this change, then. (the particular error in question is fixed by putting the |
@Darksonn Thank you! Yes, I had to include the trait I am also wondering if I can get some suggestion on how to transform the map iterator to list iterator
Error:
|
…kable. However, this implementation does not allow duplicate keys and would overwrite existing ones similar to from_iter() behaviour. Initially introduced: PR tokio-rs#4272 - tokio-rs#4272 Note: can not implement Extend trait, since this method has stricter requirements than trait, as Extend does not require Key to be 'Eq + Hash' Issue tokio-rs#4774 - tokio-rs#4774
Version
This is version independent, but here you go:
Platform
This is platform independent, but here you go:
Description
tokio_stream::StreamMap::insert followed by tokio_stream::StreamMap::extend allows for duplicate keys.
I expected that the duplicate keys would overwrite existing ones like they are with from_iter (which calls insert, which calls remove).
However, I think it'd be best if both methods raised errors (see #4775) or at least returned a list of the replaced keys to match insert's functionality. This way, the calling code would at least know what keys were replaced.
Instead, you can see that the StreamMap now contains duplicate keys:
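The original code sample and its output are not reproduced here. As an illustration only, the mechanism described above can be modeled with a Vec-backed toy map: insert removes any existing entry first, while an extend that appends blindly lets duplicates through. ToyMap, extend_unchecked and extend_checked are hypothetical names for this sketch and are not the tokio-stream source.

```rust
/// Toy model of a Vec-backed map (illustrative only; not the tokio-stream source).
struct ToyMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: Eq, V> ToyMap<K, V> {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Removes any existing entry for `key` before pushing, so keys stay unique.
    fn insert(&mut self, key: K, value: V) -> Option<V> {
        let previous = self.remove(&key);
        self.entries.push((key, value));
        previous
    }

    /// Linear scan for `key`; returns its value if present.
    fn remove(&mut self, key: &K) -> Option<V> {
        let pos = self.entries.iter().position(|(k, _)| k == key)?;
        Some(self.entries.swap_remove(pos).1)
    }

    /// Appends without checking keys: this is where duplicates slip in.
    fn extend_unchecked(&mut self, iter: impl IntoIterator<Item = (K, V)>) {
        self.entries.extend(iter);
    }

    /// Routes every item through `insert`, so later values overwrite earlier ones.
    fn extend_checked(&mut self, iter: impl IntoIterator<Item = (K, V)>) {
        for (key, value) in iter {
            self.insert(key, value);
        }
    }
}
```

In this model, inserting "key" and then calling extend_unchecked with another "key" leaves two entries, whereas extend_checked leaves one holding the newer value.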