Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sequence functionality: API revisions #483

Merged
merged 7 commits into from
Jun 15, 2018
Merged

Sequence functionality: API revisions #483

merged 7 commits into from
Jun 15, 2018

Conversation

dhardy
Copy link
Member

@dhardy dhardy commented May 26, 2018

Lets start trying to implement dhardy#82 !

The API is not fixed but has been discussed in the above thread. Comments welcome (but please read the above first).

@pitdicker this is an upstream branch; do what you like with it!

@dhardy
Copy link
Member Author

dhardy commented May 27, 2018

I should note that this is not intended for merging within the 0.5 series, so will probably not be merged soon. At some point we should create a 0.5 branch and update master for 0.6, but not yet.

@dhardy dhardy force-pushed the seq branch 2 times, most recently from d697fd0 to 8f17ca6 Compare May 27, 2018 14:25
@dhardy
Copy link
Member Author

dhardy commented May 27, 2018

That completes the basic implementations. Still to do:

  • API critique / bikeshed — may be worth opening a new issue to discuss this
  • revise sample_indices somehow, probably (could be separate PR)
  • Is returning an iterator for SliceExt::choose_multiple appropriate? Is slice.choose_multiple(rng, m).collect() as fast as if we just output a vector?
  • any other algorithm adjustments?
  • module documentation
  • clean up / rebase

@dhardy

This comment has been minimized.

@vks
Copy link
Collaborator

vks commented May 28, 2018

Update minimum Rustc version to 1.26

Is this planned for Rand 0.6?

Instead of having choose_multiple*, couldn't we just provide an iterator that can be collected? (Would be nicer in cases where the size of the iterator is known.)

@dhardy
Copy link
Member Author

dhardy commented May 28, 2018

Updating the minimum Rustc version was to allow use of impl Trait here, but I ended up not using that anyway. No, not planned and it's probably a bit early yet. [Edit: commit removed. Saved in my 'misc' branch.]

The main motivation for choose_multiple_fill is no_std support. SliceExt::choose_multiple uses sample_indices which requires an allocator. I think any form of choose_multiple requires either O(m) storage and iteration over all n elements, or > O(m) storage (potentially O(n) as in sample_indices_inplace or O(m) plus dynamic storage for other mutated elements as in sample_indices_cache).

We could implement SliceExt::choose_multiple_fill with no_std support using an O(n) backing array, though it's probably more practical to use IteratorExt::choose_multiple_fill in this case.

I did try combining the IteratorExt::choose_multiple* functions since they use the same algorithm, but I don't think there's any safe way to make choose_multiple use the no_std-compatible choose_multiple_fill implementation. An iterator-based API still needs O(m) memory which must be passed in as a slice if there is no allocator.

@pitdicker
Copy link
Contributor

Lets start trying to implement dhardy#82 !

🎉 I'll have a good look tomorrow.

Nice to see all the deprecated functionality go, and it is less than I expected.

@dhardy
Copy link
Member Author

dhardy commented May 28, 2018

Yes, the "remove deprecations" thing will probably become it's own PR — but not before we have a 0.5 branch, and I'm not sure we should do that yet.

@pitdicker
Copy link
Contributor

We should probably be brave and have a couple of point releases first.

@dhardy dhardy added P-postpone Waiting on something else B-API Breakage: API labels Jun 1, 2018
@dhardy
Copy link
Member Author

dhardy commented Jun 2, 2018

TODO: benchmark whether partial_shuffle is actually an improvement over choose_multiple (which does in fact yield shuffled results).

@sicking
Copy link
Contributor

sicking commented Jun 3, 2018

I didn't see this PR existed before adding #490. There's some overlap (mainly in where the implementation of the choose methods should live), but so far they mainly look independent.

@dhardy dhardy added V-0.6 and removed P-postpone Waiting on something else labels Jun 7, 2018
@dhardy
Copy link
Member Author

dhardy commented Jun 8, 2018

@sicking would you please review this PR in more detail since it is close to your own #490? The big differences I see:

  • Weighted choices: I haven't done that yet; hopefully it can be added into this API somehow.
  • Methods on the Rng trait: IMO these methods should never be more than a wrapper around something else; I know @vks thinks most shouldn't even be there and I believe @pitdicker views these mostly as convenience methods. I guess we could keep the existing methods (choose, choose_mut and shuffle) as wrappers instead of deprecating (as in this PR), but I'm not keen on adding several new things there as your PR does.
  • Trait names: SliceRandom vs SliceExt, IteratorRandom vs IteratorExt: I don't have much opinion on this, though use rand::seq::SliceRandom; seems redundant. I don't know if SliceExt would be confusing or conflict for users?
  • This PR has a few extras methods on the traits: shuffle, choose_multiple, partial_shuffle

If some parts of this/your PR are contentious but we can find an agreeable sub-set that would also help make progress.

@dhardy dhardy added F-new-int Functionality: new, within Rand T-sequences E-question Participation: opinions wanted and removed V-0.6 labels Jun 8, 2018
Copy link
Contributor

@sicking sicking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the trait names. I whatever name that we choose, I think we should re-export the trait directly from lib.rs. So that people can do use rand::SliceX. No need to expose them to the internal module I think.

I personally like SliceRandom over SliceExt.

I guess even rand::SliceRandom is a bit redundant. But Ext feels like a pretty unclear abbreviation to me. And arguably also redundant for an extension trait.

Maybe SliceUtils?

Ultimately I don't care that much.

src/seq.rs Outdated
// Continue until the iterator is exhausted
for (i, elem) in self.enumerate() {
let k = rng.gen_range(0, i + 2);
if k == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for the temporary k? Seems like rng.gen_range(...) == 0 works as well.

Also, this is a perfect place to use gen_ratio.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gen_ratio gets a big speed-up:

# current
test seq_iter_choose_from_100                ... bench:       1,053 ns/iter (+/- 55)
# using (i + 2) as u32
test seq_iter_choose_from_100                ... bench:         491 ns/iter (+/- 16)
# using TryFrom
test seq_iter_choose_from_100                ... bench:         526 ns/iter (+/- 14)

Unfortunately TryFrom is still not stable, and also doesn't support working with usize directly (rust-lang/rust#49415). I'm not certain we should use as here and because this is an iterator without known exact size we have no choice but to check each time for correctness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Let's stick with what you have. This might be an argument for making gen_ratio generic over type though. But lets worry about that separately.

let mut len = 0;
while len < amount {
if let Some(elem) = self.next() {
buf[len] = elem;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results in a boundary check on each assignment. You could invert the loop and do something like:

for (i, slot) in buf.iter_mut() {
    if let Some(elem) = self.next() {
        *slot = elem;
    } else {
        return i;
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, and benchmarks show no difference. I would guess that the check can get eliminated:

        for (i, slot) in buf.iter_mut().enumerate() {
            if let Some(elem) = self.next() {
                *slot = elem;
            } else {
                // Iterator exhausted; stop early
                return i;   // length
            }
        }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer the approach this approach as it doesn't rely on the optimizer removing the check. But I'm old and crotchety :). Either way is totally fine with me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same performance, so I'd rather go with what looks nicer 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool

src/seq.rs Outdated
for (i, elem) in self.enumerate() {
let k = rng.gen_range(0, i + 1 + amount);
if k < amount {
buf[k] = elem;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar bounds check here. The existing code does

if Some(slot) = buf.get_mut(amount) {
    *slot = elem;
}

which I thought was neat.

But maybe in both these instances the optimizer is able to get rid of the bounds check? If that's the case I think what you have here is more readable.

Copy link
Member Author

@dhardy dhardy Jun 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea. How does one check that (other than by benchmark)? I tried looking at the generated code for a small example, but even the MIR is 1200 lines, and likely before optimisations. Assembly is 5800.

Since in this case there is a test k < amount directly before the buf[k] = ... (and amount = buf.len()), I would guess that the compiler can eliminate the redundant check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to play it safe and do the thing that doesn't rely on the optimizer, as long as it doesn't make the code meaningfully more ugly. Especially when in doubt. Not a big deal either way.

src/seq.rs Outdated
where R: Rng + ?Sized
{
let mut reservoir = Vec::with_capacity(amount);
reservoir.extend(self.by_ref().take(amount));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an extra indentation here.

///
/// Complexity is `O(n)` where `n` is the length of the iterator.
#[cfg(feature = "alloc")]
fn choose_multiple<R>(mut self, rng: &mut R, amount: usize) -> Vec<Self::Item>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bummer that we can't implement this by using choose_multiple_fill. Even using unsafe code I wasn't able to make something working that didn't result in more code than the current approach.

Separately, I kind'a liked the Result<Vec, Vec> approach. It definitely looked weird, but it seems likely that there's a bug if someone is trying to collect more items than exists, and returning a Result seems more likely to catch that bug. But I don't have strong feelings about it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried combining the implementations too. Problem is we want allocated-on-first-use storage which Vec gives us for free but isn't compatible with buffers; some kind of trait might be able to abstract over this, but as you said it just results in more code in total.

About the return type, I'm not sure; both have their merits. However what about consistency? I think it would be possible to make SliceExt::choose_multiple return a Result too, even using iterators as it does now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about consistency. Would make sense to return a Result<Iterator, Iterator> for SliceExt::choose_multiple.

You could even argue that we should be consistent with choose and return an Result<T, Something>, but that's more dubious.

I don't have strong feelings about this. Maybe sticking with what we have now makes for the cleanest API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The std library already uses Option in cases where None is kind-of an error. I don't think it makes sense to use Result<T, ()> (or NoInput instead of ()).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking that too. I think just returning a Vec/Iterator might be the cleanest solution.

src/seq.rs Outdated
i -= 1;
// lock element i in place.
self.swap(i, rng.gen_range(0, i + 1));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same change as for shuffle.

But more importantly, it's a little surprising to me to shuffle the end of the slice, rather than the beginning of it. Is there a reason for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried reversing the order of one of these shuffles and found the result slower. I copied @burdges implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the perf difference meaningful? I feel like it's quite surprising to operate on a slice from end, but I guess it could be ok as long as we are very clearly documenting it.

None
} else {
let len = self.len();
Some(&mut self[rng.gen_range(0, len)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put self.len() inline like in the previous function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Causes a failure with rustc 1.22

src/seq.rs Outdated
/// trait.SliceExt.html#method.choose_multiple).
#[cfg(feature = "alloc")]
#[derive(Debug)]
pub struct SliceChooseIter<'a, S: ?Sized + 'a, T: 'a> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also implement ExactSizeIterator for this struct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (comment not auto-hidden)

src/seq.rs Outdated
type Item = &'a T;

fn next(&mut self) -> Option<Self::Item> {
self.indices.next().map(|i| &(*self.slice)[i])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could call get_unchecked here to save the bounds-check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_eq!(v.iter().choose(&mut r), Some(&3));

let v: &[isize] = &[];
assert_eq!(v.choose(&mut r), None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests aren't very comprehensive. But I'm happy to add the tests I have in #490 over there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll look forward to it.

@sicking
Copy link
Contributor

sicking commented Jun 8, 2018

If we really want to take advantage of namespaces and avoid redundancy, maybe we should be so bold as simply calling the traits rand::Slice and rand::Iterator. But as someone that has lived in non-namespaced languages most of my life, those names definitely make me a little uncomfortable. I don't know if there would be any practical problems with it though?

#[cfg(feature = "alloc")]
fn choose_multiple<R>(&self, rng: &mut R, amount: usize) -> SliceChooseIter<Self, Self::Item>
where R: Rng + ?Sized
{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one could also be made to return a Result, but it's a bit ugly. Thoughts? Doesn't seem to have much overhead, other than the ugly code.

    fn choose_multiple<R>(&self, rng: &mut R, amount: usize) ->
        Result<SliceChooseIter<Self, Self::Item>, SliceChooseIter<Self, Self::Item>>
        where R: Rng + ?Sized
    {
        if amount <= self.len() {
            Ok(SliceChooseIter {
                slice: self,
                _phantom: Default::default(),
                indices: sample_indices(rng, self.len(), amount).into_iter(),
            })
        } else {
            let mut indices: Vec<usize> = (0..self.len()).collect();
            indices.shuffle(rng);
            Err(SliceChooseIter {
                slice: self,
                _phantom: Default::default(),
                indices: indices.into_iter(),
            })
        }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The neat-freak in me thinks that maybe we should just go with the old approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"old" meaning without the Result? @pitdicker what do you think (also about using Result to indicate not enough elements on the other choose_multiple above)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"old" meaning as you have it now, not a Result. Same with the other choose_multiple.

I want to be clear that I really don't have a strong opinion either way. Other than that I think it makes sense to be consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with the same problem in dhardy#82 (comment).

I don't think returning a result type for choose_multiple(n), so an iterator can return an error if it has less than n elements, is worth it. Checking the length of the Vec is just as good an indication there are less elements. And is returning less if there is simply nothing more available not just how APIs that work with iterators operate right now? For example .take(5).collect() does not guarantee you end up with 5 elements.

What should partial_shuffle(amount) do if amount is greater then the number of elements in the slice? I suppose it should simply shuffle the entire slice, not return some option or error.

@dhardy
Copy link
Member Author

dhardy commented Jun 9, 2018

@sicking I addressed some of your comments, though not the trait names (TBD).

@dhardy
Copy link
Member Author

dhardy commented Jun 13, 2018

Looking at languages / libraries I only found the following (most languages don't seem to have this functionality within a standard library):

  • Python supports random.choice(seq), random.choices(population, weights=.., cum_weights=.., k) (normal or cumulative weights can be used), random.sample(population, k)
  • GSL has gsl_ran_choose(rng, dest, k, src, n, size) (sampling without replacement) and gsl_ran_sample(rng, dest, k, src, n, size) (sampling with replacement)

In both cases, k is number of samples to take. GSL uses n for the number of elements in the source array and size for the size of elements.

We normally use verbs for functions (choose) rather than nouns (choice). Perhaps we should follow GSL's convention, at least in using the word choose for sampling-without-replacement (also used in "n choose k"; see Binomial coefficient), but we (probably?) still want both single and multiple versions. So:

  1. choose and choose_multiple
  2. choose and choose_k
  3. choose_single and choose
  4. choose_1 and choose
  5. choose (multiple variant only)

Thoughts? I prefer 1, 2 and 5 over the other options.

@vks
Copy link
Collaborator

vks commented Jun 13, 2018

What is the motivation for the variant that chooses one element? I don't see how choose_1(&mut rng) is preferable over choose(&mut rng, 1), so I would prefer option 5 over the others.

@dhardy
Copy link
Member Author

dhardy commented Jun 13, 2018

There is some motivation for the single-element version: (a) simpler, (b) Option<T> is nicer to use than [T] of length 0 or 1 and (c) doesn't require an allocator.

Edit: sorry, return type is Iterator<Item=T>, so motivation (b) is negligible.

I suppose we could add SliceRandom::choose_multiple_fill using only Floyd's algorithm for sampling; i.e. it wouldn't need an allocator but would be slow for large numbers of elements (more than 400-500 samples; also not optimal when the sample space is small relative to the sample size). Then motivation (c) is negated.

Edit: no we can't easily; we need a buffer for the index sampling; using the supplied element buffer would require transmuting and that the element size is at least large enough to represent the largest index and is very broken for types implementing Drop. However, IteratorRandom::choose_multiple_fill can still be used without an allocator; it's just less efficient. Alternatively we could expose a sample_indices_fill function for use without allocators.

Edit: also, sample_indices uses some heuristics to decide which algorithm to use and uses an allocator for the result; if you know that only 1 element is needed then this overhead can be avoided. Currently the heuristics cannot be calculated at compile time if amount is known but length is not; we should probably add a special case here for very small amounts.

@sicking
Copy link
Contributor

sicking commented Jun 13, 2018

One thing to keep in mind is that @pitdicker has a different implementation of choose_multiple over in https://github.com/pitdicker/rand/blob/slice_random/src/seq.rs, which looks like it doesn't require allocation.

I don't know if there's a meaningful performance difference between its choose_multiple(1) and our current choose though.

I too have a preference for 1, 2, and 5 (though don't have an opinion about if we use "choose", "pick" or "sample" as base-word).

One important aspect is that it might not be worth having APIs for both "choose one" and "choose multiple" if neither is a common operation. Searching github I see 7K hits for "gen_range", almost all of which are calls to rand. Searching for "choose" gives 5K hits, of which below 1/10 are calls to rand. So gen_range is an order of magnitude more used than choose.

I think I like the option to just add a choose_k method for now. That keeps the name clear but still not too annoying even if you just want one item (whereas choose_multiple(1) arguably is a little awkward). And if people complain, or we see a lot more usage, it'd be a non-breaking change to add choose later.

@dhardy
Copy link
Member Author

dhardy commented Jun 14, 2018

A quick benchmark shows @pitdicker's pick_multiple has its merits:

test seq_shuffle_100                     ... FAILED
test seq_slice_choose_multiple_10_of_100 ... bench:         561 ns/iter (+/- 21)
test seq_slice_choose_multiple_1k_of_1M  ... bench:      50,009 ns/iter (+/- 16,255)

vs what's in this branch:

test seq_shuffle_100                     ... bench:       1,028 ns/iter (+/- 23)
test seq_slice_choose_multiple_10_of_100 ... bench:         147 ns/iter (+/- 2)
test seq_slice_choose_multiple_1k_of_1M  ... bench:      67,401 ns/iter (+/- 4,221)

(Both exclude my sample-indices optimisations in #479, though for these bounds it shouldn't matter much.)

@dhardy
Copy link
Member Author

dhardy commented Jun 14, 2018

Re-run, with two different ways of choosing 1 of 1000:

# pitdicker's branch
test seq_slice_choose_1_of_1000          ... bench:           5 ns/iter (+/- 0)
test seq_slice_choose_multiple_10_of_100 ... bench:         562 ns/iter (+/- 14)
test seq_slice_choose_multiple_1_of_1000 ... bench:          56 ns/iter (+/- 3)
test seq_slice_choose_multiple_1k_of_1M  ... bench:      48,229 ns/iter (+/- 5,231)

# this branch
test seq_slice_choose_1_of_1000              ... bench:           4 ns/iter (+/- 0)
test seq_slice_choose_multiple_10_of_100     ... bench:         146 ns/iter (+/- 1)
test seq_slice_choose_multiple_1_of_1000     ... bench:          78 ns/iter (+/- 5)

So I think there's good reason to have a single-sample choose method (#479 may help but definitely not this much).

Edit: I merged #479 into this branch and got:

test seq_iter_choose_from_100                ... bench:         550 ns/iter (+/- 2)
test seq_iter_choose_multiple_10_of_100      ... bench:       1,129 ns/iter (+/- 10)
test seq_iter_choose_multiple_fill_10_of_100 ... bench:       1,099 ns/iter (+/- 57)
test seq_shuffle_100                         ... bench:       1,027 ns/iter (+/- 20)
test seq_slice_choose_1_of_1000              ... bench:           4 ns/iter (+/- 0)
test seq_slice_choose_multiple_10_of_100     ... bench:         112 ns/iter (+/- 1)
test seq_slice_choose_multiple_1_of_1000     ... bench:          28 ns/iter (+/- 2)
test seq_slice_choose_multiple_1k_of_1M      ... bench:      60,218 ns/iter (+/- 1,671)

The seq_slice_choose_multiple_1_of_1000 benchmark went from 78 → 28 ns, but still much greater than 4 ns!

@dhardy
Copy link
Member Author

dhardy commented Jun 14, 2018

I cleaned up this branch, squashing most of the fixes.

IteratorRandom::choose now uses gen_bool instead of gen_ratio because casting usizeu32f64 doesn't make much sense. I also removed the checks, because if you're iterating over 2^53 elements then accuracy isn't going to be your biggest problem!

I kept the method names choose and choose_multiple for now. The brevity of choose_k is nice, but I don't know if it would be obvious enough why we use k.

It would be nice to get this merged now, even if some details might still change. Any more comments before doing so? I'm happy to wait a day or two for reviews.

My observations:

Copy link
Contributor

@sicking sicking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. Just some minor comments.

benches/seq.rs Outdated
let mut buf = [0; 1];
b.iter(|| {
for (v, slot) in x.choose_multiple(&mut rng, buf.len()).zip(buf.iter_mut()) {
*slot = *v;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to make this more comparable to the choose_1_of_1000 test, you'll want to avoid the write into the buffer here. Something like the following feels like the most fair comparison:

b.iter(|| {
    x.choose_multiple(&mut rng, 1).next()
})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just calling next() doesn't do anything but return an iterator shim designed to be optimised into something else. Maybe .cloned().next()...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't calling next() return an Option<&usize>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.cloned() should make it return a copy.

Anyway, this change appears to change performance from 82ns to 79ns (higher baseline because I didn't merge other branch here).

let mut buf = [0; 10];
b.iter(|| {
for (v, slot) in x.choose_multiple(&mut rng, buf.len()).zip(buf.iter_mut()) {
*slot = *v;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically this does measure both the time to call choose_multiple and the time to write into buf, whereas we only really care about the former.

You could do something like x.choose_multiple(&mut rng, 10).sum::<usize>().

But I kind'a worry that the compiler would be smart enough to realize that since all values in x are 1, that it might optimize to simply returning 10. That could be fixed by simply changing one of the values in x though.

But since the writes are consecutive, they should be fast and maybe we shouldn't worry about this? I don't have strong feelings either way.

As a complete aside, it'd be really great if there was a version of Bencher::iter which you could return an Iterator to, and it would consume the iterator. But I'm not sure if there's any appetite for changing that API since apparently the idea is to deprecate it.

Copy link
Member Author

@dhardy dhardy Jun 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is important. (I have tested that writing to a buffer is faster than collecting into a vector.)

}

#[bench]
fn seq_iter_choose_multiple_10_of_100(b: &mut Bencher) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function need to be flagged with #[cfg(feature = "alloc")]? Or does #[bench] only work in std builds anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cfg'd the whole module and tested (cargo +nightly test --benches --no-default-features [--features=alloc]). The bit I don't get is why there are no error messages with alloc without std, but there aren't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, cool.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cfg thing made the tests work by effectively removing all the benchmarks — took me a while to work out why no benchmarks were getting run. But we don't need to benchmark without std anyway.

///
/// An implementation is provided for slices. This may also be implementable for
/// other types.
pub trait SliceRandom {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be pub since we're exporting it through prelude instead? Or are we fine with committing to exposing this trait as seq::SliceRandom as part of the public API. I don't know what is common practice in Rust. This is also a very minor point which we can worry about polishing later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prelude is not supposed to be the primary place to expose any functionality. I don't really see a reason not to expose this name here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to not just rely on prelude. I guess I was partially thinking of if we should expose it as rand::seq::SliceRandom or as rand::SliceRandom (by re-exposing it in lib.rs). Not something I have strong opinions on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been wondering if we should move the Distribution trait to rand::Distribution... but for now lets leave it like this.

src/seq.rs Outdated
///
/// Depending on the implementation, complexity is expected to be `O(m)`,
/// where `m = amount`.
fn partial_shuffle<R>(&mut self, rng: &mut R, amount: usize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs should mention that the shuffled items are at the end of the passed in array. I.e. that this function shuffles from the end.

Also, why the "Depending on the implementation"? Seems like we can promise that complexity is O(m)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we mention this here it becomes part of the function specification, i.e. the implementation can't change it in case it breaks users' assumptions.

Okay for complexity comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we're mutating a slice that the caller has access to, I think changing that would be a breaking change either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking in what way? Breaking changes usually mean changing the API or violating prior specification, but if there's no specification on this then there's nothing to break.

Don't forget that the remainder (first part of the slice) does get modified too (elements swapped from the last part), so there's not much the user could depend on.

src/seq.rs Outdated
/// This is an efficient method to select `amount` elements at random from
/// the slice, provided the slice may be mutated.
///
/// If you only need to chose elements randomly and `amount > self.len()/2`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"chose" -> "choose"

src/seq.rs Outdated
for (i, elem) in self.enumerate() {
let k = rng.gen_range(0, i + 1 + amount);
if k < amount {
buf[k] = elem;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be consistent about how we do this check. Right now choose_multiple_fill does if k < amount whereas choose_multiple does if let Some(spot) = reservoir.get_mut(k) {. I'd tend to using the latter but either is fine as long as we're consistent.

@sicking
Copy link
Contributor

sicking commented Jun 14, 2018

Yeah, I think we should merge this. I'll close #490 since I think the useful parts that's left there are better done as separate PRs for tests and for weighted sampling. The latter definitely needs more discussion about API.

And I think it's more important that we get the API nailed than the implementation. The latter is easy to change in future PRs, but the API is blocking other work.

benches/seq.rs Outdated
let mut buf = [0; 1];
b.iter(|| {
for (v, slot) in x.choose_multiple(&mut rng, buf.len()).zip(buf.iter_mut()) {
*slot = *v;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't calling next() return an Option<&usize>?

}

#[bench]
fn seq_iter_choose_multiple_10_of_100(b: &mut Bencher) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, cool.

///
/// An implementation is provided for slices. This may also be implementable for
/// other types.
pub trait SliceRandom {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to not just rely on prelude. I guess I was partially thinking of if we should expose it as rand::seq::SliceRandom or as rand::SliceRandom (by re-exposing it in lib.rs). Not something I have strong opinions on.

src/seq.rs Outdated
///
/// Depending on the implementation, complexity is expected to be `O(m)`,
/// where `m = amount`.
fn partial_shuffle<R>(&mut self, rng: &mut R, amount: usize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we're mutating a slice that the caller has access to, I think changing that would be a breaking change either way.


fn next(&mut self) -> Option<Self::Item> {
// TODO: investigate using SliceIndex::get_unchecked when stable
self.indices.next().map(|i| &(*self.slice)[i])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slice::get_unchecked looks like it has been marked stable since 1.0?
https://doc.rust-lang.org/src/alloc/slice.rs.html#419-423

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but all we know about self.slice is the type bounds: 'a, S: Index<usize, Output = T> + ?Sized + 'a, T: 'a. We need to use the SliceIndex trait, which apparently is going to be marked stable soon, but with most of the methods still unstable.

@dhardy
Copy link
Member Author

dhardy commented Jun 15, 2018

Updated (rebase, but only minor tweaks)

@dhardy dhardy merged commit 8f2290f into master Jun 15, 2018
@dhardy dhardy deleted the seq branch June 15, 2018 14:15
@dhardy dhardy mentioned this pull request Jun 20, 2018
28 tasks
@dhardy dhardy modified the milestone: 0.6 release Aug 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-API Breakage: API E-question Participation: opinions wanted F-new-int Functionality: new, within Rand
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants