-
-
Notifications
You must be signed in to change notification settings - Fork 433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simpler and faster implementation of Floyd's F2 #1277
Conversation
On my machine - an M1 MacBook I get the following benchmarks: before:
After:
|
The 8% slower performance on
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good simplification. The standard proof-by-induction of Floyd's combination algorithm can be extended to prove that the output is fully shuffled. It's a little tedious. A copy of my notes for those interested.
Proof by induction
for j in (len - amount) .. len {
let t = rng.gen_range(0..=j);
if X(t) {
choose j
} else {
choose t
}
}
Where:
m = amount
X(i) = event that i is chosen
Expect:
P(X(i)) = m / len
Standard proof by induction, using notation:
Xm(i) = event that i is chosen by (no later than) step m
Xm1(i) = event i chosen at step m - 1
Rlen = a random value from 0..=len
Case m=1 is trivial.
Assume m > 1 and correct for previous step:
Assumption: for i in 0..(len-1) { P(Xm1(i)) = (amount-1) / (len-1) }
Case: i in 0..(len-1)
Xm(i) = Xm1(i) ⊻ {Rlen=i}
P(Xm(i)) = P(Xm1(i)) + P(!Xm(i))*(1/len)
P(Xm(i)) = (m-1)/(len-1) + (len-1-m+1)/(len-1)*(1/len)
P(Xm(i)) = len*(m-1)/(len*(len-1)) + (len-1-m+1)/(len*(len-1))
P(Xm(i)) = (len*m-m)/(len*(len-1))
P(Xm(i)) = m/len
Case: i=len-1
Xm(i) = {Rlen=i} ⊻ {Rlen=k where Xm1(k)}
P(Xm(i)) = 1/len + Sum(k in 0..len){1/len * P(Xm1(k))}
P(Xm(i)) = 1/len + (len-1)/len * (m-1)/(len-1)
P(Xm(i)) = 1/len + 1/len * (m-1)
P(Xm(i)) = m/len
Ok.
Fully shuffled variant:
for j in (len - amount) .. len {
let t = rng.gen_range(0..=j);
if X(t) {
choose j
swap elts x, j
} else {
choose t
}
}
Proof by induction:
X(i,j) = event that i is chosen and appears in output at position j
Expect:
P(X(i,j)) = P(X(i)) / m = 1 / len
Assume for step m-1:
Assumption: for i in 0..(len-1), j in 0..(m-1) { P(Xm1(i,j)) = 1/(len-1) }
Case: i in 0..(len-1), j in 0..(m-1)
Xm(i,j) = if Xm1(Rlen,j) then 0 else Xm1(i,j)
Xm(i,j) = !Xm1(Rlen,j) AND Xm1(i,j)
P(Xm(i,j)) = (1 - Sum(k in 0..(len-1)){1/len * 1/(len-1)}) / (len-1)
P(Xm(i,j)) = (1 - 1/len) / (len-1)
P(Xm(i,j)) = (len - 1) / len / (len-1)
P(Xm(i,j)) = 1/len
Case: i in 0..(len-1), j=m-1: P(Xm1(i,j)) = 0
Xm(i,j) = Rlen=i
P(Xm(i,j)) = 1/len
Case: i=len-1, j=m-1:
Xm(i,j) = Rlen=i
P(Xm(i,j)) = 1/len
Case: i=len-1, j in 0..(m-1):
Xm(i,j) = Xm1(Rlen, j)
P(Xm(i,j)) = Sum(k in 0..(len-1)){1/len / (len-1) = 1/len
Ok.
The one thing that needs changing is the CHANGELOG: this is a value-breaking change. Affected API must be noted.
The previous implementation used either `Vec::insert` or a second F-Y shuffling phase to achieve fair random order. Instead, use the random numbers already drawn to achieve a fair shuffle.
Have added some notes on this, let me know if this is worded the way you need - thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
The previous implementation used either
Vec::insert
or a second F-Y shuffling phase to achieve fair random order. Instead, use the random numbers already drawn to achieve a fair shuffle.