Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PriorityQueue: use O(logN) algorithm for dequeue!(pq, key) #8011

Merged
merged 3 commits into from
Aug 19, 2014
Merged

Conversation

timholy
Copy link
Member

@timholy timholy commented Aug 15, 2014

This PR should speed up some of my own work by a factor of 10^4 or so. The main highlight is switching from an O(N) algorithm to one that's O(logN) in dequeue!(pq, key). That change should not be controversial.

However, I made a couple of other changes that might merit some discussion:

  • The old version contained a field o::Ordering, but Ordering is an abstract type. Consequently calls that involved lt had to use generic fallback paths, and allocated lots of memory. I added the specific ordering type to the parameters. However, this is breaking since anyone who formerly wrote pq = PriorityQueue{K,V}() will have to change it. Because it's quite irritating to have to write pq = PriorityQueue{K,V,typeof(Base.Order.Forward)}(), I added the following constructor variant: pq = PriorityQueue(K,V). I am a little worried this might cause trouble in ways I'm not thinking of. The performance enhancement from this change is not as huge at the change in the dequeue(pq, key) algorithm (as you'll see below), but it's still a factor of 2 or so and it affects basically all operations with PriorityQueues. So to me it seems worth it.
  • Because tuples still aren't fully optimized, I switched to using an immutable Pair for storing the key, value pairs.

Here are some benchmark results using a version identical to what I checked in here, but outside of Base so that it allows direct comparison in the same session:

import Base.Collections
import Collections2

function rundequeue1(pq::Collections.PriorityQueue)
    i = 0
    while !isempty(pq)
        Collections.dequeue!(pq)
    end
end
function rundequeue1(pq::Collections2.PriorityQueue)
    i = 0
    while !isempty(pq)
        Collections2.dequeue!(pq)
    end
end

function rundequeue(pq::Collections.PriorityQueue, indexes)
    for i in indexes
#         i == Collections.dequeue!(pq, i) || error("whoops")
        Collections.dequeue!(pq, i)
    end
end
function rundequeue(pq::Collections2.PriorityQueue, indexes)
    for i in indexes
#         i == Collections2.dequeue!(pq, i) || error("whoops")
        Collections2.dequeue!(pq, i)
    end
end

n = 10
r = rand(n)
p = sortperm(r)
pq = Collections.PriorityQueue(1:n, r)
# rundequeue1(pq, p)
rundequeue1(pq)
pq = Collections2.PriorityQueue(1:n, r)
# rundequeue1(pq, p)
rundequeue1(pq)
pq = Collections.PriorityQueue(1:n, r)
rundequeue(pq, 1:n)
pq = Collections2.PriorityQueue(1:n, r)
rundequeue(pq, 1:n)
pq = Collections2.PriorityQueue(1:n, r)
@time rundequeue1(pq)  # run @time once so we get good allocation stats

n = 10^4
r = rand(n)
pq = Collections.PriorityQueue(1:n, r)
gc()
@time rundequeue1(pq)
pq = Collections2.PriorityQueue(1:n, r)
gc()
@time rundequeue1(pq)
pq = Collections.PriorityQueue(1:n, r)
gc()
@time rundequeue(pq,1:n)
pq = Collections2.PriorityQueue(1:n, r)
gc()
@time rundequeue(pq,1:n)

n = 10
# ks = [(i, i) for i = 1:n]
ks = Collections2.Pair{Int,Int}[Collections2.Pair(i,i) for i = 1:n]
r = rand(n)
sks = ks[sortperm(r)]
pq = Collections.PriorityQueue(ks, r)
rundequeue1(pq)
pq = Collections2.PriorityQueue(ks, r)
rundequeue1(pq)
pq = Collections.PriorityQueue(ks, r)
rundequeue(pq, ks)
pq = Collections2.PriorityQueue(ks, r)
rundequeue(pq, ks)

n = 10^4
# ks = (Int,Int)[(i, i) for i = 1:n]
ks = Collections2.Pair{Int,Int}[Collections2.Pair(i,i) for i = 1:n]
r = rand(n)
pq = Collections.PriorityQueue(ks, r)
gc()
@time rundequeue1(pq)
pq = Collections2.PriorityQueue(ks, r)
gc()
@time rundequeue1(pq)
pq = Collections.PriorityQueue(ks, r)
gc()
@time rundequeue(pq, ks)
pq = Collections2.PriorityQueue(ks, r)
gc()
@time rundequeue(pq, ks)

Results:

elapsed time: 0.000714793 seconds (13848 bytes allocated)  # @time warmup
elapsed time: 0.02635067 seconds (9735568 bytes allocated)   # old dequeue(pq), Int keys
elapsed time: 0.008842291 seconds (80 bytes allocated)           # new dequeue(pq), Intkeys
elapsed time: 2.598704021 seconds (104 bytes allocated)      # old dequeue(pq, key), Int keys
elapsed time: 0.017406964 seconds (104 bytes allocated)      # new dequeue(pq, key), Int keys
# Rest are with Pair{Int,Int} keys
elapsed time: 0.030444504 seconds (10851880 bytes allocated)
elapsed time: 0.01073713 seconds (80 bytes allocated)
elapsed time: 2.920255148 seconds (80 bytes allocated)
elapsed time: 0.017320849 seconds (80 bytes allocated)

Since my own problems are more likely to be in the 10^6-10^7 range, the actual expected benefits are of course much larger.

@quinnj
Copy link
Member

quinnj commented Aug 19, 2014

Just a speedup of 10^4, huh? I think this is great and 0.4 is the time to break stuff, right? (particularly for big speedups).

@StefanKarpinski
Copy link
Member

Yeah, I'd say go for it. Put the change in NEWS and describe how it breaks.

timholy added a commit that referenced this pull request Aug 19, 2014
PriorityQueue: use O(logN) algorithm for dequeue!(pq, key)
@timholy timholy merged commit 4e907b8 into master Aug 19, 2014
@timholy timholy deleted the teh/pq branch August 19, 2014 15:46
@timholy
Copy link
Member Author

timholy commented Aug 19, 2014

@quinnj, yes, I'm hoping to use this as my submission for the next contestant round on AmericanPerformanceIdol.

@StefanKarpinski
Copy link
Member

LoL @ AmericanPerformanceIdol

@timholy
Copy link
Member Author

timholy commented Aug 19, 2014

Documentation done. Thanks for looking!

@IainNZ
Copy link
Member

IainNZ commented Aug 20, 2014

A thought for smaller breaking changes like this: if its not 100% obvious how to maintain 0.3 and 0.4 compatibility, it would be useful to provide a snippet showing how to do so

@JeffBezanson
Copy link
Member

To harp on one of my usual themes, it would have been better to do the optimization and adding the parameter in separate commits. Since it's early in 0.4 I think we'll be able to keep all of this though.

@timholy
Copy link
Member Author

timholy commented Aug 20, 2014

Will keep that in mind. Thanks for the reminder.

stevengj added a commit that referenced this pull request Aug 22, 2014
@fxbrain fxbrain mentioned this pull request Aug 28, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants