Add PagedMergeSort, a merge sort using O(√n) space #71

LSchwerdt · 2023-02-06T10:05:45Z

As far as I am aware, there is no stable sorting algorithm in Julia that uses less than O(n) auxiliary space. This PR fixes that, by implementing PagedMergeSort, a merge sort using O(√n) space while achieving about the same speed as the regular MergeSort from Base.Sort.

This is done by using a merge routine that splits the array into blocks/pages of size √n, merges into these blocks, and than rearranges them using a page table. This is illustrated below. The auxiliary space is on the right.

The basic idea is laid out here, but I use a different page table and reordering scheme than the author of that post. By using the inverse permutation in the page table (ie. storing the location of the data belonging in block i in the page table at index i), we can follow the permutation cycles during reordering without swapping, copying the correct data into the previously emptied location.

The additional data movement from reordering is almost negligible compared to the merging, and the main merging loop has only one condition (merge until block is full) as opposed to two (merge until either source array runs out), so the performance is about the same as the regular merge sort.

At deeper recursion levels, where the scratch space is big enough, normal merging is used, where one input is copied into the scratch space. When the scratch space is large enough to hold the complete subarray, the input is merged interleaved from both sides, which increases performance for random data, making PagedMergeSort actually faster than MergeSort. (But it decreases performamce for presorted data). Benchmarks are below. They can be replicated by running this Pluto notebook.

If someone needs a sorting algorithm with less than O(n) memory usage, sorting is probably a significant aspect of their whole program, so I also included a multithreaded version of PagedMergeSort.

Checklist before merging

Wait for Add comment to broken test and fix it by using Base.Sort.InitialOptimizations #70 to be merged, and include initial optimizations. This is necessary to pass the test for sorting floats with NaNs.
This PR adds a dependency on StaticArrays.jl. Maybe that can be avoided. If not, version compatibility must be checked.
PagedMerge! uses Base.midpoint, which was added in version 1.4 in Base.Sort and later moved to Base. To ensure compatibility with older Julia version, this should be accounted for. -> Redefined function here for compatibiliy with older Julia versions.
Threads.@Spawn was introduced after Julia 1.0. Since this package aims to be compatible with Julia 1.0, I see two options:
- Create a second package for multithreaded sorting algorithms with different Julia version requirements.
- Use the singlethreaded or multithreaded algorithm depending on the Julia version, when calling the multithreaded one. (implemented currently in this PR)
I am not 100% happy with using a macro (getNextBlock!) instead of a function, but I could not come up with a better way.
Rename buffer/buf to scratch or to t? -> renamed to scratch
Improve tests to find errors that were made here
- Stable sorting with by=Returns(0)
- Odd input length
CI only tests the fallback to PagedMergeSort for ThreadedPagedMergeSort. -> deleted ThreadedPagedMergeSort for now
Rebase on current master.

Some notes

The basic idea of the algorithm seems not that complicated. Maybe it was already published under a different name? Edit: I found it.
There are stable sorting algorithms that are in-place, but this is easier. And I think allocating 1 MiB to sort 8 GiB is practically in-place for most applications.
Moving the blocks before merging actually allows for even less data movement, but the required bookeeping is more complicated.
The two-ended merge can also be applied at all recursion levels to get an even faster merge sort algorithm, that uses auxilary memory the size of the input.
Splitting the processed data into blocks and rearranging them later can also be used for distributive sorting, see IPS4o.

Benchmarks

TLDR: PagedMergeSort is slightly faster than MergeSort, but slower than the default radix sort on random data. It allocates less than 400 KiB to sort a 1 GiB vector. Both MergeSort and PagedMergeSort are faster for sortperm! than the default ScratchQuicksort when sorting large vectors. The speedup depends on memory latency. ThreadedPagedMergeSort is the fastest algorithm for all tested inputs and machines (but it is the only multithreaded one).

Julia Version 1.9.0-beta3
Commit 24204a7344 (2023-01-18 07:20 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × Intel(R) Core(TM) i7-7820X CPU @ 3.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake-avx512)
  Threads: 8 on 16 virtual cores
Environment:
  JULIA_DEPOT_PATH = C:\Users\schwerdt\.julia;D:\Sonstiges\JuliaTest\julia-1.9.0-beta3\local\share\julia;D:\Sonstiges\JuliaTest\julia-1.9.0-beta3\share\julia
  JULIA_IMAGE_THREADS = 1
  JULIA_LOAD_PATH = @;@v#.#;@stdlib
  JULIA_REVISE_WORKER_ONLY = 1

sort! T=UInt64 n=131_072 (1.0 MiB)
default:                      0.003205 seconds (3 allocations: 1.008 MiB)
MergeSort:                    0.009005 seconds (2 allocations: 512.047 KiB)
PagedMergeSort:               0.006894 seconds (2 allocations: 11.625 KiB)
PagedMergeSort, 8 threads:    0.002509 seconds (116 allocations: 103.766 KiB)

sortperm! T=UInt64 n=131_072 (1.0 MiB)
default:                      0.006763 seconds (2 allocations: 1.000 MiB)
MergeSort:                    0.010892 seconds (2 allocations: 512.047 KiB)
PagedMergeSort:               0.010665 seconds (2 allocations: 11.625 KiB)
PagedMergeSort, 8 threads:    0.002941 seconds (117 allocations: 104.031 KiB)

sort! T=UInt64 n=8_388_608 (64.0 MiB)
default:                      0.282875 seconds (3 allocations: 64.008 MiB, 1.28% gc time)
MergeSort:                    0.808292 seconds (2 allocations: 32.000 MiB)
PagedMergeSort:               0.631584 seconds (4 allocations: 90.656 KiB)
PagedMergeSort, 8 threads:    0.148540 seconds (209 allocations: 745.922 KiB)

sortperm! T=UInt64 n=8_388_608 (64.0 MiB)
default:                      4.226174 seconds (2 allocations: 64.000 MiB, 0.01% gc time)
MergeSort:                    1.399869 seconds (2 allocations: 32.000 MiB)
PagedMergeSort:               1.378963 seconds (4 allocations: 90.656 KiB)
PagedMergeSort, 8 threads:    0.453981 seconds (210 allocations: 746.438 KiB)

sort! T=UInt64 n=134_217_728 (1.0 GiB)
default:                      4.341338 seconds (3 allocations: 1.000 GiB, 0.07% gc time)
MergeSort:                   15.697732 seconds (2 allocations: 512.000 MiB, 0.01% gc time)
PagedMergeSort:              12.191662 seconds (4 allocations: 362.219 KiB)
PagedMergeSort, 8 threads:    2.692278 seconds (210 allocations: 2.850 MiB)

sortperm! T=UInt64 n=134_217_728 (1.0 GiB)
default:                    158.911205 seconds (2 allocations: 1.000 GiB, 0.00% gc time)
MergeSort:                   33.823217 seconds (2 allocations: 512.000 MiB, 0.27% gc time)
PagedMergeSort:              34.031642 seconds (4 allocations: 362.219 KiB)
PagedMergeSort, 8 threads:   10.372993 seconds (210 allocations: 2.851 MiB)

Julia Version 1.9.0-beta3
Commit 24204a7344 (2023-01-18 07:20 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_DEPOT_PATH = C:\Users\Luke\.julia;C:\Users\Luke\.julia\juliaup\julia-1.9.0-beta3+0.x64.w64.mingw32\local\share\julia;C:\Users\Luke\.julia\juliaup\julia-1.9.0-beta3+0.x64.w64.mingw32\share\julia
  JULIA_IMAGE_THREADS = 1
  JULIA_LOAD_PATH = @;@v#.#;@stdlib
  JULIA_NUM_THREADS = auto
  JULIA_REVISE_WORKER_ONLY = 1

sort! T=UInt64 n=131_072 (1.0 MiB)
default:                      0.010040 seconds (3 allocations: 1.008 MiB)
MergeSort:                    0.015199 seconds (2 allocations: 512.047 KiB)
PagedMergeSort:               0.010011 seconds (2 allocations: 11.625 KiB)
PagedMergeSort, 8 threads:    0.004095 seconds (115 allocations: 103.734 KiB)

sortperm! T=UInt64 n=131_072 (1.0 MiB)
default:                      0.011959 seconds (2 allocations: 1.000 MiB)
MergeSort:                    0.017257 seconds (2 allocations: 512.047 KiB)
PagedMergeSort:               0.018372 seconds (2 allocations: 11.625 KiB)
PagedMergeSort, 8 threads:    0.007372 seconds (117 allocations: 104.031 KiB)

sort! T=UInt64 n=8_388_608 (64.0 MiB)
default:                      0.668358 seconds (3 allocations: 64.008 MiB, 0.57% gc time)
MergeSort:                    0.997630 seconds (2 allocations: 32.000 MiB)
PagedMergeSort:               0.845741 seconds (4 allocations: 90.656 KiB)
PagedMergeSort, 8 threads:    0.223406 seconds (212 allocations: 746.016 KiB)

sortperm! T=UInt64 n=8_388_608 (64.0 MiB)
default:                      5.376910 seconds (2 allocations: 64.000 MiB, 0.01% gc time)
MergeSort:                    1.776138 seconds (2 allocations: 32.000 MiB)
PagedMergeSort:               1.772253 seconds (4 allocations: 90.656 KiB)
PagedMergeSort, 8 threads:    0.673832 seconds (208 allocations: 746.375 KiB)

sort! T=UInt64 n=134_217_728 (1.0 GiB)
default:                      7.397737 seconds (3 allocations: 1.000 GiB, 0.05% gc time)
MergeSort:                   19.119953 seconds (2 allocations: 512.000 MiB, 0.67% gc time)
PagedMergeSort:              15.683231 seconds (4 allocations: 362.219 KiB)
PagedMergeSort, 8 threads:    4.070665 seconds (212 allocations: 2.850 MiB)

sortperm! T=UInt64 n=134_217_728 (1.0 GiB)
default:                    176.440430 seconds (2 allocations: 1.000 GiB, 0.07% gc time)
MergeSort:                   41.310611 seconds (2 allocations: 512.000 MiB, 0.59% gc time)
PagedMergeSort:              40.876463 seconds (4 allocations: 362.219 KiB)
PagedMergeSort, 8 threads:   16.644219 seconds (214 allocations: 2.851 MiB)

Julia Version 1.9.0-beta3
Commit 24204a7344 (2023-01-18 07:20 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 24 × AMD Ryzen 9 3900X 12-Core Processor            
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, znver2)
  Threads: 12 on 24 virtual cores
Environment:
  JULIA_DEPOT_PATH = C:\Users\Luke\.julia;C:\Users\Luke\.julia\juliaup\julia-1.9.0-beta3+0.x64.w64.mingw32\local\share\julia;C:\Users\Luke\.julia\juliaup\julia-1.9.0-beta3+0.x64.w64.mingw32\share\julia
  JULIA_IMAGE_THREADS = 1
  JULIA_LOAD_PATH = @;@v#.#;@stdlib
  JULIA_REVISE_WORKER_ONLY = 1

sort! T=UInt64 n=131_072 (1.0 MiB)
default:                      0.002453 seconds (3 allocations: 1.008 MiB)
MergeSort:                    0.007748 seconds (2 allocations: 512.047 KiB)
PagedMergeSort:               0.006455 seconds (2 allocations: 11.625 KiB)
PagedMergeSort, 12 threads:   0.001617 seconds (134 allocations: 150.641 KiB)

sortperm! T=UInt64 n=131_072 (1.0 MiB)
default:                      0.005796 seconds (2 allocations: 1.000 MiB)
MergeSort:                    0.009661 seconds (2 allocations: 512.047 KiB)
PagedMergeSort:               0.009557 seconds (2 allocations: 11.625 KiB)
PagedMergeSort, 12 threads:   0.002576 seconds (130 allocations: 150.750 KiB)

sort! T=UInt64 n=8_388_608 (64.0 MiB)
default:                      0.274773 seconds (3 allocations: 64.008 MiB)
MergeSort:                    0.693271 seconds (2 allocations: 32.000 MiB)
PagedMergeSort:               0.558476 seconds (4 allocations: 90.656 KiB)
PagedMergeSort, 12 threads:   0.106098 seconds (393 allocations: 1.102 MiB)

sortperm! T=UInt64 n=8_388_608 (64.0 MiB)
default:                      1.129249 seconds (2 allocations: 64.000 MiB)
MergeSort:                    1.175415 seconds (2 allocations: 32.000 MiB)
PagedMergeSort:               1.118836 seconds (4 allocations: 90.656 KiB)
PagedMergeSort, 12 threads:   0.369979 seconds (394 allocations: 1.103 MiB)

sort! T=UInt64 n=134_217_728 (1.0 GiB)
default:                      6.363665 seconds (3 allocations: 1.000 GiB, 0.01% gc time)
MergeSort:                   13.181063 seconds (2 allocations: 512.000 MiB)
PagedMergeSort:              10.717851 seconds (4 allocations: 362.219 KiB)
PagedMergeSort, 12 threads:   1.958344 seconds (401 allocations: 4.285 MiB)

sortperm! T=UInt64 n=134_217_728 (1.0 GiB)
default:                     34.260657 seconds (2 allocations: 1.000 GiB, 0.00% gc time)
MergeSort:                   29.027326 seconds (2 allocations: 512.000 MiB)
PagedMergeSort:              28.057972 seconds (4 allocations: 362.219 KiB)
PagedMergeSort, 12 threads:   7.543024 seconds (394 allocations: 4.286 MiB)

Adds the PagedMergeSort algorithm, a merge sort with O(sqrt n) auxiliary space usage.

codecov-commenter · 2023-02-07T09:02:13Z

Codecov Report

Merging #71 (d367ee8) into master (ef22e53) will increase coverage by 1.63%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #71      +/-   ##
==========================================
+ Coverage   89.16%   90.80%   +1.63%     
==========================================
  Files           1        1              
  Lines         360      511     +151     
==========================================
+ Hits          321      464     +143     
- Misses         39       47       +8

Files Changed	Coverage Δ
src/SortingAlgorithms.jl	`90.80% <100.00%> (+1.63%)`	⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

- unify function signatures - remove trailing whitespace - rename variables - always use space after comma in fuction definitions/calls - improve comments

LilithHafner · 2023-02-11T17:22:34Z

Thank you! This is a valuable contribution! I hope you don't mind some critique :)

It's a very large sorting algorithm so I have yet to read it cover-to-cover. If possible, it would be nice to get similar or better performance with a simpler implementation, but that can be tricky. That said, I have performed some black-box analysis with a couple results:

First, a correctness issue, I ran this script to check stability and found a counterexample:

@testset begin
    for T in (Float64, Int, UInt8)
        for alg in [ThreadedPagedMergeSort, PagedMergeSort]
            for order in [Forward, Reverse, By(identity), By(abs), By(Returns(0)), By(Base.Fix2(÷, 100))]
                fails = Int[]
                for n in vcat(0:30, 40:10:100, 110:50:1000)
                    v = rand(T, n)
                    # @test sort(v; order) == sort(v; alg, order)
                    sort(v; order) == sort(v; alg, order) || push!(fails, n)
                end
                if !isempty(fails)
                    println("$(T) $(alg) $(typeof(order))\n\t$(fails)")
                end
            end
        end
    end
end

julia> issorted(sort(1:100, by=Returns(0), alg=PagedMergeSort))
false

Second, some notes on benchmarking, use BenchmarkTools.@btime rather than @time for more accurate results. @btime executes the test expression repeatedly and takes the minimum to reduce noise. I also prefer to work with benchmarking code that can be pasted into a REPL because that is more widely and quickly reproducible than a Pluto notebook. For example, the following can be includeed or pasted into a REPL

using BenchmarkTools, SortingAlgorithms, Random, Test
versioninfo()
for i in 0:5
    n = 17^i
    println("sort!(rand(Int, $n))")
    v = rand(Int, n)
    print("Default:  "); @btime sort!($v) setup=(rand!($v)) evals=1
    print("Paged MS: "); @btime sort!($v; alg=PagedMergeSort) setup=(rand!($v)) evals=1
    print("Threaded: "); @btime sort!($v; alg=ThreadedPagedMergeSort) setup=(rand!($v)) evals=1
end

On my computer, it produces the following results:

Julia Version 1.9.0-beta4
Commit b75ddb787ff (2023-02-07 21:53 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 4 × Intel(R) Core(TM) i5-8210Y CPU @ 1.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 2 on 2 virtual cores

sort!(rand(Int, 1))
Default:    63.000 ns (0 allocations: 0 bytes)
Paged MS:   64.000 ns (0 allocations: 0 bytes)
Threaded:   63.000 ns (0 allocations: 0 bytes)
sort!(rand(Int, 17))
Default:    248.000 ns (0 allocations: 0 bytes)
Paged MS:   390.000 ns (2 allocations: 256 bytes)
Threaded:   386.000 ns (2 allocations: 256 bytes)
sort!(rand(Int, 289))
Default:    5.214 μs (1 allocation: 2.44 KiB)
Paged MS:   8.231 μs (2 allocations: 704 bytes)
Threaded:   8.453 μs (2 allocations: 704 bytes)
sort!(rand(Int, 4913))
Default:    91.831 μs (3 allocations: 46.67 KiB)
Paged MS:   206.361 μs (2 allocations: 2.38 KiB)
Threaded:   203.543 μs (2 allocations: 2.38 KiB)
sort!(rand(Int, 83521))
Default:    1.757 ms (3 allocations: 660.80 KiB)
Paged MS:   4.627 ms (2 allocations: 9.38 KiB)
Threaded:   3.074 ms (54 allocations: 23.03 KiB)
sort!(rand(Int, 1419857))
Default:    39.456 ms (3 allocations: 10.84 MiB)
Paged MS:   111.730 ms (3 allocations: 37.48 KiB)
Threaded:   69.457 ms (57 allocations: 79.28 KiB)

Adds the PagedMergeSort algorithm, a merge sort with O(sqrt n) auxiliary space usage.

- unify function signatures - remove trailing whitespace - rename variables - always use space after comma in fuction definitions/calls - improve comments

LilithHafner

Here are a few inline comments from looking over the parts of the code I understand. I'll need to get a better understanding of block sorts before I can give a full review.

src/SortingAlgorithms.jl

LSchwerdt · 2023-02-12T16:43:22Z

Thank you for taking the time to review this PR. I am looking forward to the critique :).

First, a correctness issue, I ran this script to check stability and found a counterexample:

That was caused by the initial optimizations. Base sort returned a range, while all algorithms from SortingAlgorithms returned a vector. Fixed now after implementing initial optimizations.

Second, some notes on benchmarking, use BenchmarkTools.@btime rather than @time for more accurate results.

This algorithm is most useful for sorting large amounts of data, where @btime uses only one sample anyways. I only included the smaller input sizes to show the memory usage (allocations). But you are right, @btime would have avoided doubts about the timing accuracy.

I also prefer to work with benchmarking code that can be pasted into a REPL

I found Pluto notebooks that load the correct package from github very convenient when benchmarking on different PCs not setup for package development. Run the notebook -> done. But you are right, if you already checked out the branch for review, a simple script is even more convenient. I have added a benchmark script to PR #72.

it would be nice to get similar or better performance with a simpler implementation

There are some ways to simplify the code, but all are less performant:

Use array assignments instead of loops.
Use a reshaped view of the input array (blocksize x nBlocks matrix). The last elements that do not fit would need to be handled seperately, though.
Dont use twoended_merge!

Feel free to suggest improvements in this regard. I will continue trying to simplify, too. But this is the main reason I did not go for an algorithm with O(1) space. While not especially elegant, this PR is much shorter and concenpually simpler than HolyGrailSort, for example.

LilithHafner · 2023-02-12T18:14:55Z

Fixed now after implementing initial optimizations.

I'm still seeing a problem:

julia> PagedMergeSort
Base.Sort.MissingOptimization(
    Base.Sort.BoolOptimization(
        Base.Sort.Small{10}(
            Base.Sort.InsertionSortAlg(),
            Base.Sort.IEEEFloatOptimization(
                SortingAlgorithms.PagedMergeSortAlg()))))

julia> sort(1:100, by=Returns(0), alg=PagedMergeSort)
100-element Vector{Int64}:
   1
   2
   3
   4
   5
   6
   7
  14
  15
  16
   ⋮
  98
  99
 100
  83
  84
  85
  86
  87
  88

LSchwerdt · 2023-02-12T18:17:16Z

I'll need to get a better understanding of block sorts before I can give a full review.

This is not a block sort as defined here. That's why I called it PagedMergeSort and not BlockMergeSort, although block merge sort would be a fitting name.

Btw, is there a good way to make the gif in the OP available in the documentation somehow? I think it is very helpful to understand the merge procedure.

Co-authored-by: Lilith Orion Hafner <[email protected]>

…gesort

LSchwerdt · 2023-03-20T13:33:01Z

I think that the process of moving the pages into sorted order can be simpler: [...]

If I understand it correctly, what you describe is in essence the current solution. But I will try to improve it to make this obvious when looking at the code. And maybe I'll manage to remove the dependency on StaticArrays, by making the control flow explicit, following your suggestion.

Looping through the three free pages at the end of the subarrays is accomplished using freeBlocksIdx. It starts at 3, and after the 3 pages are moved from the buffer, only the first page in the buffer is used for all the following permutation cycles.

The linear scan is tracked with doneBlockIdx. It is initialized to 1, and while true is exited by doneBlockIdx == nBlocks && return when the linear scan is complete.

Update: Success! The dependency is eliminated and I think the code is more clear now.

and always use "page" instead of "block"

and eliminate dependency on StaticArrays

LilithHafner · 2023-04-10T20:25:14Z

Great work! It is much simpler now. I think it can be even simpler, though. I've put some suggestions in a pr to the head branch of this pr because I wanted to make sure that my suggestions worked before recommending them.

refactor PagedMergeSort

LilithHafner

Rename buffer/buf to scratch or to t?

Base.Sort's "scratch" is an implementation detail and will likely change semantics to use buffers once they are a thing, so it's for the best not to integrate too closely with that system.

However, I still think sharing a common name might be a good idea. IIRC, Base uses t and scratch, the (obsolete) radix sort in SotihngAlgorithsm.jl uses ts.

Right now, PagedMergeSort uses both t and buf which brings the total number of names for this concept up to 3 in this repo and 4 in this repo+base. I recommend using scratch as an abbreviation for "scratch space".

CI only tests the fallback to PagedMergeSort for ThreadedPagedMergeSort.

That's not good. Maybe we should split off PagedMergeSort and try to merge it first and separately, and do ThreadedPagedMergeSort in a second PR.

src/SortingAlgorithms.jl

LilithHafner · 2023-05-31T13:20:58Z

src/SortingAlgorithms.jl

+        while_condition1(offset) = (_,_,k) -> k <= offset + pagesize
+        while a < m-pagesize && b < hi-pagesize
+            pages = next_page!(pageLocations, pages, pagesize, lo, a)
+            offset = page_offset(pages.current)
+            a,b,_ = merge!(while_condition1(offset),v,v,v,o,a,b,offset+1)
+        end
+        # merge until either A or B is empty or the last page is reached
+        k, offset = nothing, nothing
+        while_condition2(offset) = (a,b,k) -> k <= offset + pagesize && a <= m && b <= hi
+        while a <= m && b <= hi && pages.currentNumber + 3 < nPages
+            pages = next_page!(pageLocations, pages, pagesize, lo, a)
+            offset = page_offset(pages.current)
+            a,b,k = merge!(while_condition2(offset),v,v,v,o,a,b,offset+1)
+        end


These while loops and while_conditions feel strange to me; I wonder if there is a better approach.

It was the best way to implement this I could think of. But of course I am open to suggestions for implrovement.

src/SortingAlgorithms.jl

use copyto! remove premature variable definitions

LSchwerdt · 2023-08-12T13:49:54Z

Right now, PagedMergeSort uses both t and buf which brings the total number of names for this concept up to 3 in this repo and 4 in this repo+base. I recommend using scratch as an abbreviation for "scratch space".

That's not good. Maybe we should split off PagedMergeSort and try to merge it first and separately, and do ThreadedPagedMergeSort in a second PR.

These are good suggestions imho. I have implemented them in 3f337f0 and d3ea719.

Except for rebasing, my checklist of open questions is now completed.

src/SortingAlgorithms.jl

To work around JuliaLang/julia#50900

Occured for small inputs when not using initial optimizations.

LilithHafner

LGTM!

LilithHafner · 2023-08-18T14:15:36Z

Thank you! This is a nice algorithm to have.

LSchwerdt · 2023-08-19T07:11:19Z

Thank you! This is a nice algorithm to have.

Thank you for your efforts reviewing and massively improving the code.

LSchwerdt added 2 commits February 5, 2023 17:01

Add PagedMergeSort

8a63108

Adds the PagedMergeSort algorithm, a merge sort with O(sqrt n) auxiliary space usage.

Update references

e0ebe94

LSchwerdt added 2 commits February 7, 2023 14:33

Return sorted vectors in sort!

0d4aa65

Make midpoint() backwards compatible

4d6e518

LSchwerdt force-pushed the pagedmergesort branch from d2dd7f7 to 4d6e518 Compare February 7, 2023 15:17

LSchwerdt added 2 commits February 7, 2023 16:19

Use PagedMergeSort when calling ThreadedPagedMergeSort in version < 1.3

58b06fb

Improve style

ce9ad62

- unify function signatures - remove trailing whitespace - rename variables - always use space after comma in fuction definitions/calls - improve comments

LSchwerdt mentioned this pull request Feb 11, 2023

Add PatternDefeatingQuicksort #72

Open

10 tasks

LSchwerdt added 9 commits February 12, 2023 16:23

Add PagedMergeSort

66315f7

Adds the PagedMergeSort algorithm, a merge sort with O(sqrt n) auxiliary space usage.

Update references

df33439

Return sorted vectors in sort!

b33b4e8

Make midpoint() backwards compatible

a5d5cea

Use PagedMergeSort when calling ThreadedPagedMergeSort in version < 1.3

c2ff440

Improve style

2da1f83

- unify function signatures - remove trailing whitespace - rename variables - always use space after comma in fuction definitions/calls - improve comments

Rebase and apply initial optimizations

d6f40db

Remove unused variable in function signatures

a25003e

Add algorithms to readme.md

88e5b22

LSchwerdt force-pushed the pagedmergesort branch from ce9ad62 to 88e5b22 Compare February 12, 2023 15:57

LilithHafner reviewed Feb 12, 2023

View reviewed changes

LSchwerdt and others added 6 commits February 12, 2023 19:17

Rename _midpoint to midpoint

436eb79

Co-authored-by: Lilith Orion Hafner <[email protected]>

Clarify comments in twoended_merge!

dad2a2c

Remove unnecessary @inbounds in twoended_merge!

cd72df0

Co-authored-by: Lilith Orion Hafner <[email protected]>

Explain why twoended_merge! is used in comment

90c0ca6

Co-authored-by: Lilith Orion Hafner <[email protected]>

Use midpoint instead of _midpoint

7841b6a

Merge remote-tracking branch 'LSchwerdt/pagedmergesort' into pagedmer…

2465073

…gesort

LSchwerdt added 3 commits March 22, 2023 19:22

Replace getNextBlock! macro by function

bb4e706

and always use "page" instead of "block"

Reduce number of arguments of next_page!()

0d9e23f

Refactor page permutation

cb89f11

and eliminate dependency on StaticArrays

LSchwerdt force-pushed the pagedmergesort branch from a880a05 to cb89f11 Compare March 23, 2023 19:13

Lilith's refactor

5bcdbb8

Lilith Hafner and others added 2 commits May 28, 2023 07:16

Apply @LSchwerdt's suggestions

5dab2a1

Merge pull request #1 from LilithHafner/pagedmergesort-lilith-refactor

bfbe0cf

refactor PagedMergeSort

LilithHafner reviewed May 31, 2023

View reviewed changes

LSchwerdt added 7 commits August 11, 2023 15:05

Simplify check for input lengths in paged_merge!

71fc310

Simplifcations

2c9ef24

use copyto! remove premature variable definitions

Fix indentation

4bbc185

Remove superfluous check for small inputs in threaded_pagedmergesort!

3a8a53b

Improve formatting

a783092

Remove ThreadedPagedMergesort

d3ea719

Use the name "scratch" instead of "buf" or "t"

3f337f0

Merge branch 'master' into pagedmergesort

56a609c

LilithHafner mentioned this pull request Aug 13, 2023

copyto! sometimes copies whole source array when there is possibility of aliasing JuliaLang/julia#50900

Open

LilithHafner reviewed Aug 13, 2023

View reviewed changes

src/SortingAlgorithms.jl Outdated Show resolved Hide resolved

LilithHafner reviewed Aug 13, 2023

View reviewed changes

src/SortingAlgorithms.jl Outdated Show resolved Hide resolved

LSchwerdt added 2 commits August 14, 2023 14:04

Use custom _unsafe_copyto! instead of copyto!

d2ea0a7

To work around JuliaLang/julia#50900

Fix: pageLocations is initialized with a negative number of elements

717f9ae

Occured for small inputs when not using initial optimizations.

LilithHafner approved these changes Aug 15, 2023

View reviewed changes

Add test for small input without initial optimizations

d367ee8

LilithHafner merged commit b668837 into JuliaCollections:master Aug 18, 2023
16 checks passed

LilithHafner added the new algorithm label Dec 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add PagedMergeSort, a merge sort using O(√n) space #71

Add PagedMergeSort, a merge sort using O(√n) space #71

LSchwerdt commented Feb 6, 2023 •

edited

Loading

codecov-commenter commented Feb 7, 2023 •

edited

Loading

LilithHafner commented Feb 11, 2023

LilithHafner left a comment

LSchwerdt commented Feb 12, 2023

LilithHafner commented Feb 12, 2023

LSchwerdt commented Feb 12, 2023

LSchwerdt commented Mar 20, 2023 •

edited

Loading

LilithHafner commented Apr 10, 2023

LilithHafner left a comment •

edited

Loading

LilithHafner May 31, 2023

LSchwerdt Aug 12, 2023

LSchwerdt commented Aug 12, 2023

LilithHafner left a comment

LilithHafner commented Aug 18, 2023

LSchwerdt commented Aug 19, 2023

Add PagedMergeSort, a merge sort using O(√n) space #71

Add PagedMergeSort, a merge sort using O(√n) space #71

Conversation

LSchwerdt commented Feb 6, 2023 • edited Loading

Checklist before merging

Some notes

Benchmarks

codecov-commenter commented Feb 7, 2023 • edited Loading

Codecov Report

LilithHafner commented Feb 11, 2023

LilithHafner left a comment

Choose a reason for hiding this comment

LSchwerdt commented Feb 12, 2023

LilithHafner commented Feb 12, 2023

LSchwerdt commented Feb 12, 2023

LSchwerdt commented Mar 20, 2023 • edited Loading

LilithHafner commented Apr 10, 2023

LilithHafner left a comment • edited Loading

Choose a reason for hiding this comment

LilithHafner May 31, 2023

Choose a reason for hiding this comment

LSchwerdt Aug 12, 2023

Choose a reason for hiding this comment

LSchwerdt commented Aug 12, 2023

LilithHafner left a comment

Choose a reason for hiding this comment

LilithHafner commented Aug 18, 2023

LSchwerdt commented Aug 19, 2023

LSchwerdt commented Feb 6, 2023 •

edited

Loading

codecov-commenter commented Feb 7, 2023 •

edited

Loading

LSchwerdt commented Mar 20, 2023 •

edited

Loading

LilithHafner left a comment •

edited

Loading