Use parallelQuickSort() to sort Hits as well? #410

Open
jan-niestadt opened this issue Mar 24, 2023 · 0 comments
Labels
performance Related to performance or load management

@jan-niestadt
Member

IntArrays.parallelQuickSort() really sped up sorting terms while writing them to disk or while reconstructing the global terms list.

Sorting hits by a HitProperty still seems to be done single-threaded in the HitsInternal* classes. We should test whether using parallelQuickSort() speeds this up. The method falls back to regular quickSort() if there are fewer than 8192 elements, so it should be at least as fast as the current sort even for small result sets.

The only way it could end up being slower is if the comparison involves locking. For example, Collator.compare() is a synchronized method, so using it in a Comparator slows down parallel sorts considerably (confirmed using the TestSortPerformance utility). This can be counteracted by precalculating a value that can be compared without locking (a CollationKey in this example).
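
A minimal sketch of the precalculation idea, not the actual HitsInternal code. It assumes the Collator is a java.text.Collator, and it uses the JDK's Arrays.parallelSort() as a dependency-free stand-in for IntArrays.parallelQuickSort(). The class name, method name and the plain String[] of sort values are hypothetical. The keys are built once, on a single thread, and the parallel sort then only compares CollationKey objects and never calls Collator.compare().

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

// Hypothetical illustration class; not part of the project.
final class CollationKeySortSketch {

    /** Returns the indices of 'values', ordered by the collated order of values[i]. */
    static Integer[] sortedIndices(String[] values, Collator collator) {
        // Key creation goes through the collator, so do it once per value,
        // up front and on one thread, instead of inside the comparator.
        CollationKey[] keys = new CollationKey[values.length];
        for (int i = 0; i < values.length; i++) {
            keys[i] = collator.getCollationKey(values[i]);
        }

        // Sort indices rather than the values themselves, so the result can be
        // used to reorder whatever structure holds the hits.
        Integer[] indices = new Integer[values.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }

        // The comparator only touches the precomputed keys, so worker threads
        // never call Collator.compare() during the sort.
        Arrays.parallelSort(indices, Comparator.comparing((Integer i) -> keys[i]));
        return indices;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        String[] values = { "pear", "banana", "cherry" };
        for (int i : sortedIndices(values, collator)) {
            System.out.println(values[i]);
        }
    }
}
```

With fastutil, the same keys array could presumably be used from an IntComparator passed to IntArrays.parallelQuickSort(), which avoids boxing the indices. Whether the time spent building the keys is repaid by the faster parallel comparisons would still need to be measured.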

@jan-niestadt jan-niestadt added the performance Related to performance or load management label Mar 24, 2023
@jan-niestadt jan-niestadt added this to the v4.0 milestone Mar 24, 2023
@jan-niestadt jan-niestadt modified the milestones: v4.0, v5.0 Feb 8, 2024
@jan-niestadt jan-niestadt removed the cache label Feb 8, 2024