Commit
This improves matching performance by using a trie to store the paths. Hit-or-miss matching is then done iteratively, allowing subtrees to be pruned and requiring less work to identify matches. Additionally, this structure usually uses much less memory, as common prefixes of paths are stored only once.

Testing on benchmarks and use cases shows a 2-10x performance improvement for common scenarios. In some edge cases the speedup is effectively unbounded, because huge numbers of paths never need to be searched. This is particularly common when there are a lot of hidden paths.

One side note: I kept the threading feature, although in practice it seems to provide only a very small improvement, and only on very large sets (>1M). Future work will investigate why there is so little benefit and either fix or remove the threading. (Removing it provides a small performance boost, as some checks can be dropped.)

Other notes:
- A tiny cache boost might be gained by putting all strings into the same buffer instead of strdup'ing each string separately.
- I considered putting the path segments inline into the paths_t structure, but it performed worse. This was surprising, but is probably due to good cache prediction when the strings are in order in memory.
- Most of the time is still spent in calculate_match. We call this function much less often now (on every match rather than on every string), but it is still expensive. That might be acceptable because its complexity provides a useful result order, but it is a good optimization target.
- Another boost might be gained by post-processing the paths into a single array in which each entry stores the delta from the previous one. This would give excellent cache locality, but would make skipping over hidden or mask-failing files difficult or impossible. It would also be difficult or impossible to do threaded.
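To illustrate why a trie lets whole groups of paths be skipped, here is a minimal sketch in C. It is not the code from this commit: node_t, trie_insert and trie_count_visible are hypothetical names, the children are kept in a simple linked list, and the walk is recursive for brevity, whereas the commit describes an iterative traversal. The sketch only shows the two properties claimed above: shared prefixes are stored once, and a hidden directory is rejected with a single check on its node instead of one check per path beneath it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one path segment per node (not the commit's paths_t). */
typedef struct node {
    char *seg;           /* segment text, e.g. "src" */
    int is_path;         /* nonzero if a stored path ends at this node */
    struct node *child;  /* first child */
    struct node *next;   /* next sibling */
} node_t;

static node_t *node_new(const char *seg, size_t len) {
    node_t *n = calloc(1, sizeof *n);
    if (!n) return NULL;
    n->seg = malloc(len + 1);
    if (!n->seg) { free(n); return NULL; }
    memcpy(n->seg, seg, len);
    n->seg[len] = '\0';
    return n;
}

/* Insert a '/'-separated path; segments shared with earlier paths reuse nodes. */
static int trie_insert(node_t *root, const char *path) {
    node_t *cur = root;
    while (*path) {
        size_t len = strcspn(path, "/");
        if (len > 0) {
            node_t *c = cur->child;
            while (c && !(strlen(c->seg) == len && memcmp(c->seg, path, len) == 0))
                c = c->next;
            if (!c) {
                c = node_new(path, len);
                if (!c) return -1;
                c->next = cur->child;
                cur->child = c;
            }
            cur = c;
        }
        path += len;
        if (*path == '/') path++;
    }
    cur->is_path = 1;
    return 0;
}

/* Count stored paths outside hidden directories. *visited counts the nodes
 * actually examined, showing how much work the pruning saves. */
static size_t trie_count_visible(const node_t *n, size_t *visited) {
    size_t total = 0;
    for (const node_t *c = n->child; c; c = c->next) {
        (*visited)++;
        if (c->seg[0] == '.') continue;  /* prune the entire hidden subtree */
        total += (size_t)c->is_path;
        total += trie_count_visible(c, visited);
    }
    return total;
}

/* Free a node, its siblings, and everything below them. */
static void trie_free(node_t *n) {
    while (n) {
        node_t *next = n->next;
        trie_free(n->child);
        free(n->seg);
        free(n);
        n = next;
    }
}
```

With the paths src/main.c, src/util.c, .git/objects/aa and .git/objects/bb inserted under an empty root, the trie holds seven nodes, but the walk examines only four of them: the .git node is rejected once and nothing beneath it is visited. A real matcher would run its pattern check where this sketch tests for a leading dot.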
Showing 6 changed files with 636 additions and 389 deletions.