Efficiently finds the unique columns, rows, etc. of an array. The algorithm first hashes along the specified dimension, then finds the unique hashes, and finally checks that the hashes don't collide. It is roughly O(n) in the number of elements in the array.
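The three steps above (hash each slice, find the unique hashes, then rule out collisions) can be sketched for the rows-only case roughly as follows. This is an illustrative Python sketch, not the code in this pull request: the name `unique_rows`, the `hash_fn` parameter, and the choice to return rows in first-occurrence order are all assumptions made for the example.

```python
def unique_rows(rows, hash_fn=lambda row: hash(tuple(row))):
    """Return the distinct rows of `rows` in first-occurrence order.

    `unique_rows` and `hash_fn` are hypothetical names for this sketch.
    """
    buckets = {}  # hash value -> indices into `result` of rows with that hash
    result = []
    for row in rows:
        row = list(row)
        h = hash_fn(row)                        # step 1: hash along the row
        candidates = buckets.setdefault(h, [])  # step 2: have we seen this hash?
        # Step 3: equal hashes do not prove equal rows, so compare the
        # actual contents against every stored row sharing this hash.
        if not any(result[i] == row for i in candidates):
            candidates.append(len(result))
            result.append(row)
    return result


rows = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]
print(unique_rows(rows))
```

Each row is hashed once and, when collisions are rare, compared against at most a handful of stored rows, which is where the roughly linear cost comes from. Passing a deliberately bad `hash_fn` (for example one that always returns `0`) still gives a correct result, only more slowly, because the collision check falls back to direct comparison.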
Compared with MATLAB's `unique(x, 'rows')`, which first sorts the rows, this approach gives much more consistent and almost always better performance. It is ~10% slower for MATLAB's best case (when values are random, and so sorting requires inspecting only a single column) and much faster for other cases (it is 25x faster than MATLAB for 500 repeats of the same 10 random rows with 5000 columns).

This is my first time using Cartesian. Without it, this code is presently about 10% faster for finding unique rows of a matrix, but the overhead is probably worth it for the generality.