introduce options for reducing the overhead for a clustering procedure#790
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #790 +/- ##
=========================================
+ Coverage 0 62.89% +62.89%
=========================================
Files 0 84 +84
Lines 0 6094 +6094
=========================================
+ Hits 0 3833 +3833
- Misses 0 2261 +2261 |
liliu-z
left a comment
There was a problem hiding this comment.
Cardinal is still using similar rand_perm for clustering. Do we need to optimize there?
| std::vector<int> perm; | ||
| if (clus.use_faster_subsampling) { | ||
| // use subsampling with splitmix64 rng | ||
| SplitMix64RandomGenerator rng(actual_seed); |
There was a problem hiding this comment.
should we call something like rand_perm_splixMix64 here, which is the same as the else block
There was a problem hiding this comment.
I'd like to bring this PR as is, because it a clone of changes from faiss baseline
|
@alexanderguzhva plz take care of the e2e test |
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
110465d to
22d7513
Compare
|
rebased to master |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexanderguzhva, liliu-z The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
faiss baseline PR facebookresearch/faiss#3731
/kind improvement