Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Priority queue iterator - [MOD-5529] #406

Merged
merged 9 commits into from
Aug 8, 2023
Merged

Conversation

GuyAv46
Copy link
Collaborator

@GuyAv46 GuyAv46 commented Jul 16, 2023

Describe the changes in the pull request

This PR exposes a random order iterator for the max_priority_queue, and utilizes it in the neighbor choosing heuristic in HNSW. This improvement reduces heap usage upon insertion and deletion.

As a side effect, we no longer prefer the candidate with the larger internal id on equal distance candidates, and we choose "randomly" (we sort the candidates by score alone).

Main objects this PR modified

  1. Expose begin() and end() of max_priority_queue
  2. using it on HNSW edges choosing flows

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

@@ -1655,140 +1655,6 @@ TYPED_TEST(HNSWTest, testInitialSizeEstimation_No_InitialCapacity) {
VecSimIndex_Free(index);
}

TYPED_TEST(HNSWTest, testIncomingEdgesSize) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is no longer needed as we now have no guarantee on the behavior of choosing between candidates with equal distance (aka choosing at random).

@GuyAv46 GuyAv46 requested a review from meiravgri August 3, 2023 08:34
@GuyAv46 GuyAv46 force-pushed the guyav-priority_queue_iterator branch from e14aafd to 5753227 Compare August 3, 2023 08:35
@GuyAv46 GuyAv46 marked this pull request as ready for review August 3, 2023 08:43
This was referenced Aug 3, 2023
@codecov
Copy link

codecov bot commented Aug 3, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.06% ⚠️

Comparison is base (54990ce) 95.66% compared to head (342563b) 95.61%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #406      +/-   ##
==========================================
- Coverage   95.66%   95.61%   -0.06%     
==========================================
  Files          66       66              
  Lines        4180     4174       -6     
==========================================
- Hits         3999     3991       -8     
- Misses        181      183       +2     
Files Changed Coverage Δ
src/VecSim/algorithms/hnsw/hnsw.h 98.32% <100.00%> (-0.02%) ⬇️
src/VecSim/utils/vecsim_stl.h 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@meiravgri meiravgri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool!
Removing this confusing -distance was very necessary :)
much cleaner
Had some comment see below

src/VecSim/utils/vecsim_stl.h Show resolved Hide resolved
src/VecSim/algorithms/hnsw/hnsw.h Outdated Show resolved Hide resolved
src/VecSim/algorithms/hnsw/hnsw.h Outdated Show resolved Hide resolved
src/VecSim/algorithms/hnsw/hnsw.h Outdated Show resolved Hide resolved
src/VecSim/algorithms/hnsw/hnsw.h Outdated Show resolved Hide resolved
@GuyAv46 GuyAv46 requested a review from meiravgri August 6, 2023 12:06
@GuyAv46 GuyAv46 changed the title Priority queue iterator Priority queue iterator - [MOD-5529] Aug 7, 2023
@GuyAv46 GuyAv46 merged commit 3f25aa2 into main Aug 8, 2023
22 of 23 checks passed
@GuyAv46 GuyAv46 deleted the guyav-priority_queue_iterator branch August 8, 2023 14:40
GuyAv46 added a commit that referenced this pull request Aug 8, 2023
* changed implementation of max heap to have random order iterator

* changed heuristic use to use heap iterator

* fix sort

* bug fix (flip flag)

* reserving and using ranges

* removing a no-longer relevant test

* making heap iterator const

* remove the usage of ranges (TODO: return in the future)

* review fixes
GuyAv46 added a commit that referenced this pull request Aug 9, 2023
* changed implementation of max heap to have random order iterator

* changed heuristic use to use heap iterator

* fix sort

* bug fix (flip flag)

* reserving and using ranges

* removing a no-longer relevant test

* making heap iterator const

* remove the usage of ranges (TODO: return in the future)

* review fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants