
SearchParameters support for IndexBinaryFlat #4055

Closed
gustavz wants to merge 27 commits into facebookresearch:main from gustavz:gustavz/search_params_support_for_index_binary_flat

@gustavz

@gustavz gustavz commented Dec 4, 2024

Context issue: #3503

We need search params support for binary flat index to be able to use it in RAG applications that support search with pre-filtering.
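
To illustrate why pre-filtering matters for this use case, here is a minimal pure-Python sketch (not faiss code). It compares filtering before the top-k selection with filtering after it, over toy binary codes stored as integers. The function names `search_prefiltered` and `search_postfiltered` are hypothetical and exist only for this illustration.

```python
from typing import List, Sequence, Set, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def search_postfiltered(
    codes: Sequence[int], query: int, k: int, allowed: Set[int]
) -> List[Tuple[int, int]]:
    """Take the global top-k first, then drop ids that are not allowed."""
    top_k = sorted((hamming(c, query), i) for i, c in enumerate(codes))[:k]
    return [(d, i) for d, i in top_k if i in allowed]


def search_prefiltered(
    codes: Sequence[int], query: int, k: int, allowed: Set[int]
) -> List[Tuple[int, int]]:
    """Restrict the candidates to allowed ids, then take the top-k."""
    candidates = (
        (hamming(c, query), i) for i, c in enumerate(codes) if i in allowed
    )
    return sorted(candidates)[:k]


# Toy database: the vector id is the position in the list.
codes = [0b0000, 0b0001, 0b0011, 0b0111, 0b1111]
allowed = {3, 4}  # e.g. documents that pass a metadata filter

post = search_postfiltered(codes, 0b0000, k=2, allowed=allowed)
pre = search_prefiltered(codes, 0b0000, k=2, allowed=allowed)
```

In this toy case the post-filtered search returns nothing, because the two globally closest codes are both filtered out, while the pre-filtered search still returns two results. Passing an ID selector through the search parameters is what lets the index apply the filter during the scan.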

@facebook-github-bot
Contributor

Hi @gustavz!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@gustavz gustavz changed the title from "Search params support for IndexBinaryFlat" to "SearchParameters support for IndexBinaryFlat" on Dec 4, 2024
@gustavz
Author

gustavz commented Dec 11, 2024

@mnorris11 what's the process for getting this PR reviewed and merged?

@mnorris11

@gustavz We triaged it internally (the implementation tag), so someone will review it soon. We have an internal task tracking its review.

Contributor

@asadoughi asadoughi left a comment

Thanks for the pull request! It looks like the PR is missing the changes to hamming.h for the additional IDSelector argument. Could you add those, please?
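
The request above amounts to threading an optional selector argument down into the low-level Hamming k-NN routine. The following is a rough Python sketch of that shape, not the actual C++ in hamming.h. `BatchSelector`, `RangeSelector`, and `hamming_knn` are hypothetical names; only the `is_member` method name mirrors what faiss's `IDSelector` exposes.

```python
import heapq
from typing import Iterable, List, Optional, Protocol, Sequence, Tuple


class Selector(Protocol):
    """Anything that can say whether a vector id may be returned."""

    def is_member(self, idx: int) -> bool: ...


class BatchSelector:
    """Accepts an explicit set of ids."""

    def __init__(self, ids: Iterable[int]) -> None:
        self._ids = frozenset(ids)

    def is_member(self, idx: int) -> bool:
        return idx in self._ids


class RangeSelector:
    """Accepts ids in the half-open range [imin, imax)."""

    def __init__(self, imin: int, imax: int) -> None:
        self.imin, self.imax = imin, imax

    def is_member(self, idx: int) -> bool:
        return self.imin <= idx < self.imax


def hamming_knn(
    codes: Sequence[bytes],
    query: bytes,
    k: int,
    sel: Optional[Selector] = None,
) -> List[Tuple[int, int]]:
    """Brute-force Hamming k-NN; ids rejected by `sel` are skipped in the scan."""
    if k <= 0:
        return []
    heap: List[Tuple[int, int]] = []  # min-heap of (-dist, -idx): root is the worst kept hit
    for idx, code in enumerate(codes):
        if sel is not None and not sel.is_member(idx):
            continue  # filtered out before any distance is computed
        dist = sum(bin(x ^ y).count("1") for x, y in zip(code, query))
        item = (-dist, -idx)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return sorted((-d, -i) for d, i in heap)


codes = [b"\x00", b"\x01", b"\x03", b"\x07", b"\xff"]
no_filter = hamming_knn(codes, b"\x00", k=2)
in_range = hamming_knn(codes, b"\x00", k=2, sel=RangeSelector(2, 5))
```

Keeping `sel` optional with a default of `None` means existing callers that do not filter keep their current behavior.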

@gustavz gustavz force-pushed the gustavz/search_params_support_for_index_binary_flat branch from 2717b86 to d70e64b on December 12, 2024 08:23
@gustavz gustavz requested a review from asadoughi December 12, 2024 08:23
Contributor

@asadoughi asadoughi left a comment

Thanks! Just a few more tweaks.

@gustavz gustavz force-pushed the gustavz/search_params_support_for_index_binary_flat branch from 539dfae to cd8b5d3 on December 16, 2024 09:00
@gustavz gustavz requested a review from asadoughi December 16, 2024 09:13
@gustavz gustavz requested a review from asadoughi December 24, 2024 08:07
@mnorris11

Hi @gustavz, it seems there is a small compilation error: it's asking for faiss::IDSelector instead of IDSelector?

@gustavz
Author

gustavz commented Jan 14, 2025

Sorry, I was on vacation over the new year. Please re-review, @asadoughi @mnorris11.

@mnorris11 mnorris11 dismissed asadoughi’s stale review February 28, 2025 17:02

changes were made. Clicking "dismiss" to try to bring back Import to fbsource button...

@facebook-github-bot
Contributor

@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@gustavz
Author

gustavz commented Mar 4, 2025

Anything to do on my side @mnorris11 ?

@gtwang01
Contributor

gtwang01 commented Mar 6, 2025

Hey Gustav - can you fix the lint errors and update the branch? I'll help you merge this PR from there.

@gustavz
Author

gustavz commented Mar 7, 2025

@gtwang01 I can't see the tests.
[Screenshot attached, 2025-03-07 at 08:29:15]

@gustavz
Author

gustavz commented Mar 10, 2025

> Hey Gustav - can you fix the lint errors and update the branch? I'll help you merge this PR from there.

Please provide info on how I can reproduce or see the lint errors locally, @gtwang01.

@gtwang01
Contributor

I'll just provide them here:

Trailing whitespace
fbcode/faiss/tests/test_search_params.py
Line 26
Line 27
Line 28
Line 29
Line 30

Blank line contains whitespace
fbcode/faiss/tests/test_search_params.py
Line 154
Line 159
Line 169
Line 171
Line 183
Line 190
Line 192
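
For reproducing the two lint categories listed above locally, a small standard-library sketch is shown below. It is not the internal linter, and `find_whitespace_issues` is a hypothetical helper name; it only flags the two categories from the report.

```python
from typing import List, Tuple


def find_whitespace_issues(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, issue) pairs for whitespace lint problems.

    Line numbers are 1-based, matching the report format above.
    """
    issues: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line == line.rstrip():
            continue  # nothing after the last visible character
        if line.strip() == "":
            issues.append((lineno, "blank line contains whitespace"))
        else:
            issues.append((lineno, "trailing whitespace"))
    return issues


sample = "import faiss  \n\ndef f():\n    \n    return 1\n"
found = find_whitespace_issues(sample)
```

To check a real file, read it and pass the contents in, for example `find_whitespace_issues(open(path).read())`. With flake8 installed, the same two categories are reported as W291 (trailing whitespace) and W293 (blank line contains whitespace). The path of the test file in the GitHub checkout may differ from the `fbcode/` path shown in the report.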

@gustavz
Author

gustavz commented Mar 11, 2025

@gtwang01 I think it's good to merge now

@gtwang01 gtwang01 requested review from gtwang01 and removed request for mnorris11 March 11, 2025 21:06
@gtwang01 gtwang01 self-assigned this Mar 11, 2025
@gustavz
Author

gustavz commented Mar 13, 2025

@gtwang01 can you explain the failing Windows build? I see the following issues, and all of them seem unrelated to my PR. Can you double check?

  • C2416: [[maybe_unused]] cannot be applied → AutoTune.cpp (344, 347, 354, 358, 376, 380, 385, 397) → Likely used in an invalid context.
  • C2440: const Type* to Type* conversion fails → AutoTune.cpp (347, 354, 358, 376, 380, 385, 397) → const qualifier is being removed improperly.
  • C4804: Unsafe use of bool with >= → IndexHNSW.cpp (127) → Comparison involves a bool, possibly misused.
  • C4477: printf format mismatch → index_factory.cpp (769) → %ld used for unsigned __int64, should use %llu or %zu.
  • C4267: size_t conversion may lose data → HNSW.cpp (various lines) → Implicit conversion from size_t to a smaller integer type.

@gtwang01
Contributor

Can you try merging on top of #4238?

Contributor

@gtwang01 gtwang01 left a comment

LGTM! Thanks for following up and making requested changes.

@facebook-github-bot
Contributor

@gtwang01 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@gustavz
Author

gustavz commented Mar 17, 2025

@gtwang01 unfortunately I can't investigate the internal errors, so I need your guidance to solve them 🙏

@facebook-github-bot
Contributor

@gtwang01 merged this pull request in fec7ce9.

@metonymic-smokey

I think it would be useful to extend support for this to the C API too?
If you think it would be useful, please let me know and I can help contribute.
Thanks!

@mnorris11

> I think it would be useful to extend support for this to the C API too? If you think it would be useful, please let me know and I can help contribute. Thanks!

Sure, feel free to open a PR and we can take a look.

abhinavdangeti added a commit to blevesearch/faiss that referenced this pull request May 20, 2025
…iss@bleve` (#52)

Merge results:
```
|\
* ca874b6 Abhinav Dangeti | Fix type mismatches within unit test: TEST(TestHamming, test_hamming_knn)
* e255b9b Abhinav Dangeti | Adapt signature change of `get_InvertedListScanner` in faiss/IndexIVFPQ.cpp
* 90fe29b Abhinav Dangeti | Remove redundant cmake install over target `faiss_c`
*   0882dd3 Abhi Dangeti | Merge branch 'bleve' into main_1.11.0
|\
| * 0be294a Deepkaran Salooja | Implement compute_distance_to_codes_for_list and compute_distance_table for IndexIVFPQ (#50)
| * 352484e Rahul Rampure | MB-65473: Batch converter for vector to cluster IDs (#49)
| * 14a4a60 Rahul Rampure | MB-65473: Refactor and Optimize Pre-Filtered Vector Search (#48)
| *   b4cc942 Abhi Dangeti | MB-65243: Merge 'facebookresearch/faiss@v1.10.0' into 'blevesearch/faiss@bleve' (#46)
| |\
| | * 8d33b5c Abhinav Dangeti | MB-65243: Merge 'facebookresearch/faiss@v1.10.0' into 'blevesearch/faiss@bleve'
| |/|
| * | 8eecdb6 Rahul Rampure | MB-63643: Fix missing num_threads clauses (#44)
| * | 224acef Deepkaran Salooja | MB-61093 Fix memory leak for SQDistanceComputer (#43)
| * | 3001b51 Deepkaran Salooja | MB-61093 Add method to compute distance from codes for IVF index (#41)
| * | b747c55 Aditi Ahuja | MB-62230 - Updated closest_centroids API to include params (#38)
| * | 26d9b35 Aditi Ahuja | MB-62230 - Extended c_api to search only specified clusters with params. (#35)
| * | f077bf9 Abhi Dangeti | Build libfaiss with AVX2 support when requested, rather than libfaiss (#37)
| * |   5ab1ce0 Abhi Dangeti | MB-62577: Merge 'facebookresearch/faiss@v1.8.0' into blevesearch/faiss@bleve
| |\ \
| | * | 3306e58 Abhinav Dangeti | MB-62577: Merge 'facebookresearch/faiss@v1.8.0' into blevesearch/faiss@bleve
| |/| |
| * | | d9db66a Rahul Rampure | MB-62221: API to free a buffer allocated in C runtime (#30)
| * | | a2f4183 Rahul Rampure | MB-62221: Fix buffer overflow (#29)
| * | | 7977457 Rahul Rampure | MB-61930: Add a num_threads clause to every openMP pragma. (#25)
| * | | a30eaa2 Rahul Rampure | MB-61930: Optimize Thread Management in High Throughput Scenarios (#24)
| * | | 2ce3883 Thejas-bhat | MB-59575: Revert memcpy optimizations for flat indexes (#23)
| * | | 7c3c7d1 Thejas-bhat | MB-59575: Refactor member variables alignment of IndexFlatCodes (#22)
| * | | 17c3992 Thejas-bhat | MB-59575: Reducing copy overhead of already memory mapped content (#17)
| * | | 38f6b60 Chris Hillery | Fix build on Windows (#21)
| * | | 4143984 SaptarshiSen-CB | MB-61609: Fix zero sa_code_size (#19)
| * | | 7b119f4 Rahul Rampure | MB-60739: Fix integer overflow (#15)
| * | | 6851683 Rahul Rampure | MB-60657: Fix integer overflow (#14)
| * | | 8672bf3 Thejas-bhat | Size API to get the index's size (#13)
| * | | b34ccf6 Aditi Ahuja | MB-60202 - IDMap2 Selector (#12)
| * | | a623ec6 Thejas-bhat | Introducing a new reader to read index using a pointer (#8)
| * | | 4dd26f8 Chris Hillery | Add INSTALL() directive for faiss_c (#7)
| * | | 14fd16a Chris Hillery | Suppress (thousands of) warnings when building with GCC (#6)
| * | | 44febf0 Abhi Dangeti | Address incorrect import within c_api/IndexIVF_c_ex.cpp (#5)
| * | | 1b295e4 Abhi Dangeti | Add build instructions for IndexIVF_c_ex.cpp and Index_c_ex.cpp (#4)
| * | | 334021a Abhi Dangeti | additional index APIs (#3)
| * | | f0bbc06 Abhi Dangeti | Introducing index IO operations over char buffer (#2)
* | | | ea1cdf0 Michael Norris | Increment next release, v1.11.0 (facebookresearch#4308)
* | | | 70c4537 simshi | fix: algorithm of spreading vectors over shards (facebookresearch#4299)
* | | | d4fa401 Michael Norris | Add RaBitQ to the swigfaiss so we can access its properties correctly in python (facebookresearch#4304)
* | | | c75f166 Satyendra Mishra | Add date and time to the codec file path so that the file doesn't get overridden with each run (facebookresearch#4303)
* | | | a3cd63f Aditya Vidyadhar Kamath | Skip mmap test case in AIX. (facebookresearch#4275)
* | | | e36897f Michael Norris | Fix overflow of int32 in IndexNSG (facebookresearch#4297)
* | | | 117aafd Michael Simpson | Fix Type Error in Conditional Logic (facebookresearch#4294)
* | | | 928333c Jim Meyering | faiss/gpu/GpuAutoTune.cpp: fix llvm-19-exposed -Wunused-but-set-variable warnings
* | | | bb04bf6 Bhavik Sheth | Add missing header in faiss/CMakeLists.txt (facebookresearch#4285)
* | | | d9cfd00 Satyendra Mishra | Implement is_spherical and normalize_L2 booleans as part of the training APIs (facebookresearch#4279)
* | | | 915f719 Michael Norris | Fix nightly by pinning conda-build to prevent regression in 25.3.2 (facebookresearch#4287)
* | | | de5e85e generatedunixname89002005287564 | Fix CQS signal. Id] 88153895 -- readability-redundant-string-init in fbcode/faiss (facebookresearch#4283)
* | | | 7eac034 Satyendra Mishra | Add normalize_l2 boolean to distributed training API
* | | | 0dfb599 Jaap Aarts | Handle insufficient driver gracefully (facebookresearch#4271)
* | | | d4e236b Alexandr Guzhva | relax input params for IndexIVFRaBitQ::get_InvertedListScanner() (facebookresearch#4270)
* | | | df9e2c4 Alexandr Guzhva | Fix a placeholder for 'unimplemented' in mapped_io.cpp (facebookresearch#4268)
* | | | 0d3aff9 wwq | fix bug: IVFPQ of raft/cuvs does not require redundant check (facebookresearch#4241)
* | | | a4401c1 Kaival Parikh | Allow using custom index readers and writers (facebookresearch#4180)
* | | | 636d95e Tarang Jain | Upgrade to libcuvs=25.04 (facebookresearch#4164)
* | | | 7f523f0 Junjie Qi | ignore regex (facebookresearch#4264)
* | | | ccc2b33 Alexandr Guzhva | fix a serialization problem in RaBitQ (facebookresearch#4261)
* | | | 13255a8 Kaival Parikh | Publish the C API to Conda (facebookresearch#4186)
* | | | 3a49130 Alexandr Guzhva | RaBitQ implementation (facebookresearch#4235)
* | | | c2fc549 Satyendra Mishra | Pass row filters to Hive Reader to filter rows (facebookresearch#4256)
* | | | 6116d36 Mayank Bhatia | Grammar fix in FlatIndexHNSW (facebookresearch#4253)
* | | | 1debb7d Matthijs Douze | re-land mmap diff (facebookresearch#4250)
* | | | 0f2035c Richard Barnes | Fix CUDA kernel index data type in faiss/gpu/impl/DistanceUtils.cuh +10 (facebookresearch#4246)
* | | | 1dcbb4a Alexandr Guzhva | fix `IVFPQFastScan::RangeSearch()` on the `ARM` architecture (facebookresearch#4247)
* | | | 8bce244 Mengdi Lin | fix integer overflow issue when calculating imbalance_factor (facebookresearch#4245)
* | | | 5adab67 Rohil Shah | Fix bug with metric_arg in IndexHNSW (facebookresearch#4239)
* | | | f2f7a66 Mengdi Lin | Back out "test merge with internal repo" (facebookresearch#4244)
* | | | caa5f24 Junjie Qi | test merge with internal repo (facebookresearch#4242)
* | | | 9e808d4 Richard Barnes | Remove unused exception parameter from faiss/impl/ResultHandler.h (facebookresearch#4243)
* | | | fec7ce9 Gustav von Zitzewitz | SearchParameters support for IndexBinaryFlat (facebookresearch#4055)
* | | | df6a8f6 George Wang | Address compile errors and warnings (facebookresearch#4238)
* | | | 15491a1 Saumya Agarwal | Revert D69972250: Memory-mapping and Zero-copy deserializers
* | | | fbc7db2 Saumya Agarwal | Revert D69984379: mem mapping and zero-copy python fixes
* | | | 631b0fd Matthijs Douze | mem mapping and zero-copy python fixes (facebookresearch#4212)
* | | | 55a3c2a Alexandr Guzhva | Memory-mapping and Zero-copy deserializers (facebookresearch#4199)
* | | | 653be59 Richard Barnes | Use `nullptr` in faiss/gpu/StandardGpuResources.cpp (facebookresearch#4232)
* | | | 3d96ad5 Lucian Grijincu | faiss: fix non-templated hammings function (facebookresearch#4195)
* | | | 4cd2f6e Junjie Qi | Support non-partition col and map in the embedding reader (facebookresearch#4229)
* | | | a22ec32 Junjie Qi | Support cosine distance for training vectors (facebookresearch#4227)
* | | | c109174 Richard Barnes | Fix LLVM-19 compilation issue in faiss/AutoTune.cpp (facebookresearch#4220)
* | | | 615c17e Shuyao Qi | Add missing #include in code_distance-sve.h (facebookresearch#4219)
* | | | eab52af Tom Jackson | Fix cloning and reverse index factory for NSG indices (facebookresearch#4151)
* | | | 1a295cd George Wang | Remove python_abi to fix nightly (facebookresearch#4217)
* | | | 4cea80b Shuyao Qi | Make static method in header inline (facebookresearch#4214)
* | | | 835b3ea Michael Norris | Fix IVF quantizer centroid sharding so IDs are generated (facebookresearch#4197)
* | | | 65222b3 Michael Norris | Pin lief to fix nightly (facebookresearch#4211)
* | | | 7cb4556 lkuffo | Fix Sapphire Rapids never loading in Python bindings (facebookresearch#4209)
* | | | 20c7ca3 Michael Norris | Upgrade openblas to 0.3.29 for ARM architectures (facebookresearch#4203)
* | | | 55d022f George Wang | Attempt to nightly fix (facebookresearch#4204)
* | | | 00ce0e2 Navneet Verma | Add the support for IndexIDMap with Cagra index (facebookresearch#4188)
* | | | 1fe8b8b Nicolas De Carli | Remove unused variable (facebookresearch#4205)
* | | | 6b65289 Divye Gala | Pass `store_dataset` argument along to cuVS CAGRA (facebookresearch#4173)
* | | | d72d0ca Michael Norris | Fix nightly by installing earlier version of lief (facebookresearch#4198)
* | | | 657c563 Bhavik Sheth | Add bounds checking to hnsw nb_neighbors (facebookresearch#4185)
* | | | f0e3832 George Wang | Check for not completed
* | | | aff6bfc Michael Norris | Add sharding convenience function for IVF indexes (facebookresearch#4150)
* | | | 1d8f393 Kaival Parikh | Handle plain SearchParameters in HNSW searches (facebookresearch#4167)
* | | | c6adc01 Michael Norris | Update INSTALL.md to remove some raft references, add missing dependency (facebookresearch#4176)
* | | | 95955d8 Kota Yamaguchi | Fix install error when building avx512_spr variant (facebookresearch#4170)
* | | | d720155 Amir Sadoughi | Update README.md (facebookresearch#4169)
* | | | 9896beb simshi | fix: gpu tests link failure with static lib (facebookresearch#4137)
* | | | 6c04699 Mulugeta Mammo | Fix the order of parameters in bench_scalar_quantizer_distance. (facebookresearch#4159)
* | | | 3ec2fbd Tarang Jain | Update CAGRA docs (facebookresearch#4152)
* | | | 6718dae Kaival Parikh | Expose IDSelectorBitmap in the C_API (facebookresearch#4158)
* | | | 9bc4b67 Jesper Stemann Andersen | Added support for building for MinGW, in addition to MSVC (facebookresearch#4145)
| |_|/
|/| |
```
@metonymic-smokey

metonymic-smokey commented May 21, 2025 via email

samanthawaters8882michaeldonovan added a commit to samanthawaters8882michaeldonovan/faiss that referenced this pull request Oct 12, 2025
Summary:
Context issue: facebookresearch/faiss#3503

We need search params support for binary flat index to be able to use it in RAG applications that support search with pre-filtering.

Pull Request resolved: facebookresearch/faiss#4055

Reviewed By: junjieqi

Differential Revision: D69538514

Pulled By: gtwang01

fbshipit-source-id: 4b6811fd8323b4c39e726b7fd33dfe0384dd57fc