-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Brute force search without an index
In the case of IndexFlat
, the added data is just copied to the index without further encoding or organization.
Therefore, it may be useful to short-circuit the indexing altogether.
This makes it possible to:
-
avoid having two copies of the same data in memory
-
do updates on the data between searches.
Faiss provides low-level functions to do the brute-force search in this context.
The functions take a matrix of database vectors and a matrix of query vectors and return the k-nearest neighbors and their distances.
On CPU, the relevant function is knn_L2sqr
or knn_inner_product
, see utils/distances.h
In Python, these functions are exposed as contrib.exhaustive_search.knn
, see tests/test_contrib.py.
On GPU, the relevant function is bfKnn
.
An additional advantage is that it takes both CPU and GPU resident data as input or output.
Note that on GPU, the synchronization is needed because Faiss uses a non-default CUDA stream. The easiest workaround is to force it to use the default stream. This is done via
res = faiss.StandardGpuResources()
res.setDefaultNullStreamAllDevices()
Also, by default the amount of memory allocated by the resource object is too large for simple brute force. You can set:
res.setTempMemory(64 * 1024 * 1024)
ie. use only 64M (if that is not enough, try a few other values, according to some reports setting to 0 works just fine).
In Python there are two function wrappers that expose this functionality:
-
for numpy data, use
contrib.exhaustive_search.knn_gpu
, tested in gpu/test_contrib.py -
for pytorch data, use
contrib.pytorch_tensors.search_raw_array_pytorch
An example usage in python, from pytorch GPU data is here: test_pytorch_faiss.py
This demonstrates how to do brute force search without using Faiss at all:
-
In numpy demo_numpy_knn.ipynb
-
in pytorch (GPU optional) demo_pytorch_knn.ipynb.
When the number of query vectors is limited, the best indexing method is to not index at all and use brute force search.
This can be problematic when the dataset vectors do not fit in RAM or GPU RAM.
In that case, it is efficient to search the xq
vectors in slices xb
of the database vectors.
To maintain the top-k nearest neighbors seen so far, use a ResultHeap
structure:
Faiss building blocks: clustering, PCA, quantization
Index IO, cloning and hyper parameter tuning
Threads and asynchronous calls
Inverted list objects and scanners
Indexes that do not fit in RAM
Brute force search without an index
Fast accumulation of PQ and AQ codes (FastScan)
Setting search parameters for one query
Binary hashing index benchmark