query performance tuning #3556
Replies: 2 comments
Summary
Hi. I would like to understand search performance better. I read this page:
https://github.com/facebookresearch/faiss/wiki/Threads-and-asynchronous-calls
and learned that Faiss's internal threading is done with OpenMP.
I have two questions:
Question 1:
When I call index.search() on a batch of queries (a NumPy matrix), does OpenMP call pthread_create() on the fly, or does it reuse an available thread from an existing thread pool? Creating short-lived threads on every search call seems like a big overhead.
I tried using strace to print all kernel calls, but I didn't find any "pthread_create" under the search process or its subprocesses. Could it be happening in another process?
I do see some clone() Linux system calls. Does that mean new subprocesses are created to do the search?
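For context, strace traces system calls, and pthread_create() is a C library function rather than a system call. On Linux it is implemented with clone() (or clone3()), so a clone() in the trace can be a new thread in the same process rather than a subprocess. A minimal sketch for checking this from inside the process, assuming Linux with /proc mounted: count the kernel threads before and after an operation. The helper name count_os_threads is hypothetical, and a plain Python thread stands in for the index.search() call here.

```python
import os
import threading


def count_os_threads():
    """Return the number of kernel-level threads in this process.

    Reads /proc/self/task (Linux only). Returns None when that
    directory does not exist, e.g. on non-Linux systems.
    """
    task_dir = "/proc/self/task"
    if not os.path.isdir(task_dir):
        return None
    return len(os.listdir(task_dir))


# Stand-in for a search call: start one extra thread and keep it alive
# while we count. In a real check, call index.search() here instead and
# compare the counts taken before and after several calls.
before = count_os_threads()
stop = threading.Event()
worker = threading.Thread(target=stop.wait)
worker.start()
during = count_os_threads()
stop.set()
worker.join()

print("threads before:", before, "threads while worker runs:", during)
```

If the count rises on the first search and then stays flat across later searches, that would suggest the threads are being kept and reused rather than recreated on every call.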
Question 2:
I need clarification on this sentence: "However it is very inefficient to call batches of queries from multiple threads, this will spawn more threads than CPU cores."
How can I make sure that the number of threads created is <= the number of CPU cores?
For example, assuming I have 2 threads calling index.search() and I set OMP_NUM_THREADS=4, does that mean 8 is the maximum number of threads OpenMP will create to do the search?
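The arithmetic behind this example can be sketched as follows. This assumes that each calling thread gets its own OpenMP team of up to OMP_NUM_THREADS threads, with the caller counted as one member of its team; that is an assumption about the OpenMP runtime, not something the wiki page states. The helper names omp_threads_per_caller and worst_case_threads are hypothetical. faiss.omp_set_num_threads() is the Faiss call for setting the OpenMP thread count at runtime, and it is only invoked here if faiss is installed.

```python
import os

try:
    import faiss  # optional; only needed to apply the budget
except ImportError:
    faiss = None


def worst_case_threads(n_callers, omp_threads):
    """Upper bound on busy threads, ASSUMING one OpenMP team per caller."""
    return n_callers * omp_threads


def omp_threads_per_caller(n_callers, n_cores=None):
    """Split the cores evenly so that n_callers teams fit on n_cores."""
    if n_callers < 1:
        raise ValueError("n_callers must be at least 1")
    if n_cores is None:
        n_cores = os.cpu_count() or 1
    # Never go below one thread per caller, even if callers > cores.
    return max(1, n_cores // n_callers)


# The example from the question: 2 calling threads, OMP_NUM_THREADS=4.
print(worst_case_threads(n_callers=2, omp_threads=4))  # 2 * 4

# Budget for a hypothetical 8-core machine shared by 2 calling threads.
per_caller = omp_threads_per_caller(n_callers=2, n_cores=8)
if faiss is not None:
    faiss.omp_set_num_threads(per_caller)
```

Under that assumption, keeping callers times OpenMP threads at or below the core count is what bounds the total; with many calling threads this degenerates to one OpenMP thread per caller.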
Running on:
Interface: