This repository has been archived by the owner on Aug 31, 2021. It is now read-only.
What is the impact of using dense vectors? #33
Comments
Hi Johann, thanks for the question. It should work just fine with dense
vectors. The algorithm's design choices are optimized for sparse vectors,
but it can work on dense vectors too. I would note that there are very
good libraries for dense vector search that should beat this approach
(Faiss, Annoy).
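To illustrate why a sparse-oriented design gains little on dense input, here is a minimal sketch in plain Python. It is not pysparnn's actual implementation; the helper names `to_sparse` and `sparse_dot` are hypothetical. It only shows that a sparse dot product touches the nonzero entries, so a fully dense d=300 vector gets no savings from the sparse representation.

```python
def to_sparse(dense):
    """Store only the nonzero entries as an {index: value} dict (illustrative format)."""
    return {i: v for i, v in enumerate(dense) if v != 0.0}


def sparse_dot(a, b):
    """Dot product of two {index: value} vectors; work scales with the smaller one."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the vector with fewer stored entries
    return sum(v * b[i] for i, v in a.items() if i in b)


d = 300
# A mostly-zero vector: only indices 0, 100 and 200 are nonzero.
sparse_vec = to_sparse([1.0 if i % 100 == 0 else 0.0 for i in range(d)])
# A fully dense vector: every one of the d entries is stored.
dense_vec = to_sparse([1.0] * d)

print("stored entries (sparse input):", len(sparse_vec))
print("stored entries (dense input): ", len(dense_vec))
print("dot(sparse, dense):", sparse_dot(sparse_vec, dense_vec))
```

With dense input every coordinate is stored and visited, so the per-comparison cost grows with the full dimension rather than with the number of nonzeros.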
On Fri, Mar 5, 2021 at 9:15 AM Johann Petrak wrote:
> From a very quick test with a small index, this seems to work well with
> dense vectors (I tried d=300), but is there any specific impact of using
> dense vectors on performance for building or searching the index?
Thank you! Thanks also for pointing out those other libraries -- I already looked at annoy, but unlike pysparnn, annoy does not seem to support adding to an index. At least this is not documented anywhere. Faiss does look very interesting though! Is there a rough estimate for how pysparnn would compare to Faiss with regard to performance and precision/recall?
If I recall correctly, Faiss has a hierarchical navigable small world (HNSW)
graph implementation that should do very well on speed, precision, and recall.
That implementation should work on both dense and sparse vectors. I'm not sure
about incremental updates to that specific implementation. I would check their
benchmarks. For dense vectors I remember annoy being at least 2x faster
than pysparnn (I could be wrong), and I think Faiss has benchmarks comparing
it to annoy.
I would expect Faiss to do significantly better than pysparnn on
speed and accuracy. Definitely worth trying out.
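For anyone wanting to try such a comparison themselves, one common approach is to measure recall@k against an exact brute-force search. The sketch below is plain Python and does not call pysparnn, annoy, or Faiss; `candidate_subset_search` is a hypothetical stand-in for whatever approximate index is being evaluated, and the data is random, so the printed number says nothing about any real library.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two dense vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Brute-force ground truth: rank every vector by similarity to the query."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine_similarity(query, vectors[i]),
                    reverse=True)
    return ranked[:k]


def candidate_subset_search(query, vectors, k, candidates):
    """Hypothetical approximate search: only a subset of the index is scored."""
    ranked = sorted(candidates,
                    key=lambda i: cosine_similarity(query, vectors[i]),
                    reverse=True)
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbours that the approximate search found."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
d = 300  # same dimensionality as in the question above
vectors = [[random.gauss(0, 1) for _ in range(d)] for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(d)]

exact = exact_top_k(query, vectors, k=10)
# Assumption for illustration: the "index" only looks at every second vector.
approx = candidate_subset_search(query, vectors, 10, candidates=range(0, 200, 2))
print("recall@10:", recall_at_k(approx, exact))
```

To compare real libraries, the stand-in function would be replaced by each library's own query call, with wall-clock timing added around index building and searching.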