Integrating with Redis or another external storage layer is definitely possible. However, I would consider the I/O cost of external storage: the original sets and the posting lists (the data structure used in this library) can be much bigger than MinHash signatures and LSH indexes, so a Python compute layer on top of a Redis/Cassandra storage layer may be inefficient because of the large number of I/Os. A more efficient implementation needs to take these costs into account, which adds a lot of complexity. I do have an algorithm that solves this problem (JOSIE, VLDB 2019, Github), but I haven't had time to write a production-ready library for it.
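To make the I/O concern concrete, here is a minimal sketch of why posting lists can be costly to fetch from a remote store. It is not this library's implementation: `build_posting_lists` and `query_io_cost` are hypothetical helpers written for illustration, and the in-memory dictionary stands in for an external key-value store where every posting-list lookup would be a network read. The point is that the number of entries read grows with how common the query tokens are, whereas a MinHash signature has a fixed size per set.

```python
from collections import defaultdict


def build_posting_lists(sets):
    """Map each token to the list of set IDs that contain it."""
    index = defaultdict(list)
    for set_id, tokens in enumerate(sets):
        for token in set(tokens):
            index[token].append(set_id)
    return index


def query_io_cost(index, query):
    """Return (posting lists fetched, total entries read) for one query.

    With an external store, each fetched list would be one round trip,
    and the entries read approximate the bytes transferred.
    """
    lists_read = 0
    entries_read = 0
    for token in set(query):
        posting = index.get(token)
        if posting is not None:
            lists_read += 1
            entries_read += len(posting)
    return lists_read, entries_read


# Toy data: token "a" appears in every set, so its posting list is the longest.
sets = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "e"],
    ["a", "b", "c", "d"],
]
index = build_posting_lists(sets)
lists_read, entries_read = query_io_cost(index, ["a", "b", "c"])
print(lists_read, entries_read)
```

In this toy example a three-token query already touches most of the index, because frequent tokens have long posting lists. With real data and a remote store, those reads are the cost a more careful implementation would have to manage.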
Is there any possibility of integrating with Redis or Cassandra as a storage layer, as MinHash LSH already does?