Hello,
I am just getting started with this API. My use case is as follows:
In a large file of about 100 million lines, I would like to discard every line whose Jaccard similarity to another line is greater than some threshold (0.7, for instance).
I looped over the file once with MinHash.bulk to compute and store the signatures.
Then I compare the lines pairwise in a double loop => very slow.
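For reference, here is a self-contained pure-Python sketch of that pairwise approach. It is not the datasketch implementation: the "permutations" are simulated by salting a single hash function, and the example data is made up.

```python
import hashlib

NUM_PERM = 64

def minhash(tokens, num_perm=NUM_PERM):
    # Simulate num_perm hash permutations by salting one hash function;
    # each signature slot keeps the minimum hash value seen over the tokens.
    sig = []
    for i in range(num_perm):
        salt = str(i).encode()
        sig.append(min(
            int.from_bytes(hashlib.sha1(salt + t.encode()).digest()[:8], "big")
            for t in tokens
        ))
    return tuple(sig)

def est_jaccard(sig_a, sig_b):
    # The fraction of matching slots is an unbiased estimate of the
    # Jaccard similarity of the underlying token sets.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

lines = ["the quick brown fox", "the quick brown dog", "lorem ipsum dolor sit"]
sigs = [minhash(line.split()) for line in lines]

# The quadratic part: every line compared against every other line.
# Fine for 3 lines, but roughly 5e15 comparisons for 100 million lines.
for i in range(len(sigs)):
    for j in range(i + 1, len(sigs)):
        print(i, j, round(est_jaccard(sigs[i], sigs[j]), 2))
```

The double loop at the bottom is exactly the bottleneck described above: the cost grows with the square of the number of lines, regardless of how cheap each signature comparison is.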
The same question applies when comparing File1 against File2.
Is there a faster way to accomplish this?
Thanks
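On the "faster way" question: the standard answer is locality-sensitive hashing, which datasketch exposes as MinHashLSH (insert each signature once, then query for candidates above a similarity threshold, instead of comparing all pairs). Below is a pure-Python sketch of the banding idea that technique is based on, not the library API; `lsh_candidates` and its parameters are illustrative names.

```python
from collections import defaultdict

def lsh_candidates(sigs, bands=16, rows=4):
    # Split each signature into `bands` bands of `rows` slots. Two lines
    # become candidates only if they agree exactly on at least one band,
    # so highly similar pairs are found with high probability while most
    # dissimilar pairs are never compared at all.
    buckets = defaultdict(list)
    for idx, sig in enumerate(sigs):
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])
            buckets[key].append(idx)
    pairs = set()
    for ids in buckets.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                pairs.add((ids[i], ids[j]))
    return pairs

# Tiny demo with 8-slot signatures: only the identical pair collides.
sigs = [(7,) * 8, (7,) * 8, tuple(range(8))]
print(lsh_candidates(sigs, bands=2, rows=4))
```

The cost is one pass to fill the buckets plus pairwise checks only within each bucket, so for mostly dissimilar lines it is close to linear in the number of lines. The band/row split (here 16 x 4 over a 64-slot signature) tunes the threshold; in datasketch the `threshold` argument to MinHashLSH chooses this for you.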