Speeding up HeavyHitters add function #99
Hi, thanks for the amazing library. I am trying to speed up the …
Creating 100_000 items
Using pandas
Vs. using pyprobables
In the add function, in the line …
Replies: 1 comment
I am glad that you have found this package helpful. The default fnv-1a hash can be slow because it is a pure Python implementation. There are several ways to speed it up, such as using a hashing function implemented in C. For example, the documentation shows how to set up a custom hashing strategy or use custom hashing functions.

If you do not want to build your own system, there are additional functions that can be used, default_md5 and default_sha256, which are both implemented in C and would be faster than the default fnv-1a code. If you want to use something else, there are decorators that can be used to help ensure that you get the c…

The code uses a list of hash results because of how Count-min Sketches work (the Heavy Hitter is built upon one). It uses a set of hash values (of the same input) to generate a fingerprint of the locations within a …

I hope this is informative and helpful!
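To make the "list of hash results" idea concrete, here is a minimal, standard-library-only sketch of a count-min structure. It is an illustration and not the pyprobables implementation: the names `md5_depth_hashes` and `TinyCountMin` are hypothetical, and the way the hashes are chained (re-hashing the previous digest) is an assumption made for the example. It uses `hashlib.md5`, which is implemented in C, to show the kind of hash function the reply recommends.

```python
import hashlib
import struct
from typing import List


def md5_depth_hashes(key: str, depth: int) -> List[int]:
    """Return `depth` 64-bit integers derived from `key` with hashlib's MD5.

    Hypothetical helper: chaining digests is an assumption for illustration,
    not necessarily how pyprobables derives its hash values.
    """
    results = []
    data = key.encode("utf-8")
    for _ in range(depth):
        data = hashlib.md5(data).digest()
        # Interpret the first 8 bytes of the digest as an unsigned 64-bit int.
        results.append(struct.unpack("<Q", data[:8])[0])
    return results


class TinyCountMin:
    """Toy count-min sketch: `depth` rows, each with `width` counters."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _locations(self, key: str) -> List[int]:
        # One column index per row; together they act as the key's fingerprint.
        return [h % self.width for h in md5_depth_hashes(key, self.depth)]

    def add(self, key: str, count: int = 1) -> None:
        for row, col in zip(self.rows, self._locations(key)):
            row[col] += count

    def check(self, key: str) -> int:
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest available estimate of the true count.
        return min(row[col] for row, col in zip(self.rows, self._locations(key)))


sketch = TinyCountMin(width=1000, depth=5)
for word in ["apple"] * 30 + ["banana"] * 5 + ["cherry"]:
    sketch.add(word)

print(sketch.check("apple"))  # an estimate that is never below the true count
```

Every `add` computes all `depth` hash values for the key, which is why the speed of the hash function dominates the cost of inserting many items.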