bucket size power of 2 limitation #7
Hi all,

If I am correct, rehashing always doubles the number of buckets. In practice, if the number of items is slightly larger than a power of two, the load factor will be only ~50%, which wastes memory. I would like to ask whether rehashing could grow the table by a smaller factor, say 1.5 or 1.3, which would be more memory-friendly. Thanks.

Comments
Hi voutcn,

We use powers of 2 to size the table since it's a fairly standard way to grow vectors and other resizable arrays, and because a power-of-two-sized table has some convenient properties for computing hashes in our model.

I'm not really sure resizing by a smaller factor would fix your problem. Say we sized by 1.5x: then if the number of items happened to be slightly larger than a power of 1.5, the load factor would still only be ~50%. Do you have a specific use case where you need a large table with slightly more elements than a power of 2?

Also remember that it is the number of buckets that is a power of 2, not the number of elements. By default, each bucket has 8 items, so we can actually store 2^n * 8 values for each power-of-two sizing of the table.
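For concreteness, here is a minimal sketch of the sizing scheme described above. The names (`hashpower`, `SLOT_PER_BUCKET`) echo libcuckoo's terminology, but this is an illustration under those assumptions, not the library's actual implementation:

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative: 2^hashpower buckets, each holding a fixed number of slots
// (8 per the comment above; the constant name is assumed, not quoted).
constexpr std::size_t SLOT_PER_BUCKET = 8;

// Total capacity for hashpower n: 2^n buckets * 8 slots per bucket.
constexpr std::size_t capacity(std::size_t hashpower) {
    return (std::size_t{1} << hashpower) * SLOT_PER_BUCKET;
}

// With a power-of-two bucket count, the bucket index is a single mask --
// one of the "convenient properties" mentioned above.
inline std::size_t bucket_index(std::uint64_t hash, std::size_t hashpower) {
    return static_cast<std::size_t>(hash) & ((std::size_t{1} << hashpower) - 1);
}

static_assert(capacity(10) == 8192, "2^10 buckets * 8 slots each");
```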
I've actually been working on an implementation of fractional resizing for this. The trade-off is that it will be slower for insertion and access, but it makes much better use of memory.
That's interesting. Is the slowdown related to the hashing and bucket index computation?

-Manu
Hi Dinghua,

Growing the table size by a smaller factor (rather than 2x every time) is …

Best,
Bin
Yeah - we have to move to mod instead of mask to compute the indexes. I use a fast computation of the mod for a fixed divisor, but it's still slower than masking.
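To make the mask-vs-mod trade-off concrete, here is a sketch of one well-known "fast mod for a fixed divisor" technique (Lemire-style fastmod for 32-bit values, with the reciprocal precomputed once per resize). It is an illustration, not necessarily the exact computation referred to above:

```cpp
#include <cstdint>

// Lemire-style fastmod: for a fixed divisor d, precompute M = ceil(2^64 / d)
// once (e.g., at each resize), then replace the hardware divide with two
// multiplies. Still more work than the single AND a power-of-two size allows.
struct FastMod32 {
    uint64_t M;  // ceil(2^64 / d), precomputed
    uint32_t d;  // current (non-power-of-two) number of buckets
    explicit FastMod32(uint32_t divisor)
        : M(UINT64_MAX / divisor + 1), d(divisor) {}
    uint32_t mod(uint32_t n) const {
        uint64_t lowbits = M * n;  // wraps modulo 2^64 by design
        // High 64 bits of the 128-bit product lowbits * d equal n % d
        // (uses the GCC/Clang __int128 extension).
        return static_cast<uint32_t>(
            (static_cast<unsigned __int128>(lowbits) * d) >> 64);
    }
};

// Usage sketch: bucket index for a table grown by a non-power-of-two factor.
// FastMod32 fm(num_buckets);
// size_t idx = fm.mod(static_cast<uint32_t>(hash));
```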
Thank you all. @manugoyal Theoretically, resizing by say 1.5x should increase the lower bound of the load factor: if the number of elements happens to be slightly larger than a power of 1.5, the load factor will be ~1/1.5 ≈ 2/3.
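A quick illustration of that bound (a standalone toy, not part of the library): right after a resize by growth factor g, the table holds just over its old capacity C but now has capacity g*C, so the load factor bottoms out near 1/g.

```cpp
#include <cstdio>

int main() {
    // Worst case right after a resize by factor g: ~C items in g*C slots.
    for (double g : {2.0, 1.5, 1.3}) {
        std::printf("grow %.1fx -> worst-case load factor ~ 1/%.1f = %.0f%%\n",
                    g, g, 100.0 / g);
    }
}
// grow 2.0x -> worst-case load factor ~ 1/2.0 = 50%
// grow 1.5x -> worst-case load factor ~ 1/1.5 = 67%
// grow 1.3x -> worst-case load factor ~ 1/1.3 = 77%
```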
Following up on this - sorry for the very long delay. I now know how to do this in a reasonably efficient way, but it's going to be a somewhat complex implementation path to get this version of the library to handle it cleanly. Will start a design doc...
Hi David, may I ask if there is any update?