Swap fxhash for foldhash due to hash quality issues #3111
Merged
arqunis
approved these changes
Feb 13, 2025
arqunis
pushed a commit
to arqunis/serenity
that referenced
this pull request
Feb 14, 2025
…3111) As the title notes, this commit replaces fxhash with foldhash in the cache. dashmap, because of its sharding, has to share the hash's entropy with the internal maps it hands the hash down to. Since `hashbrown`, and by extension `std`, use various sections of the high bit range for special grouping and sorting, dashmap's only option is to shard on the low bits. This, however, presents problems, because fxhash outputs hashes of very poor quality, with only the high bits having any real entropy. It was probably a solid choice back in 2018, when we lacked other good fast alternatives. But since then `ahash` has matured, and there has been significant research and development in "good enough" hashing for data structures with short keys, [the most recent step forward coming from a rather well-known face][foldhash]. Switching to foldhash improves shard selection quite a bit and reduces contention significantly. Using fxhash in a dashmap-specific benchmark causes contention to go up by 3-8x when keys are k-sortable with time (Discord snowflakes) on an M1 Pro. [foldhash]: https://github.com/orlp/foldhash
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Mar 5, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Mar 5, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Mar 7, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Mar 10, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Mar 11, 2025
GnomedDev
pushed a commit
to GnomedDev/serenity
that referenced
this pull request
Mar 26, 2025
GnomedDev
pushed a commit
to GnomedDev/serenity
that referenced
this pull request
Mar 26, 2025
GnomedDev
pushed a commit
that referenced
this pull request
Apr 28, 2025
GnomedDev
pushed a commit
that referenced
this pull request
May 19, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Jun 30, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Jun 30, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Jun 30, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Jul 28, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Jul 28, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Oct 7, 2025
…-rs#3111) As the title notes, this commit replaces rustc-hash with foldhash in the cache. dashmap, because of its sharding, has to share the hash's entropy with the internal maps it hands the hash down to. Since `hashbrown`, and by extension `std`, use various sections of the high bit range for special grouping and sorting, dashmap's only option is to shard on the low bits. This, however, presents problems, because rustc-hash outputs hashes of very poor quality, with only the high bits having any real entropy. It was probably a solid choice back in 2018, when we lacked other good fast alternatives. But since then `ahash` has matured, and there has been significant research and development in "good enough" hashing for data structures with short keys, [the most recent step forward coming from a rather well-known face][foldhash]. Switching to foldhash improves shard selection quite a bit and reduces contention significantly. Using rustc-hash in a dashmap-specific benchmark causes contention to go up by 3-8x when keys are k-sortable with time (Discord snowflakes) on an M1 Pro. [foldhash]: https://github.com/orlp/foldhash
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Oct 7, 2025
mkrasnitski
pushed a commit
to mkrasnitski/serenity
that referenced
this pull request
Oct 7, 2025
Labels
cache
Related to the `cache`-feature.
dependencies
Related to Serenity dependencies.
enhancement
An improvement to Serenity.
As the title notes, this PR replaces fxhash with foldhash in the cache. dashmap, because of its sharding, has to share the hash's entropy with the internal maps it hands the hash down to. Since `hashbrown`, and by extension `std`, use various sections of the high bit range for special grouping and sorting, dashmap's only option is to shard on the low bits.

This, however, presents problems, because fxhash outputs hashes of very poor quality, with only the high bits having any real entropy. It was probably a solid choice back in 2018, when we lacked other good fast alternatives. But since then `ahash` has matured, and there has been significant research and development in "good enough" hashing for data structures with short keys, [the most recent step forward coming from a rather well-known face][foldhash].

Switching to foldhash improves shard selection quite a bit and reduces contention significantly. Using fxhash in a dashmap-specific benchmark causes contention to go up by 3-8x when keys are k-sortable with time (Discord snowflakes) on an M1 Pro.

[foldhash]: https://github.com/orlp/foldhash
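To make the low-bit problem concrete, here is a minimal sketch that uses only the standard library. It rests on several assumptions that are not taken from this PR: `fx_hash_u64` hand-rolls the multiply step of the original 64-bit Fx hash rather than calling the crate; `folded_hash_u64` is a simplified folded multiply in the spirit of foldhash, not the crate's actual algorithm; `shard_of` is a hypothetical low-bit shard picker, not dashmap's real shard-selection code; and the keys are an idealized worst case of snowflake-like IDs that differ only in their timestamp bits.

```rust
use std::collections::HashSet;

/// Multiplier used by the 64-bit variant of the original Fx hash.
const FX_SEED: u64 = 0x51_7c_c1_b7_27_22_0a_95;

/// Fx-style hash of one u64 from an empty state:
/// (0.rotate_left(5) ^ key) * FX_SEED, which reduces to key * FX_SEED.
/// The low k bits of the product depend only on the low k bits of the key.
fn fx_hash_u64(key: u64) -> u64 {
    (0u64.rotate_left(5) ^ key).wrapping_mul(FX_SEED)
}

/// Simplified folded multiply (illustrative assumption, not foldhash itself):
/// widen to 128 bits, multiply, then XOR the high half into the low half so
/// that high-bit entropy reaches the low bits too.
fn folded_hash_u64(key: u64) -> u64 {
    let full = (key as u128) * (FX_SEED as u128);
    (full as u64) ^ ((full >> 64) as u64)
}

/// Hypothetical shard picker that uses the low bits of the hash.
/// `shard_count` must be a power of two.
fn shard_of(hash: u64, shard_count: u64) -> u64 {
    hash & (shard_count - 1)
}

/// Snowflake-like keys: a counter standing in for the timestamp, placed in
/// bits 22 and up, with the low 22 bits left at zero (idealized worst case).
fn snowflake_like_keys(n: u64) -> Vec<u64> {
    (0..n).map(|t| t << 22).collect()
}

/// Number of distinct shards the given keys land in under `hash`.
fn distinct_shards(keys: &[u64], hash: fn(u64) -> u64, shard_count: u64) -> usize {
    keys.iter()
        .map(|&k| shard_of(hash(k), shard_count))
        .collect::<HashSet<_>>()
        .len()
}
```

Calling `distinct_shards(&snowflake_like_keys(1024), fx_hash_u64, 64)` puts every key in a single shard, because the multiply never moves information from the high bits down into the low bits. The same call with `folded_hash_u64` spreads the keys over more than one shard. Real snowflakes also carry worker, process, and sequence bits in their low 22 bits, so real-world skew should be less extreme than in this sketch.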