Blog post about new hash table (WIP) #195

zuiderkwast · 2025-01-13T22:40:25Z

A first stab.

The new hash table is one of the highlights of the upcoming 8.1 release.

Improve structure and content of the text
Replace ascii art with other art
Decide which benchmarks we want
Add benchmark results

Signed-off-by: Viktor Söderqvist <[email protected]>

SoftlyRaining · 2025-01-16T23:09:28Z

content/blog/2025-03-20-new-hash-table.md

+| Valkey 8.0 | ? bytes                |
+| Valkey 8.1 | ? bytes                |
+
+The benchmarks below were run using a key size of N and a value size of M bytes, without pipelining.


Let's add something for set/zset/hash and see if we get even more performance and memory savings since those datatypes are hashtables inside of a hashtable. :)

Yes... Feel free to replace these tables with some completely different tests.

There's a fixed overhead for the key and then a per field-value overhead. Still, I'd like to see a table of memory savings per element/field/etc. for these types.

I want to do hash value embedding (to save the value pointer and an extra allocation), and Ran noticed that our embedded sds strings (key and field) are sds8 even when they should be sds5, so we could save two more bytes for those. That's because they're copied from an EMBSTR robj value, and those are always sds8. I have an idea for fixing that too, though.
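The "two more bytes" figure follows from the sizes of the two smallest sds headers. Below is a minimal sketch with local copies of those layouts (renamed `hdr5` and `hdr8` so they don't clash with `sds.h`). The helper names are hypothetical, only the two smallest header types are modelled, and the `packed` attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local, renamed copies of the two smallest sds header layouts. */
struct __attribute__((__packed__)) hdr5 {
    unsigned char flags; /* 3 bits of type, 5 bits of string length */
    char buf[];
};

struct __attribute__((__packed__)) hdr8 {
    uint8_t len;         /* bytes used */
    uint8_t alloc;       /* bytes allocated, excluding header and terminator */
    unsigned char flags; /* 3 bits of type, 5 bits unused */
    char buf[];
};

static_assert(sizeof(struct hdr5) == 1, "sds5 header is one byte");
static_assert(sizeof(struct hdr8) == 3, "sds8 header is three bytes");

/* Hypothetical helper: header bytes needed by the smallest type that fits.
 * A 5-bit length field holds lengths 0..31; longer strings fall back to the
 * sds8 header here (larger header types are not modelled in this sketch). */
static size_t smallest_header_size(size_t len) {
    return len < 32 ? sizeof(struct hdr5) : sizeof(struct hdr8);
}

/* Hypothetical helper: bytes saved per embedded string compared with always
 * using the sds8 header. */
static size_t bytes_saved(size_t len) {
    return sizeof(struct hdr8) - smallest_header_size(len);
}
```

For any embedded key or field shorter than 32 bytes, `bytes_saved` returns 2, which matches the saving mentioned above.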

SoftlyRaining · 2025-01-16T23:48:22Z

content/blog/2025-03-20-new-hash-table.md

+Why not use an open-source state-of-the-art hash table implementation such as
+Swiss tables? The answer is that we require some specific features, apart from
+the basic operations like add, lookup, replace, delete:


Another reason: a Swiss table is very fast, but it stores the elements directly in a contiguous array, which requires that the elements all be the same size. Because our elements vary in size, we had to choose a different design - we chose cache-line sized buckets with element pointers. (This idea was mentioned at the end of the Swiss table talk - up to you if you want to make that reference though.)

No, you can store pointers in a Swiss table, just like how we store pointers in our bucket layout. The pointers are the fixed-size elements, no?
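The point about pointers being the fixed-size elements can be made concrete. Below is a minimal sketch of a cache-line sized bucket, assuming a 64-byte cache line and 8-byte pointers. The field names (`presence`, `tags`, `entries`), the seven-slot split, and both helper functions are illustrative assumptions, not necessarily the layout used in the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 7 /* assumed: 8 metadata bytes + 7 pointers = 64 bytes */

typedef struct bucket {
    uint8_t presence;           /* bit i is set when slot i is occupied */
    uint8_t tags[SLOTS];        /* one byte of each entry's hash, for cheap filtering */
    const char *entries[SLOTS]; /* fixed-size pointers to variable-size entries */
} bucket;

/* Only checked on 64-bit targets, where the sizes above add up to 64. */
static_assert(sizeof(void *) != 8 || sizeof(bucket) == 64,
              "bucket should fill exactly one 64-byte cache line");

/* Stores the entry in the first free slot. Returns 1 on success, 0 if full. */
static int bucket_add(bucket *b, uint8_t tag, const char *entry) {
    for (int i = 0; i < SLOTS; i++) {
        if (!(b->presence & (1u << i))) {
            b->presence |= (uint8_t)(1u << i);
            b->tags[i] = tag;
            b->entries[i] = entry;
            return 1;
        }
    }
    return 0;
}

/* Compares the one-byte tag first, so the pointer is only dereferenced
 * for slots that are likely matches. Returns NULL when not found. */
static const char *bucket_find(const bucket *b, uint8_t tag, const char *key) {
    for (int i = 0; i < SLOTS; i++) {
        if ((b->presence & (1u << i)) && b->tags[i] == tag &&
            strcmp(b->entries[i], key) == 0) {
            return b->entries[i];
        }
    }
    return NULL;
}
```

The entries here are plain C strings of different lengths, standing in for variable-size key-value entries. The bucket itself stays the same size regardless of what the pointers refer to.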

I don't think it allows a custom key-value entry design like ours, though. It can be either a set or a map (key and value), IIUC.

I think we could have picked an off-the-shelf implementation even if we couldn't embed key and value the way we do, as long as it would be better than dict. It's good to use a battle-tested, ready-to-use one too. It's easier to get it right, and less work... I think scan and incremental rehashing were clearly blockers though.
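For readers unfamiliar with incremental rehashing, here is a minimal sketch of the general idea: when the table grows, the old bucket array is kept around and drained one bucket per operation, so no single call pays for the whole resize. This is a generic chained table with hypothetical names (`table`, `rehash_step`, and so on), not the implementation discussed in this PR. It assumes distinct keys and power-of-two sizes, omits allocation error handling, and does not cover scan.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct node { uint64_t key; struct node *next; } node;

typedef struct table {
    node **old_buckets; /* array being drained, or NULL when not rehashing */
    node **new_buckets; /* current array; all inserts go here */
    size_t old_size, new_size; /* both powers of two */
    size_t rehash_idx;  /* next old bucket to migrate */
    size_t count;
} table;

static size_t slot(uint64_t key, size_t size) {
    /* Multiplicative hash, then mask down to the table size. */
    return (size_t)((key * 0x9E3779B97F4A7C15ULL) >> 32) & (size - 1);
}

static void table_init(table *t, size_t size) {
    t->old_buckets = NULL;
    t->old_size = 0;
    t->rehash_idx = 0;
    t->count = 0;
    t->new_size = size;
    t->new_buckets = calloc(size, sizeof(node *));
}

/* Migrates one old bucket to the new array: bounded work per call. */
static void rehash_step(table *t) {
    if (!t->old_buckets) return;
    node *n = t->old_buckets[t->rehash_idx];
    while (n) {
        node *next = n->next;
        size_t i = slot(n->key, t->new_size);
        n->next = t->new_buckets[i];
        t->new_buckets[i] = n;
        n = next;
    }
    t->old_buckets[t->rehash_idx] = NULL;
    if (++t->rehash_idx == t->old_size) { /* old array fully drained */
        free(t->old_buckets);
        t->old_buckets = NULL;
    }
}

static void table_add(table *t, uint64_t key) {
    rehash_step(t);
    if (!t->old_buckets && t->count >= t->new_size) {
        /* Start growing: allocate a double-size array, migrate lazily. */
        t->old_buckets = t->new_buckets;
        t->old_size = t->new_size;
        t->new_size *= 2;
        t->new_buckets = calloc(t->new_size, sizeof(node *));
        t->rehash_idx = 0;
    }
    node *n = malloc(sizeof(*n));
    n->key = key;
    size_t i = slot(key, t->new_size);
    n->next = t->new_buckets[i];
    t->new_buckets[i] = n;
    t->count++;
}

static int table_contains(table *t, uint64_t key) {
    rehash_step(t);
    if (t->old_buckets) { /* mid-rehash: the key may still be in the old array */
        for (node *n = t->old_buckets[slot(key, t->old_size)]; n; n = n->next)
            if (n->key == key) return 1;
    }
    for (node *n = t->new_buckets[slot(key, t->new_size)]; n; n = n->next)
        if (n->key == key) return 1;
    return 0;
}
```

The cost of this approach is that lookups must check both arrays while a rehash is in progress, which is one reason an off-the-shelf table without this feature cannot simply be dropped in.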

Blog post about new hash table (WIP)

83491b6

Signed-off-by: Viktor Söderqvist <[email protected]>

SoftlyRaining reviewed Jan 16, 2025


zuiderkwast commented Jan 13, 2025

SoftlyRaining Jan 16, 2025

zuiderkwast Jan 16, 2025

SoftlyRaining Jan 16, 2025

zuiderkwast Jan 17, 2025

zuiderkwast Jan 17, 2025
