-
Notifications
You must be signed in to change notification settings - Fork 2
Open
Description
Once we have a reasonably good hash function
we can start implementing a first simple hash table.
Again the design space is quite big and there are various trade-offs to think about.
Good starting points
- https://en.wikipedia.org/wiki/Hash_table
- https://en.wikipedia.org/wiki/Open_addressing
- https://en.wikipedia.org/wiki/Linear_probing
To stay without our philosophy of simple and practical solutions, I believe we could start with the following design
- Use a single flat array storing (key-value) pairs based on the key's hash interpreted as an index
- If there is a collision, use open addressing and in particular linear probing, which is good on cache hierarchies and simple to implement
- If the load factor is higher than around 0.7-0-8 we resize and rehash
Pros
- simple to implement
- simple to reason about
- good on caches since linear probing simply does a linear scan on collision
Cons
- clustering, worst case linear probing needs to make a full scan, that's why it's important to have a good hash function and keep load low
- probably not the most efficient or best in class
Overall seems like a good pragmatic solution.
Note that we can start e.g. with a uint32 key-value hash table, and if users want to hash their own type it's possible by storing them in a flat array and using the hash table for indirection.
Metadata
Metadata
Assignees
Labels
No labels