Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Relational databases do mostly range queries. (e.g. SELECT a WHERE b > ?). Hash tables suck for that since in general they reshuffle (hash) keys. In tree structures 1m key range access is 1 random lookup and 1m sequential (cache-friendly). In hash tables that becomes 1m random lookups and not cache friendly.

See section IV 4 E, "range queries" in "A comparison of adaptive radix trees and hash tables" (2015)

https://15721.courses.cs.cmu.edu/spring2019/papers/08-oltpin...

Even for the best hash table-based indexing B+Trees win. And combining this with the big cost of resizing (growing) or rehashing a table (rebalancing, i.e. Cuckoo) they are not useful.

Perhaps some adaptive hashing method could take care of this in the future.



I don't think this is correct (about mostly range queries). Scan the code base of any application, and I predict you will see 90%+ queries not involving an inequality. That is for OLTP. For analytics, the answer might be different.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: