This is quite cool. They shattered a 40-year-old conjecture on the speed bounds of map insertion. To understand the practical impact I'll have to read the paper, but given how ubiquitous hash tables are in CS, the potential is huge.
All we have to do is rename b-trees to “hash tables” and database lookup times will be changed forever.
“Data structures called hash tables”? Editor thinks this is some arcane little-used data technology.
Hashes are used in literally every programming language that’s worth using.
What are you on about? The first paragraph says hash tables are “widely used” and the second says they are “common”.
“Data structures called hash tables” seems to just be a factual statement of what we’re dealing with, especially for people who are not programmers.
It just feels bizarre. You wouldn’t say something like “a car part called a catalytic converter”.
A catalytic converter also might be part of a wood stove. A lay-person may have no idea what a hash table is, or if they are also used in fields other than computer science.
Why would you specify that it’s a turbo encabulator, everyone knows they’re all turbo.
It feels just as odd to say “a component of fuel consuming engines called a catalytic converter”. You’re missing the point.
Some of my best, most useful programs sort data from disparate sources into enormous Hash-Of-Hash structures to produce extremely insightful reports. And I wrote the first version 25 years ago.
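The hash-of-hash pattern mentioned above can be sketched in a few lines. This is a minimal illustration with made-up data (regions and product counts are hypothetical), not the commenter's actual program: records from disparate sources are keyed first by one field, then by another, accumulating totals for a report.

```python
from collections import defaultdict

# Hypothetical records from disparate sources: (region, product, quantity).
records = [
    ("east", "widgets", 10),
    ("east", "gadgets", 5),
    ("west", "widgets", 7),
    ("east", "widgets", 3),
]

# Hash-of-hash: outer hash keyed by region, inner hash keyed by product.
report = defaultdict(lambda: defaultdict(int))
for region, product, qty in records:
    report[region][product] += qty

# report["east"]["widgets"] now holds the summed quantity for that pair.
```

Each lookup and update is an (expected) constant-time hash operation, which is why this pattern scales well for report generation.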
Oh yes, the new revolutionary data structure called a hash table
Did you read the article? The claim is that they have invented a new kind of hash table that has vastly improved algorithmic complexity compared to standard hash tables.
I haven’t read the paper yet, but if what the article claims is true, it could be revolutionary in computer science and open up a ton of doors.
I’m very skeptical
You can read the full paper yourself here: https://arxiv.org/pdf/2501.02305.
I haven’t had time to fully read it yet, but glancing through, it looks pretty legit.
This is a graduate computer science student working with accomplished CS faculty at Rutgers and Carnegie Mellon; we aren't talking about some rando making outlandish claims.
The thing about theoretical computer science is that, like math, it isn't subject to the pitfalls of empirical science. It isn't dependent on reproduction. The proof is provided in the paper, so either it indeed proves what it claims, or the proof contains an error that can readily be pointed out.
I need to look more into this, I would’ve thought query time on hash tables was already constant.
Only if there are no collisions. With lots of collisions it's far from constant.
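To make the collision point concrete, here's a toy separate-chaining hash table (a sketch for illustration, unrelated to the paper's construction). With a hash function that spreads keys out, lookups probe short chains; with a degenerate hash function that sends every key to one bucket, a lookup can scan every stored key.

```python
class ChainedHashTable:
    """Toy separate-chaining hash table to illustrate collision cost."""

    def __init__(self, num_buckets=8, hash_fn=hash):
        self.buckets = [[] for _ in range(num_buckets)]
        self.hash_fn = hash_fn

    def insert(self, key, value):
        bucket = self.buckets[self.hash_fn(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key):
        # Returns (value, probes); probes counts chain entries examined,
        # so it grows with the length of the collision chain.
        bucket = self.buckets[self.hash_fn(key) % len(self.buckets)]
        probes = 0
        for k, v in bucket:
            probes += 1
            if k == key:
                return v, probes
        return None, probes

good = ChainedHashTable()                    # default hash spreads keys
bad = ChainedHashTable(hash_fn=lambda k: 0)  # worst case: everything collides
for i in range(100):
    good.insert(i, i)
    bad.insert(i, i)
```

Looking up key 99 in `bad` walks all 100 entries in the single chain, while `good` only scans the handful of keys sharing one of its eight buckets: constant expected time, but linear in the worst case.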
This is very cool news. Would be nice if it had some details on the implementation though
Time to read the paper I guess
Agreed, I figured they'd have at least some pseudocode, but alas.
For those in a rush:
Initial paper outlining theorem (2021):
https://arxiv.org/pdf/2111.12800
Paper that demonstrates and proves its validity (2025):
https://arxiv.org/pdf/2501.02305
I tried a quick search, but I'm not seeing any public implementations that specifically mention or cite 'Krapivin' or 'Tiny Pointers' anywhere.