Improving the hash function #673

Open
wants to merge 3 commits into
base: master


ThibaultDECO
Contributor

Use Szudzik's pairing function, which packs values more efficiently than the previous approach or Cantor's pairing function.

@ThibaultDECO
Contributor Author

As discussed in issue #648.

@tfussell
Owner

tfussell commented Dec 3, 2022

I didn't know about that algorithm--it's pretty cool! Could you explain how it improves performance though? It seems like the operations are about the same (bit shift ~= multiplication, addition ~= boolean OR).

@tfussell
Owner

tfussell commented Dec 3, 2022

If the pairing function is a perfect hash function, maybe we could remove the call to std::hash<size_t> entirely and simply return the 1-d value as the result of the custom hash. I could see that being much faster.

@ThibaultDECO
Contributor Author

The Szudzik function has 100% value packing efficiency: pairing two 32-bit numbers always produces a result that fits in 64 bits, so it never overflows. That is not the case for Cantor's function or for the current implementation. For example:
cantor(9, 9) = 180
szudzik(9, 9) = 99
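
For reference, both pairing functions can be sketched in a few lines of C++. This is a minimal illustration, not the code in this pull request: the names `cantor_pair` and `szudzik_pair` are made up, inputs are assumed to be unsigned 32-bit values, and C++14 is assumed for the `constexpr` bodies.

```cpp
#include <cstdint>

// Cantor pairing: (a + b)(a + b + 1) / 2 + b.
// For large 32-bit inputs the product no longer fits in 64 bits and wraps
// around, so distinct pairs can collide.
constexpr std::uint64_t cantor_pair(std::uint32_t a, std::uint32_t b)
{
    const std::uint64_t s = static_cast<std::uint64_t>(a) + b;
    return s * (s + 1) / 2 + b;
}

// Szudzik pairing: a * a + a + b when a >= b, otherwise b * b + a.
// Any two 32-bit inputs map to a distinct value that fits in 64 bits.
constexpr std::uint64_t szudzik_pair(std::uint32_t a, std::uint32_t b)
{
    const std::uint64_t x = a;
    const std::uint64_t y = b;
    return x >= y ? x * x + x + y : y * y + x;
}

// The example from this comment, checked at compile time.
static_assert(szudzik_pair(9, 9) == 99, "szudzik(9, 9) should be 99");

// The largest pair of inputs lands exactly on the largest 64-bit value.
static_assert(szudzik_pair(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFFFFFFFFFFull,
              "the maximum inputs should fill 64 bits exactly");
```

The second `static_assert` is what "100% packing efficiency" means in practice: the 2^64 possible input pairs cover the 64-bit output range exactly.
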
You will find more details and a benchmark on performance here.
We could indeed remove the call to std::hash<size_t>, but only if size_t is 64 bits wide, which is unfortunately not always the case.
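
A rough sketch of how that condition could be handled, under stated assumptions: `cell_key` and `cell_key_hash` are hypothetical names rather than types from this project, and the column and row are assumed to be unsigned 32-bit integers. When `size_t` is at least 64 bits wide, the paired value is returned directly; otherwise it is still passed through `std::hash`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical (column, row) key, used only for this illustration.
struct cell_key
{
    std::uint32_t column;
    std::uint32_t row;
};

// Hypothetical hash functor built on Szudzik's pairing function.
struct cell_key_hash
{
    // Szudzik pairing of two 32-bit values into one 64-bit value.
    static std::uint64_t pair(std::uint32_t a, std::uint32_t b)
    {
        const std::uint64_t x = a;
        const std::uint64_t y = b;
        return x >= y ? x * x + x + y : y * y + x;
    }

    std::size_t operator()(const cell_key &key) const
    {
        const std::uint64_t paired = pair(key.column, key.row);

        if (sizeof(std::size_t) >= sizeof(std::uint64_t))
        {
            // size_t can hold every paired value, so distinct keys
            // produce distinct hashes and no further hashing is needed.
            return static_cast<std::size_t>(paired);
        }

        // Narrower size_t: the paired value has to be reduced, so fall
        // back to the standard hash of the 64-bit value.
        return std::hash<std::uint64_t>()(paired);
    }
};
```

Such a functor would be passed as the hash template argument of a container, for example `std::unordered_map<cell_key, int, cell_key_hash>`, together with a matching equality comparison for `cell_key`.
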
