When I need to generate unique id\'s based on some information hashing is typica
ID: 650874 • Letter: W
Question
When I need to generate unique id's based on some information hashing is typical choice. However, sometimes that id needs to be of a particular size. I've seen a lot of schemes (HMAC-MD5-96 in SSH, CGA in SeND for IPv6) that use a portion of a hash, so I'm thinking it might be alright to use it that way.
Which properties of a hash are preserved and which do not apply anymore?
Obviously, with less bits, the chance of collision goes up, but can I still rely on the partial hash for uniform character distribution? What about the avalanche property? I'm guessing that if a hash modifies the entire hash upon the smallest change to the input, that would imply that a portion of that hash would be equally well changed. Am I thinking about this correctly?
Also, would these behaviors be different for different hashing algorithms?
Explanation / Answer
Yes, it is just fine to do what you describe. There are other ways to do it (e.g., using the digest of the object as a seed to a random number generator).
A good hash function should be designed such that even if you know the first n-1 bits, you should not be able to predict, with better than a probability of 0.5, the last bit. This is due to the avalanche effect. Another way of saything this is that no portion of the output of the hash function should be "better" than any other portion. I.e., using the even bits has the same strength as using the odd bits, which has the same strength as using the n/2 least significant bits, which is the same strength as the n/2 most significant bits.
Now, as you point out, the chance of collision goes up, but if you are okay with that, then the method you are describing is just fine. Otherwise, there would be a serious problem with the hash function.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.