Design – Is using hashes for primary keys a good idea?

database-design, design, programming-practices

The Austrian electronic ID card relies on so-called sector identifiers. For example, a hospital identifies a person by obtaining a sectorId for that person, which is computed roughly as follows:

sha1(personalId + "+" + prefix + sectorId); // prefix is constant and irrelevant

Is that a good idea? I think the possibility of collision, no matter how small, poses a risk.

In hash tables, when there's a collision, you have other means of establishing equality, but two primary keys can never be identical. That can be circumvented with a composite key, but then the point of a unique sector identifier is lost.

Is it OK to do that, and is there a good way to do it without it breaking at some point?

Best Answer

This earlier SO article explains how to calculate the collision probability. For SHA-1, the hash length b is 160 bits. Austria has fewer than 10 million inhabitants. Even if every living person in Austria is registered at a hospital with a unique person/sector ID, the collision probability is still less than 3.5 x 10^-35. That should be small enough for most practical purposes.
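To make the arithmetic easy to check, here is a minimal Python sketch. It assumes the standard birthday-bound approximation p ≈ 1 - exp(-n² / 2^(b+1)), which may differ in form from the formula in the linked article, and the function name `collision_probability` is just an illustrative choice.

```python
import math

def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that at least two of n random hashes collide.

    Assumes the usual birthday-bound approximation
    p ~= 1 - exp(-n^2 / 2^(bits + 1)) for a uniformly distributed hash.
    """
    exponent = -(n * n) / 2 ** (bits + 1)
    # expm1 keeps precision when the exponent is extremely close to zero
    return -math.expm1(exponent)

# 10 million identifiers (upper bound on Austria's population), SHA-1 = 160 bits
p = collision_probability(10_000_000, 160)
print(f"{p:.2e}")  # on the order of 3.4e-35
```

Plugging in a larger population or a shorter hash shows how quickly the margin shrinks: with only 64 bits, the same 10 million identifiers would give a probability of a few parts in a million.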
