Design strategies for storing and validating serial numbers

datadesignvalidation

We are writing software to track Foo Widgets. Each Foo Widget has a serial number. The serial number is an 32-character alphanumeric string. The string is separated into five sets. Each set is separated by a dash (so the s/n is 32-characters NOT including dashes).

So for example: 11111111-1111-1111-1111-111111111111. But this may change, since our software isn't actually creating the serial numbers.

I'd like to learn about different strategies for storing the serial number and doing user validation of the serial number in the UI.

To start with, I'd like get talk about strategies for storing the the serial number in our system. This issue came up at Foo Widgets Incorporated, and there was a disagreement about whether we should store each serial numbers with or without the dashes.

I think the most flexible way of doing this (but maybe not most simple) would be to store the serial number without the dashes, store the schema of the serial number (as a regular expression), and then create an identifier that is used to track which schema is used (so later if the manufacturer changes it we can support that and perhaps both schemas at the same time).

The counter argument to this was that the dashes "are a part of the data" and that it's "not like a phone number". I'm having some trouble understanding this point of view.

Best Answer

For the user validation part of your question, the guiding principle is Postel's law: be conservative in what you send, liberal in what you accept. That means you should always display the serial number with the dashes, but assuming you can correctly reconstruct the serial number without them, allow the users to enter serial numbers either with or without the dashes.

As for storage, it really depends on if you frequently need to do queries on subfields of the number. If you don't, it's most robust to store it in the preferred format for reading it, that is, with the dashes. That way a developer or database admin running a query from a non user-facing interface like a CLI won't have to parse it out in his head.

Related Topic