Storage Schema – Representing Integer Numbers from 0 to Infinity

numbersstorageuuid

I would like a schema to represent integer numbers starting with 0, without any limit (assuming access to infinite linear storage).

Here's a schema that can represent numbers from 0 to 255:

Use the first byte of the storage (address 0) to store the integer.

Now, suppose I want to represent numbers larger than 255. Of course, I could use more than 1 byte to represent the integer, but as long as it's a fixed number, there will be eventually an integer so large that it cannot be represented by the original schema.

Here's another schema that should be able to do the task, but it's probably far from efficient.

Just use some sort of unique "end of number" byte, and use all the previous bytes to represent the number. Obviously, this "end of number" byte cannot be used anywhere in the number representation, but this can be achieved by using a base-255 (instead of base-256) numbering system.

However, that's slow and probably inefficient. I want to have a better one that performs better with low values and scales well.

Essentially, it's a UUID system. I want to see if it's possible to create a fast-performing UUID system that can theoretically scale to use for years, thousands of years, millions of years, without having to be redesigned.

Best Answer

An approach I've used: count the number of leading 1 bits, say n. The size of the number is then 2^n bytes (including the leading 1 bits). Take the bits after the first 0 bit as an integer, and add the maximum value (plus one) that can be represented by a number using this encoding in 2^(n-1) bytes.

Thus,

                  0 = 0b00000000
                   ...
                127 = 0b01111111
                128 = 0b1000000000000000
                   ...
              16511 = 0b1011111111111111
              16512 = 0b11000000000000000000000000000000
                   ...
          536887423 = 0b11011111111111111111111111111111
          536887424 = 0b1110000000000000000000000000000000000000000000000000000000000000
                   ...
1152921505143734399 = 0b1110111111111111111111111111111111111111111111111111111111111111
1152921505143734400 = 0b111100000000000000000000000000000000000000000000 ...

This scheme allows any non-negative value to be represented in exactly one way.

(Equivalently, used the number of leading 0 bits.)

Related Topic