Floating-Point – How to Calculate Min/Max Values of Floating Point Numbers

I'm trying to calculate the min/max, that is, the range from the lowest to the highest value, of the 48-bit floating-point type defined in MIL-STD-1750A (PDF) (WIKI).

Ex: the way a double's range is usually quoted as 1.7E +/- 308.

I've looked around for equations, and am unsure if what I have found will work.

The first equation I found was: [image: first equation]

The second was: [image: second equation]

I'm not quite sure where to begin with these, or whether they are even correct for what I need.

Will someone impart their knowledge to me and help solve this?

Best Answer

For 32-bit floating point, the maximum value is shown in Table III:

0.9999998 x 2^127 represented in hex as: mantissa=7FFFFF, exponent=7F.

We can decompose the mantissa/exponent into a (close) decimal value as follows:

7FFFFF <base-16> = 8,388,607 <base-10>. 

There are 23 bits of significance, so we divide 8,388,607 by 2^23.

8,388,607 / 2^23 = 0.99999988079071044921875 (see Table III)

As for the exponent:

7F <base-16> = 127 <base-10>

Now we multiply the mantissa by 2^127 (the exponent):

8,388,607 / 2^23 * 2^127 = 
8,388,607 * 2^104 = 1.7014116317805962808001687976863 * 10^38

This is the largest 32-bit floating point value because it uses both the largest possible mantissa and the largest possible exponent.
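As a quick sanity check, the arithmetic above can be reproduced with exact integers and fractions. This is only a sketch; the variable names are my own and are not taken from the standard.

```python
from fractions import Fraction

MAGNITUDE_BITS = 23   # mantissa bits after the sign bit (illustrative name)

mantissa = 0x7FFFFF   # largest positive mantissa
exponent = 0x7F       # largest exponent

# The radix point sits in front of the mantissa, hence the division by 2^23.
fraction = Fraction(mantissa, 2**MAGNITUDE_BITS)
max_32 = fraction * 2**exponent

print(mantissa, exponent)            # 8388607 127
print(max_32 == mantissa * 2**104)   # True
print(f"{float(max_32):.7e}")        # on the order of 1.7e+38
```

Using Fraction rather than float keeps every step exact, so the comparison against 8,388,607 * 2^104 is an equality and not an approximation.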

The 48-bit floating point format adds 16 bits of lesser-significance mantissa but leaves the exponent the same size. Thus, the max value would be represented in hex as

mantissa=7FFFFFFFFF, exponent=7F.

Again, we can compute

7FFFFFFFFF <base-16> = 549,755,813,887 <base-10> 

The max exponent is still 127, but the mantissa now has 23 + 16 = 39 bits of significance, so we need to divide by 2^39. Since 127 - 39 = 88, we can simply multiply by 2^88:

549,755,813,887 * 2^88 =
1.7014118346015974672186595864716 * 10^38

This is the largest 48-bit floating point value because we used the largest possible mantissa and largest possible exponent.
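The same calculation can be written once for both widths. The helper below is a hypothetical sketch (the name max_positive and its parameters are mine), assuming a fraction-style mantissa with the radix point in front and an 8-bit two's complement exponent.

```python
def max_positive(magnitude_bits: int, exponent_bits: int = 8) -> int:
    """Largest positive value for a fraction-style mantissa (hypothetical helper)."""
    largest_mantissa = 2**magnitude_bits - 1        # all magnitude bits set
    largest_exponent = 2**(exponent_bits - 1) - 1   # 127 for 8 bits
    # Radix point in front of the mantissa: divide by 2^magnitude_bits,
    # which is the same as lowering the exponent by magnitude_bits.
    return largest_mantissa * 2**(largest_exponent - magnitude_bits)

max_32 = max_positive(23)
max_48 = max_positive(39)

print(max_48 == 0x7FFFFFFFFF * 2**88)   # True
print(max_48 > max_32)                  # True
print(2**127 - max_48 == 2**88)         # True: it falls short of 2^127 by 2^88
```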

So, the max values are:

1.7014116317805962808001687976863 * 10^38, for 32-bit, and,
1.7014118346015974672186595864716 * 10^38, for 48-bit

The max value for 48-bit is just slightly larger than for 32-bit, which stands to reason since the 16 extra bits are added to the end of the mantissa.

(To be exact the maximum number for the 48-bit format can be expressed as a binary number that consists of 39 1's followed by 88 0's.)
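The bit pattern can be checked directly, since Python integers are arbitrary precision. A minimal sketch:

```python
max_48 = (2**39 - 1) * 2**88

bits = bin(max_48)[2:]                   # strip the "0b" prefix
print(len(bits))                         # 127
print(bits == "1" * 39 + "0" * 88)       # True
```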

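(The most negative value is essentially the negative of this value. Strictly speaking, the mantissa is stored in two's complement, so the most negative mantissa, 800000, is exactly -1.0, and the lower limit is -1.0 x 2^127, marginally larger in magnitude than the positive maximum. The value closest to zero without being zero can also easily be computed as above: use the smallest possible positive mantissa, 000001, and the smallest possible exponent, 80 in hex, or -128 in decimal.)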
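For the value closest to zero, here is a sketch of that recipe: a mantissa with only its lowest bit set and the exponent byte 0x80 read as a two's complement integer. The helper name is mine, and this deliberately ignores whether the format requires normalized mantissas.

```python
from fractions import Fraction

def exponent_from_byte(raw: int) -> int:
    """Interpret an 8-bit field as a two's complement integer."""
    return raw - 0x100 if raw & 0x80 else raw

smallest_exponent = exponent_from_byte(0x80)
print(smallest_exponent)                 # -128

# Lowest mantissa bit set, radix point in front of the mantissa.
tiny_32 = Fraction(1, 2**23) * Fraction(2) ** smallest_exponent
tiny_48 = Fraction(1, 2**39) * Fraction(2) ** smallest_exponent

print(tiny_32 == Fraction(1, 2**151))    # True
print(tiny_48 == Fraction(1, 2**167))    # True
```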


FYI

Some floating point formats use an unrepresented hidden 1 bit in the mantissa. The first binary digit of every normalized number (that is, everything except 0 and denormals, see below) is a 1, so that 1 doesn't have to be stored, which frees up one extra bit of precision in the mantissa. This particular format doesn't seem to do this.

Other floating point formats allow a denormalized mantissa, which makes it possible to represent (positive) numbers smaller than the smallest exponent would otherwise allow, by trading bits of precision for additional (negative) powers of 2. This is easy to support if the format doesn't also use the hidden 1 bit, and a bit harder if it does.


8,388,607 / 2^23 is the value you'd get with mantissa=0x7FFFFF and exponent=0x00. It is not the single bit value but rather the value with a full mantissa and a neutral, or more specifically, a zero exponent.

The reason this value is not simply 8,388,607 and requires division by 2^23 (and hence is smaller than you might expect) is that the implied radix point is in front of the mantissa, rather than after it. So, think +/-.11111111111111111111111 (a sign bit, followed by a radix point, followed by twenty-three 1-bits) for the mantissa and +/-1111111 (no radix point here, just an integer, in this case 127) for the exponent.

mantissa = 0x7FFFFF with exponent = 0x7F is the largest value, which corresponds to 8,388,607 * 2^104, where the 104 comes from 127-23: again, we subtract 23 powers of two because the mantissa has the radix point at the beginning. If the radix point were at the end, then the largest value (0x7FFFFF,0x7F) would indeed be 8,388,607 * 2^127.
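To make the role of the radix point concrete, here is a small sketch that evaluates a positive mantissa/exponent pair. The function name and signature are hypothetical, and it handles only non-negative mantissas.

```python
from fractions import Fraction

def value(mantissa: int, exponent: int, magnitude_bits: int = 23) -> Fraction:
    """Value of a non-negative mantissa with the radix point in front of it."""
    return Fraction(mantissa, 2**magnitude_bits) * Fraction(2) ** exponent

neutral = value(0x7FFFFF, 0)      # full mantissa, zero exponent
largest = value(0x7FFFFF, 127)    # full mantissa, largest exponent

print(neutral == Fraction(8388607, 8388608))   # True, just under 1
print(largest == 8388607 * 2**104)             # True

# If the radix point were at the end instead, nothing would be divided out:
print(0x7FFFFF * 2**127 == largest * 2**23)    # True
```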

There are, among others, two possible ways to consider a single-bit value for the mantissa. One is mantissa=0x400000, and the other is mantissa=0x000001. Without considering the radix point or the exponent, the former is 4,194,304 and the latter is 1. With a zero exponent and considering the radix point, the former is 0.5 (decimal) and the latter is 0.00000011920928955078125. With a maximum (or minimum) exponent, we can compute max and min single-bit values.
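These single-bit values are easy to confirm. A minimal sketch, with variable names of my own choosing:

```python
from fractions import Fraction

top_bit = Fraction(0x400000, 2**23)   # highest magnitude bit only
low_bit = Fraction(0x000001, 2**23)   # lowest magnitude bit only

print(0x400000, 0x000001)             # 4194304 1
print(top_bit == Fraction(1, 2))      # True
print(f"{float(low_bit):.23f}")       # 0.00000011920928955078125

# Extremes of each with the largest (127) and smallest (-128) exponent.
print(top_bit * 2**127 == 2**126)                             # True
print(low_bit * Fraction(1, 2**128) == Fraction(1, 2**151))   # True
```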

(Note that the latter form, where the mantissa has leading zeros, would be considered denormalized in some number formats, and its normalized representation would be 0x400000 with an exponent of -22, since 0.5 * 2^-22 = 2^-23.)