Electrical – The decimal value 0.5 in IEEE single precision floating point representation

floating point

If I hadn't studied floating point from here, my answer would be fraction bits of 000…000 and an exponent value of −1, as shown below: [image]. But the link I attached above mentions denormalized values, in which case my answer to this question would be fraction bits of 100…000 and an exponent value of 0. I know I'm mixing something up, and the second answer is probably wrong. Can somebody please clarify why it can't be a denormalized value?

Best Answer

Call the number you want to encode \$v\$, and let \$m=23\$ be the number of mantissa bits stored in this format. Assuming that \$v\ne 0\$, apply the following logic:

  1. Set up a new variable \$p=0\$.
  2. If \$v\$ is positive, set \$s=0\$; otherwise set \$v=|v|\$ and \$s=1\$.
  3. While \$v\lt 2^m\$, set \$p=p-1\$ and \$v=2\cdot v\$.
  4. While \$v\ge 2^{m+1}\$, set \$p=p+1\$ and \$v=\frac{v}{2}\$.

At this point, you have a value \$v : 2^m\le v\lt 2^{m+1}\$. The magnitude of your original number is now represented as \$v\cdot 2^p\$. But rounding hasn't yet occurred.
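The normalization loop can be sketched in Python as follows. This is only an illustration of the steps listed above: the function name `normalize` is hypothetical, and `fractions.Fraction` is used (my choice, not part of the original procedure) so that the scaling is exact and no rounding sneaks in early.

```python
from fractions import Fraction

M = 23  # stored mantissa bits in IEEE single precision


def normalize(value, m=M):
    """Return (s, v, p) with 2**m <= v < 2**(m+1) and |value| == v * 2**p.

    Hypothetical helper that mirrors the four steps; it assumes value != 0.
    """
    v = Fraction(value)           # exact rational arithmetic
    if v == 0:
        raise ValueError("the procedure assumes a nonzero value")
    p = 0                         # step 1
    s = 0 if v > 0 else 1         # step 2: record the sign ...
    v = abs(v)                    # ... and keep only the magnitude
    while v < 2**m:               # step 3: scale up
        p -= 1
        v *= 2
    while v >= 2**(m + 1):        # step 4: scale down
        p += 1
        v /= 2
    return s, v, p


s, v, p = normalize(0.5)
print(s, v, p)                    # 0 8388608 -24
```

For 0.5 the loop doubles the value 24 times, ending at \$v=2^{23}\$ with \$p=-24\$, so \$v\cdot 2^p = 0.5\$ as required.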

To round the value so that it fits within the IEEE 32-bit format, do the following step:

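  1. If \$\left(v-\lfloor v\rfloor\right)\ge \frac{1}{2}\$, set \$v=\lfloor v\rfloor + 1\$; otherwise set \$v=\lfloor v\rfloor\$. (This rounds halves upward; IEEE 754's default mode rounds exact ties to the even neighbor instead.) If rounding leaves \$v=2^{m+1}\$, set \$v=\frac{v}{2}\$ and \$p=p+1\$ to restore \$2^m\le v\lt 2^{m+1}\$.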

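You now have the sign field, given by \$\left(s\right)\$; the mantissa field, given by \$\left(v-2^m\right)\$ (hidden bit notation); and the exponent field, given by \$\left(m+p+127\right)\$. For 0.5 this works out to \$s=0\$, \$v=2^{23}\$ and \$p=-24\$, so the mantissa field is 0 and the exponent field is \$23-24+127=126\$, which corresponds to an actual exponent of −1.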
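As a rough check, the whole procedure for normal numbers can be put together and compared against the bit pattern Python's `struct` module produces. The function name `encode_normal` is hypothetical, and the sketch assumes the result lands in the normal range; it uses the round-half-up rule from the rounding step, which can differ from hardware on exact ties.

```python
import struct
from fractions import Fraction
from math import floor

M, BIAS = 23, 127


def encode_normal(value):
    """Pack a nonzero value into 32 bits, assuming the result is a normal number."""
    v = Fraction(value)
    s = 0 if v > 0 else 1
    v, p = abs(v), 0
    while v < 2**M:                       # normalize upward
        v, p = v * 2, p - 1
    while v >= 2**(M + 1):                # normalize downward
        v, p = v / 2, p + 1
    # Round half up, as in the rounding step (not IEEE's ties-to-even).
    v = floor(v) + 1 if v - floor(v) >= Fraction(1, 2) else floor(v)
    if v == 2**(M + 1):                   # rounding carried out of the top bit
        v, p = v // 2, p + 1
    exponent = M + p + BIAS               # biased exponent field
    assert 0 < exponent < 255, "outside the normal range"
    mantissa = v - 2**M                   # drop the hidden bit
    return (s << 31) | (exponent << M) | mantissa


reference = struct.unpack(">I", struct.pack(">f", 0.5))[0]
print(hex(encode_normal(0.5)), hex(reference))   # 0x3f000000 0x3f000000
```

The pattern `0x3f000000` is sign 0, exponent field 126 and fraction bits 000…000, which matches the first answer in the question.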

The final step is to check whether the exponent is representable. If \$0\lt \left(m+p+127\right)\lt 255\$ (for single precision), you are done. If \$\left(m+p+127\right)\le 0\$, you may be able to fall back to a denormal encoding, which preserves some, but not all, of the higher-order mantissa bits. (A denormal always has an exponent field of zero.) If \$\left(m+p+127\right)\ge 255\$, the number cannot be represented in the format and you must select one of the special values instead. These all set the exponent field to 255 and encode their meaning in the mantissa field: a zero mantissa means infinity, and a nonzero mantissa means NaN.
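The range check can be sketched as below. The function name `classify` and the category labels are hypothetical. The sketch uses `math.frexp` to get the binary exponent of a Python float (a double), and it looks at the exponent before rounding, so a value a hair below a power of two could still round up into the next category.

```python
import math

BIAS = 127


def classify(value):
    """Classify a nonzero finite value by its biased single-precision exponent."""
    _, e = math.frexp(abs(value))    # abs(value) == f * 2**e with 0.5 <= f < 1
    biased = (e - 1) + BIAS          # same quantity as m + p + 127, before rounding
    if 0 < biased < 255:
        return "normal", biased
    if biased <= 0:
        return "denormal candidate", biased
    return "overflow", biased


print(classify(0.5))      # ('normal', 126)
print(classify(1e-40))    # ('denormal candidate', -6)
print(classify(1e39))     # ('overflow', 256)
```

With an exponent field of 126, the value 0.5 sits comfortably in the normal range, so the denormal fallback never comes into play for it.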


One thing that many fail to understand is the concept of the hidden bit notation. It's not complicated, but it takes a moment to consider.

Since the mantissa is normalized before packing, its upper-most bit is always a 1 (unless the value was 0, of course), so storing it would waste space. As a result, the upper-most bit is removed (hidden) and only the remaining bits are packed into the mantissa field. (It is restored when the floating point format is unpacked.) You can see where I hide it in the above discussion, where I wrote \$\left(v-2^m\right)\$: the \$-2^m\$ term removes the hidden/upper-most bit of the mantissa.
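To see the hidden bit in practice, the following sketch unpacks a single-precision value into its three fields and then restores the leading 1 by hand. The helper names `unpack_fields` and `restore` are hypothetical, and `restore` assumes a normal number (exponent field strictly between 0 and 255).

```python
import struct


def unpack_fields(x):
    """Split the single-precision encoding of x into (sign, exponent, mantissa)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return sign, exponent, mantissa


def restore(x):
    """Rebuild the value of a normal number from its fields."""
    sign, exponent, mantissa = unpack_fields(x)
    significand = (1 << 23) | mantissa           # put the hidden bit back
    magnitude = significand * 2.0 ** (exponent - 127 - 23)
    return -magnitude if sign else magnitude


print(unpack_fields(0.5))   # (0, 126, 0)
print(restore(0.5))         # 0.5
```

The stored mantissa of 0.5 is all zeros; the only set bit of its significand is the hidden one.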

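In the case of denormals, there is no hidden bit. The exponent field (excess-127 format) is always zero, so it carries no information about where a hidden bit would go; the leading 1, wherever it falls, is stored explicitly in the mantissa field. The scale is fixed at \$2^{-126}\$, so a denormal's magnitude is the mantissa field times \$2^{-149}\$. That is also why 0.5 cannot be a denormal: its exponent field of 126 is well inside the normal range, whereas an exponent field of 0 with fraction bits 100…000 would encode \$2^{-127}\$.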
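A small decoder makes the difference between the two candidate answers concrete. The function name `decode` is hypothetical. It treats an exponent field of 0 as a denormal with no hidden bit and a fixed scale of \$2^{-126}\$, and it does not attempt to handle the special values with exponent field 255.

```python
import struct


def decode(bits):
    """Decode a 32-bit single-precision pattern (normals, denormals and zero only)."""
    sign = -1.0 if bits >> 31 else 1.0
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 255:
        raise ValueError("infinity/NaN are not handled in this sketch")
    if exponent == 0:
        # Denormal (or zero): no hidden bit, scale fixed at 2**-126.
        return sign * mantissa * 2.0 ** (-126 - 23)
    # Normal: restore the hidden bit before scaling.
    return sign * ((1 << 23) | mantissa) * 2.0 ** (exponent - 127 - 23)


print(decode(0x3F000000))                   # 0.5
print(decode(0x00400000) == 2.0 ** -127)    # True
```

The first pattern (exponent field 126, fraction 000…000) decodes to 0.5, while the second (exponent field 0, fraction 100…000) decodes to \$2^{-127}\$, a value far below 0.5.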