C Programming – Understanding Bit-Level Endianness

Tags: c, bit, byte, endianness

I am learning about unions and structs and wrote the code below. What I do not understand is why the output differs when I move from a little-endian to a big-endian machine.

My understanding is that endianness only matters when you have more than one byte, but here I have only 8 bits.

This code is not portable at the moment, and I want to understand how to make it portable. I would like to avoid detecting the endianness or using bit-shifting techniques. I know about the htonX functions, but I do not think they apply here.

    #include <stdio.h>

    int main(void)
    {
      typedef union {
        struct {
          unsigned char b0:1;
          unsigned char b1:1;
          unsigned char b2:1;
          unsigned char b3:1;
          unsigned char b4:1;
          unsigned char b5:1;
          unsigned char b6:1;
          unsigned char b7:1;
        } bits;
        unsigned char byte;
      } HW_Register;

      HW_Register reg;
      reg.byte = 3;

      printf("%d %d %d %d %d %d %d %d\n", 
        reg.bits.b0,
        reg.bits.b1,
        reg.bits.b2,
        reg.bits.b3,
        reg.bits.b4,
        reg.bits.b5,
        reg.bits.b6,
        reg.bits.b7);

      printf("%d\n", reg.byte);
      /* sizeof yields a size_t, so use %zu rather than %d */
      printf("Size of reg.byte: %zu\n", sizeof(reg.byte));
      return 0;
    }

Best Answer

The order in which bit fields are placed within an integer is independent of the order in which bytes are placed within an integer. Both are implementation details. That is generally not a problem, because memory is only byte-addressable, and hardware preserves the value of each byte when data is transferred.

Yet, while bit and byte ordering are theoretically independent of each other, CPU documentation tends to number the bits in the same order as it numbers the bytes: x86 documentation refers to the least significant bit as bit 0, while Power documentation refers to the most significant bit as bit 0. Your compiler writers seem to have chosen the same numbering as the respective documentation.

As such, your only hope of achieving portability is to do the bit fiddling yourself: define a flags variable and a set of constants to set/mask the respective bits:

const unsigned char kBit0 = 1 << 0,
                    kBit1 = 1 << 1,
                    kBit2 = 1 << 2,
                    kBit3 = 1 << 3,
                    kBit4 = 1 << 4,
                    kBit5 = 1 << 5,
                    kBit6 = 1 << 6,
                    kBit7 = 1 << 7;

unsigned char reg = 3;
printf("%d %d %d %d %d %d %d %d\n",
    !!(reg & kBit0),
    !!(reg & kBit1),
    !!(reg & kBit2),
    !!(reg & kBit3),
    !!(reg & kBit4),
    !!(reg & kBit5),
    !!(reg & kBit6),
    !!(reg & kBit7));

This numbers the bits in little endian fashion, irrespective of the machine. If you want big endian numbering, just define your constants accordingly.