Are the results of bitwise operations on signed integers defined

bitwise-operatorsc

I know that the behavior of >> on signed integer can be implementation dependent (specifically, when the left operand is negative).

What about the others: ~, >>, &, ^, |?
When their operands are signed integers of built-in type (short, int, long, long long), are the results guaranteed to be the same (in terms of bit content) as if their type is unsigned?

Best Answer

For negative operands, << has undefined behavior and the result of >> is implementation-defined (usually as "arithmetic" right shift). << and >> are conceptually not bitwise operators. They're arithmetic operators equivalent to multiplication or division by the appropriate power of two for the operands on which they're well-defined.

As for the genuine bitwise operators ^, ~, |, and &, they operate on the bit representation of the value in the (possibly promoted) type of the operand. Their results are well-defined for each possible choice of signed representation (twos complement, ones complement, or sign-magnitude) but in the latter two cases it's possible that the result will be a trap representation if the implementation treats the "negative zero" representation as a trap. Personally, I almost always use unsigned expressions with bitwise operators so that the result is 100% well-defined in terms of values rather than representations.

Finally, note that this answer as written may only apply to C. C and C++ are very different languages and while I don't know C++ well, I understand it may differ in some of these areas from C...

Related Solutions

Padding bits in unsigned integers and bitwise operations in C89

Bitwise operations (like arithmetic operations) operate on values and ignore padding. The implementation may or may not modify padding bits (or use them internally, eg as parity bits), but portable C code will never be able to detect this. Any value (including UINT_MAX) will not include the padding.

Where integer padding might lead to problems on is if you use things like sizeof (int) * CHAR_BIT and then try to use shifts to access all these bits. If you want to be portable, either only use (unsigned) char, fixed-sized integers (a C99 addition) or determine the number of value-bits programatically. This can be done at compile-time with the preprocessor by comparing UINT_MAX against powers of 2 or at runtime by using bit-operations.

edit:

C90 does not mention integer padding at all, but as far as I can tell, 'invisible' preceding or trailing integer padding bits shouldn't violate the standard (I didn't go through all relevant sections to make sure this is really the case, though); there probaby are problems with mixed padding and value bits as mentioned in the C99 rationale because otherwise, the standard would not have needed to be changed.

As to the meaning of user-accessible: Padding bits are accessible insofar as you can alwaye get at any bit of foo (including padding) by using bit-operations on ((unsigned char *)&foo)[…]. Be careful when modifying the padding bits, though: the result won't change the value of the integer, but might create be a trap-representation nevertheless. In case of C90, this is implicitly unspecified (as in not mentioned at all), in case of C99, it's implementation-defined.

This was not what the rationale quotation was about, though: the cited architecture represents 32-bit integers via two 16-bit integers. In case of unsigned types, the resulting integer has 32 value bits and a precision of 32; in case of signed integers, it only has 31 value bits and a precision of 30: one of the sign bits of the 16-bit integers is used as the sign bit of the 32-bit integer, the other one is ignored, thus creating a padding bit surrounded by value bits. Now, if you access a 32-bit signed integer as an unsigned integer (which is explicitly allowed and does not violate the C99 aliasing rules), the padding bit becomes a (user-accessible) value bit.

Is unsigned integer subtraction defined behavior

When you work with unsigned types, modular arithmetic (also known as "wrap around" behavior) is taking place. To understand this modular arithmetic, just have a look at these clocks:

enter image description here

9 + 4 = 1 (13 mod 12), so to the other direction it is: 1 - 4 = 9 (-3 mod 12). The same principle is applied while working with unsigned types. If the result type is unsigned, then modular arithmetic takes place.

Now look at the following operations storing the result as an unsigned int:

unsigned int five = 5, seven = 7;
unsigned int a = five - seven;      // a = (-2 % 2^32) = 4294967294 

int one = 1, six = 6;
unsigned int b = one - six;         // b = (-5 % 2^32) = 4294967291

When you want to make sure that the result is signed, then stored it into signed variable or cast it to signed. When you want to get the difference between numbers and make sure that the modular arithmetic will not be applied, then you should consider using abs() function defined in stdlib.h:

int c = five - seven;       // c = -2
int d = abs(five - seven);  // d =  2

Be very careful, especially while writing conditions, because:

if (abs(five - seven) < seven)  // = if (2 < 7)
    // ...

if (five - seven < -1)          // = if (-2 < -1)
    // ...

if (one - six < 1)              // = if (-5 < 1)
    // ...

if ((int)(five - seven) < 1)    // = if (-2 < 1)
    // ...

but

if (five - seven < 1)   // = if ((unsigned int)-2 < 1) = if (4294967294 < 1)
    // ...

if (one - six < five)   // = if ((unsigned int)-5 < 5) = if (4294967291 < 5)
    // ...

Best Answer

Related Solutions

Padding bits in unsigned integers and bitwise operations in C89

Is unsigned integer subtraction defined behavior

Related Topic