Java – In Java, why does Integer.MIN_VALUE == -Integer.MIN_VALUE

java, language-design, math

Quite simply, I was aghast when I discovered one of my unit tests was failing because -Integer.MIN_VALUE == Integer.MIN_VALUE!

Why would the language designers choose this, what is the mathematical advantage, and what are some alternatives I might consider in the future when dealing with edge cases?

Best Answer

The short answer is that this is how twos-complement negation works. It's not overflow, and it wouldn't be detectable without special circuitry in the processor (or equivalent checks in the language runtime).

How Twos-Complement Arithmetic Works

I'll start with the number line, in binary, limiting my wordsize to 3 bits:

011 =  3
010 =  2
001 =  1
000 =  0
111 = -1
110 = -2
101 = -3
100 = -4
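The same pattern holds at Java's 32-bit word size; here's a small sketch (the class and method names are my own) that reproduces the 3-bit table by masking Java's native twos-complement bits:

```java
public class NumberLine {
    // Render an int in the 3-bit twos-complement encoding used in the table above.
    static String encode3(int i) {
        String s = Integer.toBinaryString(i & 0b111); // keep only the low 3 bits
        return "000".substring(s.length()) + s;       // left-pad to 3 digits
    }

    public static void main(String[] args) {
        for (int i = 3; i >= -4; i--) {
            System.out.println(encode3(i) + " = " + i);
        }
    }
}
```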

Inside the computer, there is an adder circuit that combines two bits and produces a result plus a carry bit. These circuits are chained together so that the processor can add two entire words. The carry bit from this addition is exposed via a processor status register, but is not normally available to high-level languages.
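To make the chaining concrete, here's a sketch of a ripple-carry adder written with Java bitwise operators (my own illustrative code, not how any particular CPU is wired):

```java
public class RippleCarryAdder {
    // Add two n-bit words using chained full-adder logic: each stage
    // produces a sum bit and a carry bit that feeds the next stage.
    static int add(int a, int b, int bits) {
        int sum = 0, carry = 0;
        for (int i = 0; i < bits; i++) {
            int ai = (a >> i) & 1, bi = (b >> i) & 1;
            int s = ai ^ bi ^ carry;                  // sum bit of a full adder
            carry = (ai & bi) | (carry & (ai ^ bi));  // carry out to next stage
            sum |= s << i;
        }
        // The final carry-out is discarded here, just as the hardware
        // discards it from the high-level language's point of view.
        return sum & ((1 << bits) - 1);
    }

    public static void main(String[] args) {
        System.out.println(add(0b010, 0b111, 3)); // 2 + (-1): prints 1, carry discarded
    }
}
```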

Some examples:

 2  010      2  010     2   010
 1  001     -1  111     2   010
== ====     == ====    ==  ====
 3  011      1 C001    -4   100

Let's look at those examples individually:

  • 2 + 1 = 3, just like you'd expect. Both the addends and the sum are within the range of positive integers for our word size.
  • 2 - 1 = 1, again just like you'd expect. Internally this operation sets the carry bit, indicating that the addition overflowed the word size. If we were using unsigned numbers, this would be a problem, but with twos-complement numbers it's OK.
  • 2 + 2 = -4, which is definitely not what you'd expect. However, note that the carry bit remains unset. To detect overflow in this case, you'd have to check that two signed inputs resulted in an output with a different sign.
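You can see the same wraparound at Java's word size, along with the sign-comparison check described in the third bullet (addOverflows is my own helper name; the check itself is the standard sign-based one):

```java
public class Wraparound {
    // Signed overflow occurred iff both inputs have the same sign and the
    // sum has the opposite sign. This is exactly the check the carry bit
    // cannot give you for signed addition.
    static boolean addOverflows(int a, int b) {
        int sum = a + b;
        return ((a ^ sum) & (b ^ sum)) < 0;
    }

    public static void main(String[] args) {
        // The 32-bit analogue of "2 + 2 = -4" from the 3-bit table:
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true
        System.out.println(addOverflows(Integer.MAX_VALUE, 1));         // true
        System.out.println(addOverflows(2, -1));                        // false
    }
}
```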

So why isn't overflow checked? The simplest answer is cost.

At the level of the hardware, addition is unsigned (at least on the three processors that I've programmed). Detecting integer overflow would require a separate set of opcodes for signed math, which would mean more transistors, which could be more profitably used elsewhere. In the earlier days of computing that was a huge concern; today, perhaps less so, but either way almost everyone has accepted how integer math is implemented.

At the level of the language, cost is still a factor. There is the runtime cost of checking every signed operation for overflow, but there is also a programmer cost: imagine having to wrap every expression (even a for loop) in a try/catch. The .NET runtime apparently gives you the option of enabling this, while Java explicitly does not.
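Worth noting: while Java has no language-level checked mode, since Java 8 the standard library does offer opt-in checking per operation via Math.addExact and its siblings, which throw ArithmeticException instead of silently wrapping:

```java
public class CheckedMath {
    public static void main(String[] args) {
        // Math.addExact (Java 8+) checks a single addition for overflow:
        try {
            Math.addExact(Integer.MAX_VALUE, 1); // throws rather than wrapping
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
        System.out.println(Math.addExact(1, 2)); // prints 3: normal cases pass through
    }
}
```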

How Twos-Complement Negation Works, and Why -MIN_VALUE Equals Itself

In prose: twos-complement negation flips all of the bits in a number and then adds one.

I use a prose definition because that's almost certainly how it actually works in the hardware (although I'm not a hardware engineer, so I can't say for sure, and different architectures might use different techniques).

Let's see what happens with our 3-bit words:

100 = MIN_VALUE
011 = all bits flipped
100 = after adding 1

Note that there's no carry involved, although you could compare the signs of the input and the result. Again, though, that would require special circuitry and/or runtime-level checks, to catch a case that will almost never happen.
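You can reproduce both the flip-and-add-one definition and the MIN_VALUE fixed point directly in Java (negate is my own helper; Math.negateExact is a real Java 8+ method that catches exactly this case):

```java
public class Negation {
    // Twos-complement negation in prose: flip all the bits, then add one.
    static int negate(int x) {
        return ~x + 1;
    }

    public static void main(String[] args) {
        System.out.println(negate(3) == -3);                                 // true
        System.out.println(negate(Integer.MIN_VALUE) == Integer.MIN_VALUE);  // true
        System.out.println(-Integer.MIN_VALUE == Integer.MIN_VALUE);         // true: the unit-test surprise
        // Math.negateExact throws for exactly this one input value:
        try {
            Math.negateExact(Integer.MIN_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("negation overflow detected");
        }
    }
}
```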

What Are Some Alternatives, and Why Aren't They Used

One alternative is ones-complement, in which negation is simply inverting all bits. The ones-complement number line for a 3-bit word looks like this:

011 =  3
010 =  2
001 =  1
000 =  0
111 = -0
110 = -1
101 = -2
100 = -3

According to the linked Wikipedia article, there were machines using ones-complement arithmetic; I never used one. Again, I'm not a hardware engineer, but I believe that you need separate operations for addition and subtraction with ones-complement (in addition to separate operations for unsigned math), which again runs into the problem of cost.

The Wikipedia article mentions the problem of "end-around borrow," which may have been an issue with the actual computers that used ones-complement math, but I don't think is a necessary problem. I believe that the carry bit could also serve as a borrow bit.

The bigger problem is that you have two values for zero. That is going to cause programmers to create a lot of off-by-one errors when counting, or is going to require a lot of special-case code in the language runtime (e.g., a for loop that knows that when it crosses zero it has to skip from 1 to -1).
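You can emulate the two-zeros problem by doing ones-complement negation on a masked 3-bit word; this is a sketch with my own names, since Java itself is twos-complement:

```java
public class OnesComplement {
    static final int MASK = 0b111; // 3-bit word, matching the tables above

    // Ones-complement negation is simply inverting every bit of the word.
    static int negate(int bits) {
        return bits ^ MASK;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(negate(0b011))); // 100, i.e. -3
        // Negating zero yields the second zero, "-0":
        System.out.println(Integer.toBinaryString(negate(0b000))); // 111
    }
}
```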

Another alternative is to use the high-order bit just as a sign bit, with the low-order bits being the same between positive and negative:

011 =  3
010 =  2
001 =  1
000 =  0
100 = -0
101 = -1
110 = -2
111 = -3

This is how IEEE 754 floating point handles the sign. It makes sense when your primary operations are assumed to be multiplication and division, not so much when they're addition and subtraction. And it still has the issue of two zeros.
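Java's float exposes this directly: 0.0f and -0.0f compare equal, but their bit patterns differ by exactly the sign bit:

```java
public class SignedZero {
    public static void main(String[] args) {
        // IEEE 754 uses a dedicated sign bit, so there really are two zeros:
        System.out.println(Integer.toHexString(Float.floatToIntBits(0.0f)));  // 0
        System.out.println(Integer.toHexString(Float.floatToIntBits(-0.0f))); // 80000000
        // The == operator treats them as equal even so:
        System.out.println(0.0f == -0.0f); // true
    }
}
```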

Commentary

To me, this question is identical to questions that express outrage over the fact that 0.10 cannot be exactly represented as a binary floating-point number: both indicate a belief that digital computers should be able to exactly represent the real world. Or, in other words, that computers operate according to the laws of mathematics.

I can understand this belief; what I can't understand is the outrage that people express when the belief is shown to be false. A few moments' reflection should make it apparent that the belief cannot be true: computers work with finite quantities, whereas mathematics deals with continuous relations (I was about to say that everything in the real world is continuous, but figured that someone would bring up quantum mechanics).

Faced with this fundamental truth, computer designers -- and language designers, and application programmers -- have to make trade-offs. You might not like a particular trade-off, but you should seek to understand it rather than simply complain about it. And once you understand the trade-off, you can look for an environment that made a different one.