Complexity – Understanding P vs NP vs NP Complete vs NP Hard

big ocomplexitydefinitionnp-complete

I am trying to understand these classifications and why they exist. Is my understanding right? If not, what?

  1. P is polynomial complexity, or O(nk) for some non-negative real number k, such as O(1), O(n1/2), O(n2), O(n3), etc. If a problem belongs to P, then there exists at least one algorithm that can solve it from scratch in polynomial time. For example I can always figure out if some integer n is prime by looping over 2 <= k <= sqrt(n) and checking at each step if k divides n.

  2. NP is non-deterministic polynomial complexity. I don't really know what it means for it to be non-deterministic. I think it means it is easy to verify in polynomial time, but may or may not be polynomial time to solve from scratch if we didn't already know the answer. Since it may be solvable in polynomial time, all P problems are also NP problems. Integer factorization gets quoted as an example of NP, but I don't understand why it's not P, personally, since trial factorization takes O(sqrt(n)) time.

  3. NP-Complete I don't understand at all, but the Traveling Salesman Problem is quoted as an example of this. But in my opinion the TSP problem might just be NP, because it takes something like O(2n n2) time to solve, but O(n) to verify if you are given the path up front.

  4. NP-Hard I assume is just full of unknowns. Hard to verify, hard to solve.

Best Answer

You're basically correct about P and NP, but not about NP-hard and NP-complete.

For starters, here are the super-concise definitions of the four complexity classes in question:

  • P is the class of decision problems which can be solved in polynomial time by a deterministic Turing machine.

  • NP is the class of decision problems which can be solved in polynomial time by a non-deterministic Turing machine. Equivalently, it is the class of problems which can be verified in polynomial time by a deterministic Turing machine.

  • NP-hard is the class of decision problems to which all problems in NP can be reduced to in polynomial time by a deterministic Turing machine.

  • NP-complete is the intersection of NP-hard and NP. Equivalently, NP-complete is the class of decision problems in NP to which all other problems in NP can be reduced to in polynomial time by a deterministic Turing machine.

And here's a Euler diagram from Wikipedia showing the relationships between these four classes (assuming that P is not equal to NP):

enter image description here

The part that I assume you're most unfamiliar with or confused by is the notion of a "polynomial time reduction" from problem X to problem Y. A reduction from X to Y is simply an algorithm A which solves X by making use of some other algorithm B which solves problem Y. This reduction is called a "polynomial time reduction" if all parts of A other than B have a polynomial time complexity. As a trivial example, the problem of finding the smallest element in an array is constant-time reducible to the sorting problem, since you can sort the array and then return the first element of the sorted array.

One thing that's easy to miss about the NP-hard definition is that the reduction goes from NP problems to the NP-hard problem, but not necessarily vice versa. This means that NP-hard problems might be in NP, or in a much higher complexity class (as you can see from the Euler diagram), or they might not even be decidable problems. That's why people often say something like "NP-hard means at least as hard as NP" when trying to explain this stuff informally.

The halting problem is a good example of an NP-hard problem that's clearly not in NP, as Wikipedia explains:

It is easy to prove that the halting problem is NP-hard but not NP-complete. For example, the Boolean satisfiability problem can be reduced to the halting problem by transforming it to the description of a Turing machine that tries all truth value assignments and when it finds one that satisfies the formula it halts and otherwise it goes into an infinite loop. It is also easy to see that the halting problem is not in NP since all problems in NP are decidable in a finite number of operations, while the halting problem, in general, is undecidable.

Related Topic