C++ – Missing number(s) Interview Question Redux

ccomplexity-theorycomputer sciencemath

The common interview problem of determining the missing value in a range from 1 to N has been done a thousand times over. Variations include 2 missing values up to K missing values.

Example problem: Range [1,10] (1 2 4 5 7 8 9 10) = {3,6}

Here is an example of the various solutions:

Easy interview question got harder: given numbers 1..100, find the missing number(s)

My question is that seeing as the simple case of one missing value is of O(n) complexity and that the complexity of the larger cases converge at roughly something larger than O(nlogn):

Couldn't it just be easier to answer the question by saying sort (mergesort) the range and iterate over it observing the missing elements?

This solution should take no more than O(nlogn) and is capable of solving the problem for ranges other than 1-to-N such as 10-to-1000 or -100 to +100 etc…

Is there any reason to believe that the given solutions in the above SO link will be better than the sorting based solution for larger number of missing values?

Note: It seems a lot of the common solutions to this problem, assume an only number theoretic approach. If one is being asked such a question in an S/E interview wouldn't it be prudent to use a more computer science/algorithmic approach, assuming the approach is on par with the number theoretic solution's complexity…

Best Answer

You are only specifying the time complexity, but the space complexity is also important to consider.

The problem complexity can be specified in term of N (the length of the range) and K (the number of missing elements).

In the question you link, the solution of using equations is O(K) in space (or perhaps a bit more ?), as you need one equation per unknown value.

There is also the preservation point: may you alter the list of known elements ? In a number of cases this is undesirable, in which case any solution involving reordering the elements, or consuming them, must first make a copy, O(N-K) in space.

I cannot see faster than a linear solution: you need to read all known elements (N-K) and output all unknown elements (K). Therefore you cannot get better than O(N) in time.

Let us break down the solutions

Destroying, O(N) space, O(N log N) time: in-place sort
Preserving, O(K) space ?, O(N log N) time: equation system
Preserving, O(N) space, O(N) time: counting sort

Personally, though I find the equation system solution clever, I would probably use either of the sorting solutions. Let's face it: they are much simpler to code, especially the counting sort one!

And as far as time goes, in a real execution, I think the "counting sort" would beat all other solutions hands down.

Note: the counting sort does not require the range to be [0, X), any range will do, as any finite range can be transposed to the [0, X) form by a simple translation.

EDIT:

Changed the sort to O(N), one needs to have all the elements available to sort them.

Having had some time to think about the problem, I also have another solution to propose. As noted, when N grows (dramatically) the space required might explode. However, if K is small, then we could change our representation of the list, using intervals:

{4, 5, 3, 1, 7}

can be represented as

[1,1] U [3,5] U [7,7]

In the average case, maintaining a sorted list of intervals is much less costly than maintaining a sorted list of elements, and it's as easy to deduce the missing numbers too.

The time complexity is easy: O(N log N), after all it's basically an insertion sort.

Of course what's really interesting is that there is no need to actually store the list, thus you can feed it with a stream to the algorithm.

On the other hand, I have quite a hard time figuring out the average space complexity. The "final" space occupied is O(K) (at most K+1 intervals), but during the construction there will be much more missing intervals as we introduce the elements in no particular order.

The worst case is easy enough: N/2 intervals (think odd vs even numbers). I cannot however figure out the average case though. My gut feeling is telling me it should be better than O(N), but I am not that trusting.

Related Solutions

Javascript – JavaScript’s highest integer value that a number can go to without losing precision

JavaScript has two number types: Number and BigInt.

The most frequently-used number type, Number, is a 64-bit floating point IEEE 754 number.

The largest exact integral value of this type is Number.MAX_SAFE_INTEGER, which is:

2⁵³-1, or
+/- 9,007,199,254,740,991, or
nine quadrillion seven trillion one hundred ninety-nine billion two hundred fifty-four million seven hundred forty thousand nine hundred ninety-one

To put this in perspective: one quadrillion bytes is a petabyte (or one thousand terabytes).

"Safe" in this context refers to the ability to represent integers exactly and to correctly compare them.

From the spec:

Note that all the positive and negative integers whose magnitude is no greater than 2⁵³ are representable in the Number type (indeed, the integer 0 has two representations, +0 and -0).

To safely use integers larger than this, you need to use BigInt, which has no upper bound.

Note that the bitwise operators and shift operators operate on 32-bit integers, so in that case, the max safe integer is 2³¹-1, or 2,147,483,647.

const log = console.log
var x = 9007199254740992
var y = -x
log(x == x + 1) // true !
log(y == y - 1) // also true !

// Arithmetic operators work, but bitwise/shifts only operate on int32:
log(x / 2)      // 4503599627370496
log(x >> 1)     // 0
log(x | 1)      // 1

Technical note on the subject of the number 9,007,199,254,740,992: There is an exact IEEE-754 representation of this value, and you can assign and read this value from a variable, so for very carefully chosen applications in the domain of integers less than or equal to this value, you could treat this as a maximum value.

In the general case, you must treat this IEEE-754 value as inexact, because it is ambiguous whether it is encoding the logical value 9,007,199,254,740,992 or 9,007,199,254,740,993.

C# – How to check if a number is a power of 2

There's a simple trick for this problem:

bool IsPowerOfTwo(ulong x)
{
    return (x & (x - 1)) == 0;
}

Note, this function will report true for 0, which is not a power of 2. If you want to exclude that, here's how:

bool IsPowerOfTwo(ulong x)
{
    return (x != 0) && ((x & (x - 1)) == 0);
}

Explanation

First and foremost the bitwise binary & operator from MSDN definition:

Binary & operators are predefined for the integral types and bool. For integral types, & computes the logical bitwise AND of its operands. For bool operands, & computes the logical AND of its operands; that is, the result is true if and only if both its operands are true.

Now let's take a look at how this all plays out:

The function returns boolean (true / false) and accepts one incoming parameter of type unsigned long (x, in this case). Let us for the sake of simplicity assume that someone has passed the value 4 and called the function like so:

bool b = IsPowerOfTwo(4)

Now we replace each occurrence of x with 4:

return (4 != 0) && ((4 & (4-1)) == 0);

Well we already know that 4 != 0 evals to true, so far so good. But what about:

((4 & (4-1)) == 0)

This translates to this of course:

((4 & 3) == 0)

But what exactly is 4&3?

The binary representation of 4 is 100 and the binary representation of 3 is 011 (remember the & takes the binary representation of these numbers). So we have:

100 = 4
011 = 3

Imagine these values being stacked up much like elementary addition. The & operator says that if both values are equal to 1 then the result is 1, otherwise it is 0. So 1 & 1 = 1, 1 & 0 = 0, 0 & 0 = 0, and 0 & 1 = 0. So we do the math:

100
011
----
000

The result is simply 0. So we go back and look at what our return statement now translates to:

return (4 != 0) && ((4 & 3) == 0);

Which translates now to:

return true && (0 == 0);

return true && true;

We all know that true && true is simply true, and this shows that for our example, 4 is a power of 2.

Best Answer

Related Solutions

Javascript – JavaScript’s highest integer value that a number can go to without losing precision

C# – How to check if a number is a power of 2

Explanation

Related Topic