This is an interview question that I've run across a few times, and I'm really not sure how to solve it given that four numbers are missing. I'm familiar with algorithms for finding one or two numbers are missing, but I don't see a way to generalize either of them to four.
Algorithms – Finding 4 Missing Numbers in a 32-bit File
algorithms
Related Solutions
A closed-form expression (called Binet's formula) for terms of the Fibonacci sequence is well-known: Fₙ = (φⁿ-ψⁿ)/√5, where φ=(1+√5)/2 and ψ=(1-√5)/2. You might be able to prove that formula via the Master theorem for recurrences. In any case, either by inspecting Binet's formula or by numerically calculating ratios of consecutive terms of the Fibonacci sequence, you should quickly observe that Fₙ is on the order of φ times larger than Fₙ₋₁. Since φ is about 1.618, this implies that Fₙ grows rapidly enough with n that it soon exceeds 4 million.
Here is an example of some python code that computes Fibonacci values less than a limit.
def smallFib(fmax):
a, b = 0, 1
while b<fmax:
print b
a, b = b, a+b
Run the code (that is, enter the code into a python interpreter, and then say smallFib(4000000)
) and you will see that in fact only about 33 terms of the Fibonacci sequence are less than 4 million. Thus, it makes sense to just do the simple thing, and add up the 11 of those terms that are even. (To test if a number is even and add it to a sum if so, you can say if not b&1: sum += b
.)
Note that Binet's formula does not give any apriori information about whether a term will be even or odd. As it happens, every third Fₙ is even, so you could just compute F₃, F₆, F₉..., but to know that every third Fₙ is even you need to either inspect the numbers or prove something about Fibonacci numbers. In many cases, using a slightly-complicated closed formula is not likely to save time compared to computing values via an extremely simple recurrence.
Summary: Don't use a complicated approach when a simple approach will do. Learn an interpreted language like python so that you can run simple tests to find out if a simple approach is good enough.
Disclaimer: This answer does not answer the question that completely. But it's too long for a comment.
NP-hard? I believe, this problem could be NP-hard.
Consider a special case of the Knapsack problem:
Given a set of positive integers and a positive integer b, does there exist a subset of the set such that the sum of all integers in that subset equals b?
This sounds somewhat similar to our Countdown problem, and it seems to be much simpler. However, Knapsack (and this special case of Knapsack) is NP-hard (and NP-complete, of course).
I did not manage to use this for a proof that Countdown is NP-hard. I could not get rid of the division. Consider we have a thousand 2, and b = 7. This will never be solvable with Knapsack, but always (?) with Countdown, at least in all the ways I tried to transfer the problem.
Now, if Countdown really was NP-hard, we could deduce that there with very high probability is no algorithm that is significantly more efficient than brute-force trying all possibilities. (And if we should find such an algorithm, we will become very famous.)
No, I don't think there must be an efficient algorithm.
Heuristics. The Youtube video linked in the question has a nice example: The contestant found an exact answer 952 = ((100 + 6) * 3 * 75 - 50) / 25. That is completely against my intuition, I would never have tried it that way in the first time: Produce a very large number, then divide it and yield the result.
On the other, we humans feel that we do not need to try (arbitrary example) 50 * 75 * 100 / 2 / 3 / 7 to reach a three-digit-number. But computers do not feel anything, they simple calculate.
After all, if we implemented some heuristics, and this heuristics does not find an exact solution, we still will have to try all other solutions to make sure there really is none.
What the contestant in the Youtube video does is, I think, to very quickly check a large number of possibilities and to quickly discard those that will not (or likely will not) give a solution.
Conclusion. When implementing an algorithm, one could take care to strip equal calculations such as a / b / c = a / (b * c), but I think this is quite hard to do and I do not know whether this improves the runtime significantly.
Computers of course are faster than humans in checking a large number of possibilities. And nowadays, even smartphones are so fast they can solve this problem, I think, within a second by simply trying all possibilities. (I did not test this.) There are only six numbers, it would be different if there were e.g. 60 of them.
Best Answer
Whether it's for an interview or actual work, your first priority needs to be a working solution that makes sense to you. That usually means you should offer the first solution you can think of that is simple and easy for you to explain.
For me, that means sort the numbers and scan for gaps. But, I work on business systems and web apps. I don't fiddle with bits, and I don't want my team to!
If you're interviewing for a low-level, closer-to-the-metal job, "sorting" will probably be met with blank stares. They want you to be comfortable thinkings about bits and so forth. Your first answer there should be, "Oh, I'd use a Bitmap." (Or bit array, or bit set.)
And then, either way -- even if you give "wrong" solution, if your interviewer (or boss!) presses for it, you can suggest some improvements or alternatives, focusing on the manager's specific area of concern.
Sort it in place, on disk. You can use a mostly-arbitrary amount of RAM to optimize and/or buffer sorted blocks.
Use that RAM! Sorting is already
O(n*log(n))
. (Or O(n) for a integer-bucket sort!)What could be easier than sorting?!
BitSet
/BitMap
/BitArray
)Well OK ... go ahead and use a
BitArray
to flag the "found numbers." And then scan for0
's.Use the bitmap solution. It's a single pass over the file and another pass over the
BitArray
/BitSet
(to find the0
's). That'sO(n)
, I think!Or whatever.
Address the concerns you actually have. Just solve the problem first, using naive solutions if necessary. Don't waste everybody's time addressing concerns that don't exist yet.