Algorithms – Finding 4 Missing Numbers in a 32-bit File

algorithms

This is an interview question that I've run across a few times, and I'm really not sure how to solve it given that four numbers are missing. I'm familiar with algorithms for finding one or two numbers are missing, but I don't see a way to generalize either of them to four.

Best Answer

Whether it's for an interview or actual work, your first priority needs to be a working solution that makes sense to you. That usually means you should offer the first solution you can think of that is simple and easy for you to explain.

For me, that means sort the numbers and scan for gaps. But, I work on business systems and web apps. I don't fiddle with bits, and I don't want my team to!

If you're interviewing for a low-level, closer-to-the-metal job, "sorting" will probably be met with blank stares. They want you to be comfortable thinkings about bits and so forth. Your first answer there should be, "Oh, I'd use a Bitmap." (Or bit array, or bit set.)

And then, either way -- even if you give "wrong" solution, if your interviewer (or boss!) presses for it, you can suggest some improvements or alternatives, focusing on the manager's specific area of concern.

  • Severely limited RAM? Less than 512MB?
    Sort it in place, on disk. You can use a mostly-arbitrary amount of RAM to optimize and/or buffer sorted blocks.
  • Limited time?
    Use that RAM! Sorting is already O(n*log(n)). (Or O(n) for a integer-bucket sort!)
  • Maintainability?
    What could be easier than sorting?!
  • Doesn't demonstrate knowledge of bit flags/fields? (BitSet/BitMap/BitArray)
    Well OK ... go ahead and use a BitArray to flag the "found numbers." And then scan for 0's.
  • Predictable "real-time" complexity?
    Use the bitmap solution. It's a single pass over the file and another pass over the BitArray/BitSet (to find the 0's). That's O(n), I think!

Or whatever.

Address the concerns you actually have. Just solve the problem first, using naive solutions if necessary. Don't waste everybody's time addressing concerns that don't exist yet.