I have a set of integers, for example 1..20 and I want to get all possible combinations of that set drawing 5 numbers.
So each combination has 5 numbers from that set of integers. But I don't want the numbers in the combination to be duplicates and I want the unordered combination to be unique.
I guess it's not easy to understand, my english is not the best and I don't know all the mathematical terms. So here's an illustration.
My set is 1..20
and obviously has a length of 20
.
I want each possible combination to be like [1, 2, 3, 4, 5]
, [1, 2, 3, 4, 6]
, … , [1, 4, 8, 14, 20]
, always a length of 5.
And this should be prohibited: [1, 2, 3, 4, 5]
, [3, 2, 1, 5, 4]
since those two contain the exact same numbers.
Also this is not allowed: [1, 1, 2, 3, 4]
since this combination contains duplicate numbers itself.
So a naive approach would be to iterate via 5 for
loops over 20, save all the previous combinations and check for duplicates through the combinations and within the newest combination. But this would result in 3,200,000 loops, which is not very efficient, since the final number of combinations is 1,860,480 (if I didn't calculate wrong, but it should be n! / ((n -k)! * k!)
if I'm not totally wrong, where n = 20
and k = 5
).
So how can I eliminate the duplicate loops in the first place?
I need a algorithm that is fast and does not take unnecessary steps since I'll calculate stuff with this data that takes a while and so I want to reduce the amount of loops as far as possible.
Best Answer
Unless I'm mistaken:
20^5 = 3,200,000
ways to pick 5 numbers out of 20 (with repeating, with duplicates)n! / (n - k)!
, in this case20! / (20 - 5)! = 1,860,480
. These are k-permutations.n! / ((n - k)! * k!)
, in this case20! / ((20 - 5)! * 5!) = 15,504
. These are called k-combinations. These are what you seem to be looking for.If you are only looking for combinations of size 5 and want access to the entire set of combinations immediately, then I'd say then 5 nested loop approach is fine, and I don't think you'll be able to do anything faster. You just need to modify your idea a bit so you wont need to check for uniqueness or duplicate numbers and don't do unnecessary iterations. You can produce (only) ordered combinations by doing something like:
Since this gives you the combinations in a ordered way [1, 2, 3, 4, 5] is produced, but [3, 2, 1, 5, 4] and [1, 1, 2, 3, 4] are not.
However this approach is not extendable to combinations of arbitrary size, then you would need recursion. I once wrote this:
This is not necessarily fast since you are messing around the alphabet a lot, but the idea should be clear: to make a combination of size n over a certain alphabet (in your case 1..20), remove an element e from the alphabet, make a combination of size n-1 over the alphabet minus e and return the combination with e added. Reusing the same alphabet for smaller combinations an letting the HashSet get rid of duplicates would be faster.
Also these sets of combinations get large pretty fast, so you might want to be lazy and not hold them all in memory (depending one how you want to use them), if so, check https://stackoverflow.com/questions/127704/algorithm-to-return-all-combinations-of-k-elements-from-n