Algorithms – Compute k-Permutation Without Repeating and No Duplicates

algorithmscombinatorics

I have a set of integers, for example 1..20 and I want to get all possible combinations of that set drawing 5 numbers.

So each combination has 5 numbers from that set of integers. But I don't want the numbers in the combination to be duplicates and I want the unordered combination to be unique.

I guess it's not easy to understand, my english is not the best and I don't know all the mathematical terms. So here's an illustration.

My set is 1..20 and obviously has a length of 20.

I want each possible combination to be like [1, 2, 3, 4, 5], [1, 2, 3, 4, 6], … , [1, 4, 8, 14, 20], always a length of 5.

And this should be prohibited: [1, 2, 3, 4, 5], [3, 2, 1, 5, 4] since those two contain the exact same numbers.

Also this is not allowed: [1, 1, 2, 3, 4] since this combination contains duplicate numbers itself.

So a naive approach would be to iterate via 5 for loops over 20, save all the previous combinations and check for duplicates through the combinations and within the newest combination. But this would result in 3,200,000 loops, which is not very efficient, since the final number of combinations is 1,860,480 (if I didn't calculate wrong, but it should be n! / ((n -k)! * k!) if I'm not totally wrong, where n = 20 and k = 5).

So how can I eliminate the duplicate loops in the first place?
I need a algorithm that is fast and does not take unnecessary steps since I'll calculate stuff with this data that takes a while and so I want to reduce the amount of loops as far as possible.

Best Answer

Unless I'm mistaken:

  • There are 20^5 = 3,200,000 ways to pick 5 numbers out of 20 (with repeating, with duplicates)
  • If you don't allow for "duplicates" (picking the same number more than once), you're left with n! / (n - k)!, in this case 20! / (20 - 5)! = 1,860,480. These are k-permutations.
  • If, then, you don't allow for "repeating" (having collections with the same elements only in a different order), you're left with n! / ((n - k)! * k!), in this case 20! / ((20 - 5)! * 5!) = 15,504. These are called k-combinations. These are what you seem to be looking for.

If you are only looking for combinations of size 5 and want access to the entire set of combinations immediately, then I'd say then 5 nested loop approach is fine, and I don't think you'll be able to do anything faster. You just need to modify your idea a bit so you wont need to check for uniqueness or duplicate numbers and don't do unnecessary iterations. You can produce (only) ordered combinations by doing something like:

public Set<Set<Integer>> all5CombinationsOf20(){
    Set<Set<Integer>> result = new HashSet<Set<Integer>>();
    for(int i = 1; i <= 16; i++){
        for(int j = i+1; j <= 17; j++){
            for(int k = j+1; k <= 18; k++){
                for(int l = k+1; l <= 19; l++){
                    for(int m = l+1; m <= 20; m++){
                        result.add(new HashSet<Integer>(Arrays.asList(i,j,k,l,m)));
                    }
                }
            }
        }
    }
    return result;
}

Since this gives you the combinations in a ordered way [1, 2, 3, 4, 5] is produced, but [3, 2, 1, 5, 4] and [1, 1, 2, 3, 4] are not.

However this approach is not extendable to combinations of arbitrary size, then you would need recursion. I once wrote this:

public static <T> Set<Set<T>> combinations(Set<T> alphabet, int k) {
    if (alphabet.size() == 0) {
        throw new IllegalArgumentException("alphabet.size must be > 0");
    }
    if (k < 0) {
        throw new IllegalArgumentException("k must be >= 0");
    }
    if (k == 0) {
        Set<Set<T>> result = new HashSet<Set<T>>();
        result.add(new HashSet<T>());
        return result;
    } else {
        Set<Set<T>> result = new HashSet<Set<T>>();
        for (T e : alphabet) {
            Set<T> newAlphabet = new HashSet<T>(alphabet);
            newAlphabet.remove(e);
            Set<Set<T>> smallerCombinations = combinations(newAlphabet, k - 1);
            for (Set<T> smallerCombination : smallerCombinations) {
                Set<T> combination = new HashSet<T>(smallerCombination);
                combination.add(e);
                result.add(combination);
            }
        }

        return result;
    }
}

This is not necessarily fast since you are messing around the alphabet a lot, but the idea should be clear: to make a combination of size n over a certain alphabet (in your case 1..20), remove an element e from the alphabet, make a combination of size n-1 over the alphabet minus e and return the combination with e added. Reusing the same alphabet for smaller combinations an letting the HashSet get rid of duplicates would be faster.

Also these sets of combinations get large pretty fast, so you might want to be lazy and not hold them all in memory (depending one how you want to use them), if so, check https://stackoverflow.com/questions/127704/algorithm-to-return-all-combinations-of-k-elements-from-n

Related Topic