Big O Complexity – Problem on Calculating Big O Complexity

algorithmsbig ojava

i have this function on which i have to calculate the time complexity with the Big O notation:

public void print(ArrayList<String> operations, ArrayList<LinkedHashSet<String>> setOfStrings) {
    int numberOfStrings = 0;
    int numberOfLetters = 0;
    String toPrint = operations.get(1);
    for (Iterator<LinkedHashSet<String>> iteratorSets = setOfStrings.iterator(); iteratorSets.hasNext();) {
        LinkedHashSet<String> subSet = iteratorSets.next();
        if (subSet.contains(toPrint)) {
        for (Iterator<String> iterator = subSet.iterator(); iterator.hasNext();) {
            numberOfLetters = numberOfLetters + iterator.next().length();
        }
        numberOfStrings = subSet.size();
        break;
        }
    }
}

the method does this operation:

For example, if have as operation print foo, I have to do these steps,first of all, I have to find where foo is:

  • Inside setOfStrings, I can have this situation:

            position 1 : [car, tree, hotel]
            ...
            position n : [lemon, coffee, tea, potato, foo]
    
  • When I find the string foo, I have to save the number of strings inside that position and the number of letters of each string, so in this case, I will save:

       5(number of strings) 23(sum of number of letters)
    

some considerations:

  1. For the arrayList of operations, I get always a specific position, so I don't iterate. It is always O(1).

  2. For the ArrayList<LinkedHashSet<String>> , I have to iterate, so the complexity in the worst case is O(n)

  3. the operation if (subSet.contains(toPrint)), it will be O(1),because hashSet has mapped all objects inside it.

  4. the iteration inside the hashset made with for (Iterator<LinkedHashSet<String>> iteratorSets = setOfStrings.iterator(); iteratorSets.hasNext();) , it will be O(m),because i have to iterate inside the entire hashset to sum the letters of each words

so in conclusion i think the time complexity of this algorithm is(O(n)*O(m))

are these considerations all corrects?
thanks.

Best Answer

It is a bit more complicated than that. The worst-case complexity is O(M * N), and the best-case complexity is O(N).

There are two worst-case scenarios:

  • when every subset contains the toPrint String, or
  • when the subsets contain strings that all has to the same value.

(The second one is extremely unlikely, unless someone deliberately populates the data structure with data with that property. But it is still a case that needs to be considered in a thorough complexity analysis.)

The best-case scenario is when strings in the subsets hash "nicely" AND the probability of the contains test returning true tends to zero.

Finally, N is the size of setOfStrings, and M is the average size of a subset.

Related Topic