I'm reading about algorithms and I understand most of it, but one thing I still struggle with is something as simple as the running times of different for-loops. Everyone else seems to find this easy, so I'm looking for help here.
I'm currently doing some exercises from my book and I need help completing them in order to figure out the different running times.
The title of the exercise is: "Give the order of growth (as a function of N) of the running times of each of the following code fragments"
a:
int sum = 0;
for (int n = N; n > 0; n /= 2)
    for (int i = 0; i < n; i++)
        sum++;
b:
int sum = 0;
for (int i = 1; i < N; i *= 2)
    for (int j = 0; j < i; j++)
        sum++;
c:
int sum = 0;
for (int i = 1; i < N; i *= 2)
    for (int j = 0; j < N; j++)
        sum++;
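One way to build intuition is to run each fragment for doubling values of N and watch how the final value of sum grows. A minimal sketch (the class and method names are mine, not part of the exercise):

```java
// Count the work done by each fragment for growing N.
public class LoopGrowth {
    static long a(int N) {
        long sum = 0;
        for (int n = N; n > 0; n /= 2)
            for (int i = 0; i < n; i++)
                sum++;
        return sum;
    }

    static long b(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < i; j++)
                sum++;
        return sum;
    }

    static long c(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < N; j++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 16; N <= 1024; N *= 2)
            System.out.println(N + ": a=" + a(N) + " b=" + b(N) + " c=" + c(N));
    }
}
```

If a fragment is linear, doubling N should roughly double the printed count; for an N log N fragment, the count grows by slightly more than a factor of two each time.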
We have learned different orders of growth such as N, N^2, N^3, log N, N log N, etc. The plain N, N^2, N^3 cases are not a problem, but I have a hard time telling which one applies when the for-loops vary like they do here; I can't work out what the running time of these loops is.
Here's an attempt of mine. The y-axis represents the value of N and the x-axis represents the number of times the outer loop has run. In the drawings on the left, an arrow to the right marks the outer loop, and a circle marks the inner loop together with the current value of n. I then drew some graphs just to look at them, but I'm not sure they're right, especially the last one, where N stays 16 the whole time. Thanks.
Best Answer
Background
Consider an arbitrary for loop that runs an inner computation, say do_something(p), once per iteration, where p stands for the loop variable(s) running from some p_start to some p_end. It should be clear that the run time of such a loop is simply the sum of the run times of the inner computations. Note that the run time of the inner computation may depend on the loop variable(s).
In the special case where the run time of do_something(p) is independent of the loop variable(s), the run time is proportional to the number of times the inner computation is executed.
Typically, simple arithmetic operations such as incrementing (sum++) are constant-time.
For brevity I will use log to denote the base-2 logarithm. Additionally, I will use pow(x, y) to denote x raised to the power y, because ^ is often used for something else in the C-family of languages.
The problems
The interesting coincidence in this set of problems is that the run time of each fragment is actually proportional to the final value of sum (can you see why?), so the question can be simplified to: how does the final value of sum vary with the parameter N?
The sum can be calculated algebraically, but we don't need to know the exact result. We just need to make some good estimates.
Problem A
How many times does the outer loop run? It starts at N and goes down by half each time until it hits zero. This is just (a discrete version of) exponential decay with a base of 2. Therefore, we can make an educated guess that

n ∝ 1 / pow(2, p)

Here, we define p to be a counter that increases by 1 each time the outer loop is repeated. In fact, this is exactly what your graph plots.
Where does p start? We can just pick p_start = 0 for convenience, and this allows us to determine the coefficient of proportionality:

n ≈ N / pow(2, p)

Where does p end? Whenever n reaches zero! Since this is integer division, n can only be zero if N < pow(2, p). From this we can deduce that p_end ≈ log(N). With experience, you can skip this entire analysis and jump straight to this conclusion.
Now we can rewrite the loop using the variable p instead of n (again, approximately). Notice that every instance of n must be substituted:

int sum = 0;
for (int p = 0; p <= log(N); p++)
    for (int i = 0; i < N / pow(2, p); i++)
        sum++;

The advantage of writing the loop like this is that it becomes obvious how many times the outer loop is repeated. (Keen readers may notice that this is really just a change of variables.)
The inner loop consists only of increments to sum and is repeated N / pow(2, p) times, so we can rewrite the above as:

int sum = 0;
for (int p = 0; p <= log(N); p++)
    sum += N / pow(2, p);

(Note that the run time of this loop may no longer be the same, but the value of sum still reflects the run time of the original fragment.)
From this code we can write the value of sum as:

sum ≈ N / pow(2, 0) + N / pow(2, 1) + ... + N / pow(2, log(N))

(As randomA noted, this is just a geometric series, so there is a well-known closed-form expression for the sum. Here I use a more general technique based on calculus, however.)
This can be simplified further by approximating the summation as an integral:

sum ≈ ∫ from 0 to log(N) of N / pow(2, p) dp = N (1 − 1/N) / ln(2) ≈ 1.44 N
And there you go: the run time of Problem A is linear with respect to N.
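A quick numerical check of the linearity (a sketch; the class name is mine): the count divided by N should stay below 2, matching the geometric series N + N/2 + N/4 + ... < 2N.

```java
// Fragment (a): the total count is bounded by the geometric series
// N + N/2 + N/4 + ... < 2N, so count(N)/N never reaches 2.
public class LinearCheck {
    static long count(int N) {
        long sum = 0;
        for (int n = N; n > 0; n /= 2)
            for (int i = 0; i < n; i++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 1000; N <= 8000; N *= 2)
            System.out.println(N + " -> " + count(N)
                    + " (ratio " + (double) count(N) / N + ")");
    }
}
```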
Problem B
The problem here is similar, but I'll omit some of the steps. The variable i grows exponentially, so we can do the transformation:

i ≈ pow(2, p)

with p increasing by one at each iteration, starting from 0 and ending at log(N). After the variable substitution, the loop becomes:

int sum = 0;
for (int p = 0; p < log(N); p++)
    for (int j = 0; j < pow(2, p); j++)
        sum++;

which reduces to:

int sum = 0;
for (int p = 0; p < log(N); p++)
    sum += pow(2, p);
You can apply the same tricks again to find a closed-form expression for the sum: it is also O(N).
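Numerically (a sketch; the class name is mine), fragment b sums the powers of two below N, and that sum is always less than 2N:

```java
// Fragment (b): the doubling loop does 1 + 2 + 4 + ... increments,
// summing the powers of two below N, which is less than 2N.
public class GeometricCheck {
    static long count(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < i; j++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 1000; N <= 8000; N *= 2)
            System.out.println(N + " -> " + count(N));
    }
}
```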
Problem C
This one is actually quite a bit easier, because the number of repetitions of the inner loop doesn't depend on the outer loop variable, so we can simplify it right away to:

int sum = 0;
for (int i = 1; i < N; i *= 2)
    sum += N;

The outer loop has the same exponential behavior as in Problem B, so it runs only log(N) times, and the code simplifies to:

sum ≈ N log(N)

Hence, the run time is O(N log(N)).
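And a matching numerical sketch for fragment c (the class name is mine): the count is exactly N times the number of outer iterations, i.e. N times the number of doublings needed to reach N.

```java
// Fragment (c): each of the ~log(N) outer iterations does exactly N
// increments, so the total is N times the number of doublings below N.
public class LinearithmicCheck {
    static long count(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < N; j++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 1000; N <= 8000; N *= 2)
            System.out.println(N + " -> " + count(N));
    }
}
```

For N = 1000 there are 10 doublings (1, 2, 4, ..., 512), so the count is 10 · 1000.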