I'm reading about algorithms and I understand most of it, but one thing I still struggle with is something as simple as the running times of different for-loops. Everyone else seems to find this easy, so I'm looking for help here.
I'm currently doing some exercises from my book and I need help completing them in order to figure out the different running times.
The title of the exercise is: "Give the order of growth (as a function of N) of the running times of each of the following code fragments"
a:
int sum = 0;
for (int n = N; n > 0; n /= 2)
    for (int i = 0; i < n; i++)
        sum++;
b:
int sum = 0;
for (int i = 1; i < N; i *= 2)
    for (int j = 0; j < i; j++)
        sum++;
c:
int sum = 0;
for (int i = 1; i < N; i *= 2)
    for (int j = 0; j < N; j++)
        sum++;
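One way to build intuition is to run each fragment for doubling values of N and watch how the final value of sum grows. A minimal sketch (the class and method names are mine, not part of the exercise):

```java
// Count the work done by each fragment for growing N.
public class LoopGrowth {
    static long a(int N) {
        long sum = 0;
        for (int n = N; n > 0; n /= 2)
            for (int i = 0; i < n; i++)
                sum++;
        return sum;
    }

    static long b(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < i; j++)
                sum++;
        return sum;
    }

    static long c(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < N; j++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 16; N <= 1024; N *= 2)
            System.out.println(N + ": a=" + a(N) + " b=" + b(N) + " c=" + c(N));
    }
}
```

If a fragment is linear, doubling N should roughly double the printed count; for an N log N fragment, the count grows by slightly more than a factor of two each time.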
We have learned different orders of growth such as N, N^2, N^3, log N, N log N, etc. The plain N, N^2, N^3 cases are not a problem, but I have a hard time telling which one applies when the for-loops vary like they do here; I can't work out what the running time of these loops is.
Here's an attempt of mine. The y-axis represents the value of N and the x-axis represents the number of times the outer loop has run. In the drawings on the left, an arrow to the right marks the outer loop, and a circle marks the inner loop together with the current value of n. I then drew some graphs just to look at them, but I'm not sure they're right, especially the last one, where N stays 16 the whole time. Thanks.
Best Answer
Background
Consider an arbitrary for loop that runs an inner computation, say do_something(p), once per iteration, where p stands for the loop variable(s) running from some p_start to some p_end. It should be clear that the run time of such a loop is simply the sum of the run times of the inner computations. Note that the run time of the inner computation may depend on the loop variable(s).
In the special case where the run time of do_something(p) is independent of the loop variable(s), the run time is proportional to the number of times the inner computation is executed.
Typically, simple arithmetic operations such as incrementing (sum++) are constant-time.
For brevity I will use log to denote the base-2 logarithm. Additionally, I will use pow(x, y) to denote x raised to the power y, because ^ is often used for something else in the C-family of languages.
The problems
The interesting coincidence in this set of problems is that the run time of each fragment is actually proportional to the final value of sum (can you see why?), so the question can be simplified to: how does the final value of sum vary with the parameter N?
The sum can be calculated algebraically, but we don't need to know the exact result. We just need to make some good estimates.
Problem A
How many times does the outer loop run? It starts at N and goes down by half each time until it hits zero. This is just (a discrete version of) exponential decay with a base of 2. Therefore, we can make an educated guess that

n ∝ 1 / pow(2, p)

Here, we define p to be a counter that increases by 1 each time the outer loop is repeated. In fact, this is exactly what your graph plots.
Where does p start? We can just pick p_start = 0 for convenience, and this allows us to determine the coefficient of proportionality:

n ≈ N / pow(2, p)

Where does p end? Whenever n reaches zero! Since this is integer division, n can only be zero if N < pow(2, p). From this we can deduce that p_end ≈ log(N). With experience, you can skip this entire analysis and jump straight to this conclusion.
Now we can rewrite the loop using the variable p instead of n (again, approximately). Notice that every instance of n must be substituted:

int sum = 0;
for (int p = 0; p <= log(N); p++)
    for (int i = 0; i < N / pow(2, p); i++)
        sum++;

The advantage of writing the loop like this is that it becomes obvious how many times the outer loop is repeated. (Keen readers may notice that this is really just a change of variables.)
The inner loop consists only of increments to sum and is repeated N / pow(2, p) times, so we can rewrite the above as:

int sum = 0;
for (int p = 0; p <= log(N); p++)
    sum += N / pow(2, p);

(Note that the run time of this loop may no longer be the same, but the value of sum still reflects the run time of the original fragment.)
From this code we can write the value of sum as:

sum ≈ N / pow(2, 0) + N / pow(2, 1) + ... + N / pow(2, log(N))

(As randomA noted, this is just a geometric series, so there is a well-known closed-form expression for the sum. Here I use a more general technique based on calculus, however.)
This can be simplified further by approximating the summation as an integral:

sum ≈ ∫ from 0 to log(N) of N / pow(2, p) dp = N (1 − 1/N) / ln(2) ≈ 1.44 N
And there you go: the run time of Problem A is linear with respect to N.
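A quick numerical check of the linearity (a sketch; the class name is mine): the count divided by N should stay below 2, matching the geometric series N + N/2 + N/4 + ... < 2N.

```java
// Fragment (a): the total count is bounded by the geometric series
// N + N/2 + N/4 + ... < 2N, so count(N)/N never reaches 2.
public class LinearCheck {
    static long count(int N) {
        long sum = 0;
        for (int n = N; n > 0; n /= 2)
            for (int i = 0; i < n; i++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 1000; N <= 8000; N *= 2)
            System.out.println(N + " -> " + count(N)
                    + " (ratio " + (double) count(N) / N + ")");
    }
}
```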
Problem B
The problem here is similar, but I'll omit some of the steps. The variable i grows exponentially, so we can do the transformation:

i ≈ pow(2, p)

with p increasing by one at each iteration, starting from 0 and ending at log(N). After the variable substitution, the loop becomes:

int sum = 0;
for (int p = 0; p < log(N); p++)
    for (int j = 0; j < pow(2, p); j++)
        sum++;

which reduces to:

int sum = 0;
for (int p = 0; p < log(N); p++)
    sum += pow(2, p);
You can apply the same tricks again to find a closed-form expression for the sum: it is also O(N).
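Numerically (a sketch; the class name is mine), fragment b sums the powers of two below N, and that sum is always less than 2N:

```java
// Fragment (b): the doubling loop does 1 + 2 + 4 + ... increments,
// summing the powers of two below N, which is less than 2N.
public class GeometricCheck {
    static long count(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < i; j++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 1000; N <= 8000; N *= 2)
            System.out.println(N + " -> " + count(N));
    }
}
```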
Problem C
This one is actually quite a bit easier, because the number of repetitions of the inner loop doesn't depend on the outer loop variable, so we can simplify it right away to:

int sum = 0;
for (int i = 1; i < N; i *= 2)
    sum += N;

The outer loop has the same exponential behavior as in Problem B, so it runs only log(N) times, and the code simplifies to:

sum ≈ N log(N)

Hence, the run time is O(N log(N)).
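And a matching numerical sketch for fragment c (the class name is mine): the count is exactly N times the number of outer iterations, i.e. N times the number of doublings needed to reach N.

```java
// Fragment (c): each of the ~log(N) outer iterations does exactly N
// increments, so the total is N times the number of doublings below N.
public class LinearithmicCheck {
    static long count(int N) {
        long sum = 0;
        for (int i = 1; i < N; i *= 2)
            for (int j = 0; j < N; j++)
                sum++;
        return sum;
    }

    public static void main(String[] args) {
        for (int N = 1000; N <= 8000; N *= 2)
            System.out.println(N + " -> " + count(N));
    }
}
```

For N = 1000 there are 10 doublings (1, 2, 4, ..., 512), so the count is 10 · 1000.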