Recursive trees

September 13, 2009 19:05
In my trudge towards what I think might mean being a good developer, I have finally started studying algorithms. Thanks to MIT and its OpenCourseWare program.

Here they go - Introduction to Algorithms video lectures with transcripts, lecture materials and assignments. Pure, unadulterated knowledge, available
for anyone, for free.

Currently the plan is to watch one lecture per week,
implement algorithms in C# and do assignments. (Don't fret, I'm not going to blog about it too much here - unless you'd ask to.) Have already done a few lectures and the whole thing looks challenging but doable.

At least now I know enough to update my fluffy explanation on why "divide and conquer" algorithms have
O(n*lg(n)) complexity[1].

Divide and conquer, revisited.

As I said, if your recursive algorithm divides the input in two parts, works on each independently and then combines the results, you'll probably get that O(n*lg(n)).So there's a neat and simple way to explain, why.

Suppose that the overall running time of processing n items is T(n). That time is made up from
  • processing the first half
  • processing the second half 
  • and merging the result
In mathematical form, we can put it as T(n) = T(n/2) + T(n/2) + c*n where c is some arbitrary huge constant.

Let's write that 
down as a tree with c*n at the root and recursive parts as leaves.

                  cn
 
T(n) =        /    \
            T(n/2)  T(n/2)


And now we can expand each leaf node - because T(n/2) = T(n/4) + T(n/4) + c*n/2 and so on. So, we move on expanding our T(n) and get something that's called recursive tree:

                   cn
              /         \
            cn/2        cn/2
           /   \                  /   \
         cn/4  cn/4      cn/4  cn/4
         / \        / \    / \       / \
        ...  ...  ...   ...  ...
       /                       \
      O(1)  O(1)   ...    ...  O(1)


It's a normal binary tree, so it has
lg(n) levels.

The amount of leaves is a bit more tricky. Nodes of each level always add up to cn -
first time it's just cn, then it's cn/2 + cn/2 and so on. Thus, the last level will be O(cn) that is in fact just O(n).

On the other hand, the amount of items in the tree is our T(n) - that is, performance of the recursive algorithm in question. So
T(n) = O(n*lg(n)).



Footnotes.
  1. Read about Big O here, and my humble post is here. Merge sort is a good example of "divide and conquer".

Comments are closed