code
vanced
Advanced solutions in .NET
Search
Include comments in search
Talks
Archive
About
Say Of The Day
If you get an API right, code will read like prose.
-
Joshua Bloch
Tags
agile
algorithms
api design
aspnet mvc
bla-bla-bla
c#
knetlik
metrics
mocking
moq
ndepend
nmock2
performance
pex
productivity
rants
resharper
rhino mocks
typemock
unit testing
visual studio
whysharper
Medals
Blogroll
Mads Kristensen
Eric Lippert
Joel Spolsky
Dimitri Glazkov
Lean Thinking
Eldar Musaev (russian)
Steve McConnell
<< Introduction to Pex Stubs
|
Resharper presentation >>
Estimation, The second kind of
July 19, 2009 19:46
We developers are often asked how long it would take to
finish
a piece of code. We've become quite skilled at that. We look at the scope,
get a rough estimation, adjust it to have time on any
issues we might face, add up time for comunication, add coffee breaks, multiply the result by 3 and let business to decide whether they still want the feature.
What we are asked much less often is
how long it would take to
run
a piece of code.
Why is that important?
Your code might take a second to process 100 items but will run for hours if you feed it with 500. It's a lot easier to address the issue while
implementing
the algorithm, rather than fixing a critical performance bug for the already deployed solution.
It's difficult to predict how big the input can be. But quite often all you need is a quick confirmation that the your solution is
fast enough
, without plodding away at writing performance tests. It's not that performance tests are bad - indeed you must have them. I'm just saying there are performance problems that can be easily noticed without any tests.
The Big O notation
[1]
is your friend here.
The O() notation.
We think about it as an "order of". Imagine your method processes 1 item in 1 second. How fast will it process 50 items?
If you're lucky to have a
O(1)
algorithm, it will still take 1 second. If the code is
O(n)
then have a quick chat with a friend but get back in 50 seconds. If your method is
O(n
2
)
you might want to go to lunch - the results will be available in some 40 minutes. But if it's
O(2
n
)
make sure you have something to do for the next 35702051 years.
Below is a picture with several most common complexities.
Common Sense Estimation.
A nice part of all this is that you don't need to use formal mathematic methods to find out the complexity of your algorithm
. Rely on common sense
[2]
.
Simple access: O(1).
If
you access an element in an array or hashtable, your algorithm is
O(1)
. Why? Arrays are stored contiguously in memory, so accessing an element by index is a matter of putting together index number and base address of the array. Roughly the same is for hash tables - the hash you specify identifies the busket where you'll find the value.
Simple loops: O(n).
If you simply iterate through a list
, your algorithm is
O(n)
.
Why? Because the more elements you add to the list, the more iterations you have to do.
Nested loops: O(n
2
).
If you iterate through
a list
and within each iteration go through another list, you get a
O(n
2
)
algorithm. Why? Each nested loop adds up one
O(n)
per each iteration of the outer loop. Thus, you have
O(n
1
)*O(n
2
)
where
n
1
and
n
2
are amounts of elements in the lists but as we're talking about very big values of
n
, we can say
n
1
==n
2
.
Binary chop: O(lg(n)).
If you chop the list in two halves and work with one half only - which you will chop again and so on - you'd get a
O(lg(n))
algorithm. Why? Each iteration halves the size of the array. Thus, the formula for array size is
n = 2
iterationCount
and if you swap the equation and take logs of each side of this equation you'll get
lg(2
iterationCount
)
= lg(n)
. By the definition of logs it's
iterationCount
= lg(n)
.
Divide and conquer: O(n*lg(n)).
If you use a multi branched recursion (divide the list in two or more parts, work on each independently and then combine the results), you'll probably get
O(n*lg(n))
algorithm. Why? Each time you divide the list into two pieces. Thus, you can make
O(lg(n))
nested calls on each piece before reaching a sublist with only one element. On each step, you need
O(n)
calls to merge the results. To get the overall complexity, multiply both.
[Update: better explanation
here
]
Footnotes.
Here go wikipedia articles -
The Big O
and
computational complexity
.
Just found that
Algorithm Speed
chapter from
The Pragmatic Programmer
is available online
.
Comments are closed