Big O Notation
by Mark Storer · in Technical Issues · 05/05/2005 (9:45 am) · 8 replies
I mentioned "Big O" notation in the "Splitting an Int" thread, and feared many readers wouldn't know what I was talking about. A google search didn't turn up any descriptions I cared for, so here I am, writing my own.
PS: If anyone who actually graduated from a CS program (as opposed to making it half-way through like me) spots some flaws in my understanding, please chime in.
"Big O" notation is a formal way of describing how efficient an algorithm is in general terms... in Orders of magnitude (which is where the O comes from). An order of magnitude is basically another digit in the number used to describe a value. 1, 10, 100, 1000, etc. If you happen to be using binary instead of decimal, an order of magnitude isn't quite as earth-shaking, but still nothing to sneaze at.
Lets start with an example.
int foo( int i )
{
    return i;
}

Not much going on there. No matter what you pass to foo, you'll always get the same execution time. That's big O of 1, written "O(1)". That's "constant time", which is the best there is as far as Big O is concerned. Big O doesn't say anything about HOW LONG that constant time might be, only that it is constant. The constant could be 40 years or 2 nanoseconds, but as long as it doesn't change with the input, it's O(1).
Moving on:
void bar( char **strs, int numStrs )
{
    for( int i = 0; i < numStrs; ++i ) {
        printf( "%s\n", strs[i] );
    }
}

This function will take an amount of time that is directly proportional to the amount of stuff passed in (in this case an array of strings, but it could just as easily be anything: a linked list of game objects, an int, whatever).
As 'n' (numStrs in this example) grows, so grows our execution time. That's O(n). This is pretty decent, as far as Big O notation goes.
Up next we have O(n^2):
void baz( char ***strs, int numLists, int numStrs )
{
    for( int i = 0; i < numLists; ++i ) {
        for( int j = 0; j < numStrs; ++j ) {
            printf( "%d: %s\n", i, strs[i][j] );
        }
    }
}

The execution time of this function will be proportional to numLists * numStrs, but Big O doesn't care about the distinction... if both grow together, it abstracts them away to a single value: O(n^2). (Strictly speaking, with two independent sizes you'd write O(n*m), but the nested loops are what matter here.)
You can see how this is going. As you get deeper and deeper loops, you get bigger and bigger powers.
There are also Big O values that aren't straight powers of 'n'. They're based on log2(n). Searching a sorted array, or traversing a binary search tree, are both examples of O(log2(n)). (The base of the log is just a constant factor, so it's usually written simply O(log n).) A sorted list of 16 elements will take around log2(16) = 4 tests to find what you're after.
A sort that works by repeated halving, like merge sort, takes O(n*log2(n))... which is quite an improvement over bubble sort's O(n^2):
            binary sort    bubble sort
   10:              40             100    binary is about 2 times faster
  100:             700           10000    binary is now around 14 times faster
 1000:           10000         1000000    binary is now 100 times faster
10000:          140000       100000000    binary is now about 700 times faster
Big O doesn't care about constant multipliers. If there are 40 functions, and each one is O(n), that's still O(n). There's no such thing in Big-O-ville as O(40n). Or O(n/2).
That about covers it.
When I was taught Big O, it was in a "discrete math" class. I memorized all the rules, but they made absolutely no sense to me. The next semester, I walked into my data structures class, and in the first 10 minutes, it all fell into place. The instructor gave some search example and said "This is O(n)". BANG. The light went on. By giving you examples along the way, I'm hoping most of you can skip the semester of memorization, and skip to the "I Get It".
Any questions class? There will be a short quiz next period.
#2
05/05/2005 (10:09 am)
And here I thought this was going to be a way more interesting thread than it turned out to be :(

Now I have to figure out how to make a "Theta Face"
#3
05/05/2005 (10:31 am)
Huh. I thought Big O was worst-case. Live and learn.

And I've never even heard of Big Theta.
#4
05/05/2005 (10:39 am)
Big Theta's usually a few classes up the line from Big O - it takes a few more steps to prove and it's slightly more difficult to explain.

The idea is that you can use the same function as an upper and a lower bound, if you use different constant factors.
So a true N^2 algo could (for example) be proved to have 0.9n^2 as a lower bound and 1.1n^2 as an upper bound.
#5
05/05/2005 (4:04 pm)
Well, Big O *is* the worst case.

But what Big O fails to mention is the constant; all O(N^2) are not created equal.
Also, you tend to only put in the most significant guff; O(N^2+N) is O(N^2).
Take for example, O(N) where each operation takes sixteen cycles. And O(N^2), where each operation takes just one cycle.
It takes up until N=4 for the O(N) to actually be faster than the O(N^2) algorithm.
You may well find that, depending on the vagaries of your implementation, a shell sort or an insertion sort is *way* faster than a quick sort for anything up to thousands of items; if, in your code, you're only ever sorting eight items, you can probably just do an insertion sort and forget about it.
Gary (-;
#6
05/05/2005 (4:31 pm)
Don't forget Big omega. And the little ones ... (little oh, little theta, little omega). This is starting to sound like a nursery rhyme.
#7
05/06/2005 (9:26 am)
I'm probably going to regret this, but what the heck:

What are Big Omega, little oh, little theta, and little omega?
#8
05/06/2005 (12:52 pm)
I think I got carried away, there is no "little theta".

As big oh gives you an upper bound (as described above), big omega gives you a lower bound.
The intersection of those two (they are sets of functions), gives you big theta.
I guess picture this: (supposed to be ovals describing sets)
_____
| |
| | Big omega
| |
--------
( ) Big theta (Partially big omega/big oh)
--------
| |
| | Big oh
| |
---------
Big omega overlaps with big theta, and big oh overlaps with big theta.
The top box by itself (without including big theta), is little omega.
The bottom box by itself (without including big theta), is little oh.
Now where you'd use all this, that's a big i dun-oh.
Mz:
Big O is actually an upper bound on the time (or sometimes memory) usage of an algorithm. Something that's O(1) is also O(n), and also O(n^2). It's customary to use the lowest bound you can prove, though.
For a real efficiency measure you want to talk about an asymptotically tight bound, which is commonly called Big Theta notation.
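For completeness, here are the standard textbook definitions of the three bounds this thread keeps circling around (my addition, stated in LaTeX; not from the thread itself):

```latex
% Big O: f grows no faster than g, up to a constant factor.
f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 :
    0 \le f(n) \le c \cdot g(n) \quad \text{for all } n \ge n_0

% Big Omega: f grows at least as fast as g, up to a constant factor.
f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 :
    0 \le c \cdot g(n) \le f(n) \quad \text{for all } n \ge n_0

% Big Theta: both at once -- an asymptotically tight bound.
f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \ \text{and} \ f(n) = \Omega(g(n))
```

This is why "Big O is the worst case" is a slight simplification: O is just an upper bound, and it's the choice of a tight bound that makes it useful.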