So let's talk about the big O notation of bogobogosort:
http://www.dangermouse.net/esoteric/bogobogosort.html
Actually, let's build up to this first.
O() notation is basically a way to say that one function grows like (or at least no faster than) another function as x gets large. The definition goes like this:
if
f(x) = O(g(x))
that means there exists a constant C such that
f(x) <= C*g(x)
for all sufficiently large x.
So for instance, all quadratic functions are O(x^2). Doesn't matter what any of the constants are. And in general O() notation works great for showing that polynomials of the same degree grow at a similar rate. It works poorly for faster-growing functions, though.
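Before moving on to the faster-growing stuff, here's what the definition looks like on a concrete quadratic (the function and the constant C = 15 are just my own picks for illustration):

```python
# f(x) = 3x^2 + 5x + 7 is O(x^2): for x >= 1, 3x^2 + 5x + 7 <= 15x^2,
# so C = 15 works.  Quick numeric spot-check of f(x) <= C*g(x):
def f(x):
    return 3 * x**2 + 5 * x + 7

def g(x):
    return x**2

C = 15
for x in [1, 10, 100, 1000]:
    print(x, f(x), C * g(x), f(x) <= C * g(x))   # always True for x >= 1
```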
Consider these two functions:
f(x) = 2^x
g(x) = 4^x
Obviously they're more or less the same function, or at least you can rephrase a problem so that one turns into the other. Like...let's say instead of x representing individual people, it represents married couples--bam, we just switched the algorithm from 2^x to 4^x, even though we're still talking about the same code solving the same problem. Despite that, they are not O() equivalent:
4^x != O(2^x)
This is kind-of a failing of the O system, and one that honestly wouldn't be too hard to solve. (Instead of using one constant in the definition, you'd use two, e.g.
f(x) <= C*g(D*x)
which would make 4^x count as O(2^x).) But the standard definition doesn't do that. And this is where it stops being a useful tool. BUT we're still going to do calculations with it because math is fun.
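To see both halves of that concretely, here's a quick Python sketch (C = 1 and D = 2 are just the obvious choices for this pair):

```python
# 4^x != O(2^x): the ratio 4^x / 2^x is itself 2^x, so no single constant C caps it.
for x in [1, 5, 10, 20]:
    print(x, 4**x // 2**x)        # 2, 32, 1024, 1048576, ...

# With the two-constant version f(x) <= C * g(D*x), the pair C = 1, D = 2 works,
# because 4^x == 2^(2x).
C, D = 1, 2
print(all(4**x <= C * 2**(D * x) for x in range(1, 60)))   # True
```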
It gets even worse where we're going.
(n+1)! != O(n!)
Yep, literally the same function shifted over by 1 is not even recognized as being in a similar category of functions by the function categorization tool O(). (The ratio (n+1)!/n! is n+1, which grows without bound, so no constant C works.) But like I said, we're going to use it anyway. I'm just pointing out the n+1 stuff because it's annoying and it's gonna come up.
So...let's start with a practice case. n! is kinda similar to n^n (in that it sits somewhere between n^(n/2) and (n/2)^n). Of course, they're not O() equivalent, because of all that stuff I just mentioned above. So...what IS n! O() equivalent to?
Well...I don't really have particularly good tools at my disposal for handling products, but I do have good tools for handling sums, so let's just take the log of everything.
ln(n!) = ln(1) + ln(2) + ln(3) + ... + ln(n)
Alright...well...I don't really have a great way to sum all that stuff, but I can integrate it.
integral( ln(x) ) = x ln(x) - x
Alright, so what is the evaluation range of this integral? Well, it's from 1 to n+1 instead of 0 to n: each term ln(x) gets a unit-wide tile from x to x+1 (and you really don't want to integrate the area between 0 and 1 anyway). So let's plug that in.
(n+1)ln(n+1) - (n+1) - (0 - 1)
= (n+1)(ln(n+1)-1) + 1
And then raising e to the power of all of this...
((n+1)/e)^(n+1) * e
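Quick numerical aside (a Python sketch comparing against math.factorial; spoiler for the next paragraph: it overshoots):

```python
import math

# Zeroth-order estimate ((n+1)/e)^(n+1) * e, side by side with n!.
for n in range(1, 11):
    approx = ((n + 1) / math.e) ** (n + 1) * math.e
    exact = math.factorial(n)
    print(n, round(approx, 1), exact, round(approx / exact, 2))
# The ratio keeps climbing (roughly like sqrt(n)), so this version grows
# strictly faster than n!.
```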
Alright cool, glad that worked out. Only one problem--this still isn't O() equivalent to n!. You see...when we approximated sum( ln(x) ) as integral( ln(x) ), that wasn't accurate. We're adding too much: ln(x) is increasing, so each unit-wide tile sits below the curve and the integral overshoots the sum. So...as a first-order correction, for each tile we want to subtract like...a triangle.
So...the area of the triangle for the tile between n and n+1 would be
1/2 * (ln(n+1) - ln(n))
And I could approximate this with the derivative of ln(x) and then integrate it, but this is one case where summing over n is quite easy, because the whole thing telescopes:
(ln(n+1) - ln(n)) + (ln(n) - ln(n-1)) + .....
= ln(n+1) - ln(1) = ln(n+1)
Bam, and now we can plug that into the previous equation
(n+1)(ln(n+1)-1) + 1 - 1/2 ln(n+1)
= n ln(n+1) + ln(n+1) - n - 1 + 1 - 1/2 ln(n+1)
= n ln(n+1) + 1/2 ln(n+1) - n
= n ( ln(n+1) - 1) + 1/2 ln(n+1)
So...translating this back to the real world, we have...
((n+1)/e)^n * sqrt(n+1)
Alright, this is more accurate...but is it accurate enough for big O notation? I mean, we just approximated the curve with straight sloped line segments, but of course it's actually curved. What if we account for that with a second-order correction (parabolas instead of straight lines)?
Well...take an upside-down parabola that hits zero at -1/2 and +1/2, i.e. (1/4 - x^2), and integrate it over that range. The antiderivative is x/4 - x^3/3.
= (1/8 - 1/24) - (-1/8 + 1/24) = 1/12 + 1/12 = 1/6
Alright, so all we need is the second derivative of the function and then we can sum over that. What is the second derivative of ln(x)? It's -1/x^2.
Sweet, let's integrate that. And we get....
1/6 * 1/x
And when we change that into exponent form we get...
e^(1/(6x))
Alright...so what does O() think of this value? For once, O() does its god damned job and weeds something out. e^(1/(6x)) trends towards a constant (it heads to 1 as x grows), and constant factors get weeded out by O().
So...we can say conclusively...
((n+1)/e)^n * sqrt(n+1) = O(n!)
AND
n! = O(((n+1)/e)^n * sqrt(n+1))
And...actually the two functions are very close to each other (approximation vs n!). Like...
n=1:  1.04 vs 1
n=2:  2.11 vs 2
n=3:  6.37 vs 6
n=4:  25.6 vs 24
n=5:  128 vs 120
n=6:  771 vs 720
It drifts a little bit off-course because I haven't accounted for any constants (empirical testing suggests multiplying by 0.92 makes it very accurate for high values of n).
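For anyone who wants to reproduce that table, here's a quick Python sketch (the sqrt(2*pi)/e observation is my own aside, not something derived above):

```python
import math

# ((n+1)/e)^n * sqrt(n+1) vs n!, plus the same thing scaled by ~0.92.
# That magic 0.92 is suspiciously close to sqrt(2*pi)/e ~= 0.9221 (the
# constant Stirling's formula would supply), but take that as an aside.
K = math.sqrt(2 * math.pi) / math.e
for n in range(1, 11):
    approx = ((n + 1) / math.e) ** n * math.sqrt(n + 1)
    exact = math.factorial(n)
    print(n, round(approx, 2), exact, round(K * approx, 2))
```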
Anyway, so we're about ready to tackle bogobogosort. It's 1am and I'm probably not going to finish writing this tonight. However let's get started.
http://www.dangermouse.net/esoteric/bogobogosort.html
So there are two functions here: S(n), the cost of sorting n elements, and C(n), the cost of checking whether n elements are sorted.
S(n) = n! * C(n)
C(n) = n * S(n-1)
S(n) = n * n! * S(n-1)
(with some smaller terms that O() will actually smash for us. O() is good at handling two functions being added together).
S(n) = n! * product(x!), with the product running from x = 1 to n
S(n) = n! * (1^(n) * 2^(n-1) * 3^(n-2) * ... * n^(1))
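Quick check that this closed form is really what the recurrence unrolls to (a Python sketch; I'm taking S(0) = 1 as the base case and ignoring the lower-order terms, same as above):

```python
import math

def S_recurrence(n):
    # S(n) = n * n! * S(n-1), with S(0) = 1 as the assumed base case.
    s = 1
    for k in range(1, n + 1):
        s *= k * math.factorial(k)
    return s

def S_closed_form(n):
    # S(n) = n! * (1! * 2! * 3! * ... * n!)
    prod = 1
    for x in range(1, n + 1):
        prod *= math.factorial(x)
    return math.factorial(n) * prod

for n in range(1, 8):
    assert S_recurrence(n) == S_closed_form(n)
print("closed form matches the recurrence")
```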
so taking the log turns the product into a sum, and the term for each x looks something like
ln(x) + (M-x)*ln(x)
(writing M for n+1).
We've already handled integrating ln(x), so let's just do the right-hand part.
(M-x)*ln(x)
M ln x - x ln x
M ( x ln x - x ) - ( 1/2 x^2 ln x - 1/4 x^2 )
M^2 ln M - M^2 - 1/2 M^2 ln M + 1/4 M^2 - ( - M + 1/4 )
1/2 M^2 ln M - 3/4 M^2 + M - 1/4
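Sanity check on that integral against the actual sum (a Python sketch, with M = n+1 as above):

```python
import math

# sum over x of (M - x) * ln(x)  vs  (1/2) M^2 ln M - (3/4) M^2 + M - 1/4
for n in [5, 10, 20, 50]:
    M = n + 1
    exact_sum = sum((M - x) * math.log(x) for x in range(1, n + 1))
    integral = 0.5 * M**2 * math.log(M) - 0.75 * M**2 + M - 0.25
    print(n, round(exact_sum, 2), round(integral, 2))
# The integral overshoots the sum a little, same as in the n! case.
```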
As a reminder, M in these equations is the upper limit of the integral, i.e. n+1. Now let's tack the n! factor back on (the preliminary version from earlier), since the integral above only covers the (M-x)*ln(x) term.
So translating this back to actual multiplicative stuff...
n! * sqrt(M)^(M^2) / e^((3/4)M^2) * e^M / e^(1/4)
Obviously this is just a zeroth-order approximation, and we'll probably need the two derivative corrections like before. Still, though...
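Here's how the zeroth-order estimate stacks up against the exact recurrence so far, in log scale (a Python sketch, again assuming S(0) = 1):

```python
import math

def S_exact(n):
    # S(n) = n * n! * S(n-1), S(0) = 1 (assumed base case), lower-order terms dropped.
    s = 1
    for k in range(1, n + 1):
        s *= k * math.factorial(k)
    return s

for n in range(2, 9):
    M = n + 1
    log_est = (math.lgamma(n + 1)              # ln(n!)
               + 0.5 * M**2 * math.log(M)      # ln( sqrt(M)^(M^2) )
               - 0.75 * M**2 + M - 0.25)
    print(n, round(log_est, 2), round(math.log(S_exact(n)), 2))
# The estimate overshoots and the gap slowly widens -- exactly what the
# derivative corrections from the n! derivation would be there to fix.
```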
(And that's where I'll stop for tonight).