Kolmogorov Complexity - A Primer (jeremykun.wordpress.com)
101 points by vgnet on April 22, 2012 | 23 comments



> So in a sense, the huge body of mathematics underlying probability has already failed us at this basic juncture because we cannot speak of how random one particular outcome of an experiment is.

This reminds me of the 'Is 14 a random number?' debate.

Part of the problem, though, is that we have terminology that is horribly misleading.

First, a random variable is neither random nor a variable. They're not variables, because they're functions - functions from the outcomes of an experiment to values - a huge distinction. You can't have a random number - the idea itself doesn't make any sense.

Second, the definition of 'random' is itself problematic - or at least contested. Randomness implies probability, but probability can be defined in two different (and incompatible) ways, one of which is essentially the inverse of the other. Ironically, the one that Keynes proved back in 1921 to be logically inconsistent is the one that is more commonly used today (probably because it is more intuitive and mathematically convenient... even if it is almost always incorrect!).

The question 'Is 14 a random number' doesn't really make any sense. 14 cannot be a random number; it can only be a number drawn from a random distribution. That may seem like it simply begs the question, but in fact, this subtle shift is incredibly important - determining whether a number was drawn from a random distribution is much easier to frame in terms of probability, and probability, not randomness, is the language of statistics.

Unfortunately, this is one of those cases where the two definitions of probability yield widely different answers. You could tell me that the answer is undefined, in almost exactly the same way that division-by-zero is undefined in mathematics. Or you could construct a model over all possible distributions of numbers, the probabilities associated with each of those distributions, and integrate accordingly to yield some (probably computationally infeasible) functional answer.

In this school of thought, we can speak of how 'random' a particular outcome is - we're essentially partitioning the (potentially infinite) universe of functions f such that our value v is in the range of f into two categories: one designated as 'random' and the other designated as 'not random'. Then, we are determining the probability that our value was generated from one of the former, as opposed to the latter.

Either one would be a correct way of answering this second question, but neither one addresses the first question, which is essentially nonsensical. (Well, I guess the answer is 'no, 14 is not a random number, because a number cannot be random', but that's a bit of a cop-out!)
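To make that second approach concrete, here is a minimal sketch (the two candidate generators and their prior weights are made up purely for illustration): weigh a couple of candidate distributions that could have produced 14 and compute how much posterior weight each one gets.

  # toy 'model over distributions': which generator produced the observed value 14?
  priors = {'uniform_1_to_100': 0.5, 'always_14': 0.5}             # assumed prior weights over generators
  likelihoods = {'uniform_1_to_100': 1.0 / 100, 'always_14': 1.0}  # P(observing 14 | generator)
  evidence = sum(priors[m] * likelihoods[m] for m in priors)
  for m in priors:
      print m, priors[m] * likelihoods[m] / evidence               # posterior weight of each generator

With these made-up numbers, 'always_14' ends up with about 99% of the posterior weight - the point being that the question is only answerable relative to a chosen set of distributions and weights.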


One of the goals of mathematical rigor is to create formal structures that are somewhat consistent with our intuitions. The usual notions of randomness do not capture the common intuition that 1000000000000 is less random than 101010100101. Kolmogorov complexity is a formalization of this intuition, and it can be shown to have many relationships to regular randomness (for example: a string drawn from the uniform distribution over strings is very likely to have a Kolmogorov complexity close to the string's length).
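Kolmogorov complexity itself is uncomputable, but compressed length gives a crude upper-bound stand-in for it, and even that is enough to separate a very regular string from a typical one (my own quick illustration, in Python 2 to match the snippets further down):

  import zlib, random
  patterned = '0' * 1000                                              # a highly regular string
  typical = ''.join(random.choice('01') for _ in range(1000))         # a typical string of coin flips
  print len(zlib.compress(patterned)), len(zlib.compress(typical))    # roughly 15 bytes vs 150+ bytes on a typical run

The patterned string compresses to a handful of bytes, while the typical one stays close to its information content of 1000 bits - the same intuition Kolmogorov complexity makes precise.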


I think I made the point that 'random' only applies to things being drawn from a distribution, but that's really the whole point of the theory: we _want_ to be able to talk about how random a number is! We just call it 'Kolmogorov random' now, and by luck or insight, numbers chosen uniformly at random are almost always Kolmogorov random.


I'm assuming you're the author of the original post. My comment wasn't meant to invalidate your main argument, since it runs mostly orthogonal to the bulk of what you're saying. If anything, it actually supports it slightly (see below).

I understand what you meant by that example; I was just intrigued by seeing it in this context, and it reminded me of the related example that I brought up. And not to belabor the point with pedantry, but a tiny bit of rephrasing illustrates that your statement can be viewed as completely compatible with my latter definition:

> numbers chosen from a random uniform distribution function appear to be indistinguishable from numbers chosen from a Kolmogorov random function

It appears that the Kolmogorov example is really just a special case of the distributional viewpoint, in which case your system of partitioning the universe of possible distributions revolves around the Kolmogorov criterion. (And while I made it seem like the partitioning is binary in my previous post, the principle can easily be generalized, so that's not a problem.) And we may even be able to equate this statement with an alternate form based on the distributional difference between uniform and Kolmogorov.

I'm familiar with Kolmogorov, though not enough to be confident about this last hypothesis - I'll have to think about it some more.


I'm not super familiar with the various theories of probability, but if you're interested in thinking about these kinds of things, you might enjoy reading about the Universal Distribution, which is, as far as I know, defined in terms of Kolmogorov complexity. http://www.scholarpedia.org/article/Algorithmic_probability#...

I know the book Gems of Theoretical Computer Science has a good introduction to this topic as well.


"Random number" is just a name with a definition, it isn't any more absurd, than asking for square root of -1. In one system of definitions it doesn't make sense (real numbers), in other it does (complex numbers).

You can dismiss the question as nonsensical (no, you cannot calculate the square root of a negative number, there is none), or you can accept a new definition that allows you to say something more about the problem, and use its results where they are useful. And Kolmogorov complexity is a useful definition, maybe not as useful as complex numbers, but still.


Actually, read OP's response to my comment (and the following discussion) - my comment doesn't actually invalidate Kolmogorov; it really just frames where Kolmogorov fits within the context of the question. That is, the original question is in fact nonsensical, but Kolmogorov is a useful way of answering the separate-but-very-similar question using the second school of thought that I outlined.

Any definition of 'random' that I am familiar with is only precisely defined when applied to functions, not numbers. Oddly enough, this distinction is not always made clear when outlining the definition, but if you look carefully, you'll see that this is the way that the term is applied. Statisticians are notoriously sloppy when talking about terms, in the same way that computer scientists are comfortable saying that 5x + 2 = O(x)... which is nonsense, because you just said that a linear function is equal to a set of functions (TypeError!). 99% of the time, this sloppiness results in no error. That said, you have to remember to be precise with the remaining 1%, because sometimes the loss in precision will lead you to a completely incorrect conclusion.

And you can invent your own definition of random, yes, the same way that you can invent your own number system for numbers like 5/0. But then you have to rebuild the fundamental relationships from scratch (you need to prove that addition works in this new system the way it does for real numbers: 5/0 + 6/0 may not equal 11/0 in this new system, for example).


I'm not that into statistics, but I remember the problems with conflicting views on what "random" means from when I was writing my thesis about PRNGs.

On one hand, probability theory says a number cannot be random; on the other hand, we want to be able to compare the randomness of strings from PRNGs to say which is better. Probability theory says PRNGs are not random, 00000000 is no more or less random than 10011010, and that's the end of the discussion. But Kolmogorov complexity allows us to at least define what it means for a string to be random. Probability theory only allows us to compute the probability that a given string was drawn from a random distribution, but we already know that PRNGs are not random, so it feels a little artificial to use probability theory there.

That's why Kolmogorov complexity is useful, more so than 5/0 numbers :) Randomness for a string/number wasn't defined before, so there is no conflict, so I don't see why you're insisting that it's nonsensical to speak about random and not-random numbers. 2 definitions for 2 different mathematical objects. Polymorphism :)

But I'm not a mathematician, and I've probably forgotten many of the things I should know to discuss this with you, so I'm open to arguments.


I don't understand why you said that random numbers are nonsense - what about Chaitin's Omega number? The approach here is incompressibility. You don't even need probability to define randomness, just Turing Machines. What exactly does it mean, in mathematical terms, that randomness implies probability? Have you read Kolmogorov's axioms of probability? One of their successes - not failures - is that they don't need a definition of randomness to build the theory.


> You don't even need probability to define randomness, just Turing Machines.

You don't need the integral of 1/x with respect to x to define e, but you can do it that way. Mathematical definitions are bidirectional in many cases, in that we can use A to define B or B to define A without any loss in power for defining C in terms of B and/or A. But that doesn't mean that this works for any inversion of the definitions, as in the case I outlined above.

> I don't understand why you said that random numbers are nonsense - what about Chaitin's Omega number?

As you can see, the definitions we use to construct these make all the difference! The definition of the Chaitin constant that I'm familiar with is a probability, and the probability is not random; rather, we assign a probability to a random event (or, more precisely, the outcome of a random function). If *probabilities* were themselves random, they wouldn't be very useful, would they!

> Have you read Kolmogorov's axioms of probability? One of their successes - not failures - is that they don't need a definition of randomness to build the theory

I think you misunderstood my point, which is pretty much orthogonal to Kolmogorov. I didn't say that probability requires an assumption of randomness; I said that randomness (as used by the author in this post) implicitly invokes a notion of probability ('likelihood', in the casual use of the word). And certainly as used in the 'Is 14 a random number' example.


You can also think of randomness as a measure of predictability: given our knowledge prior to being presented with this number, how well could we predict this number?


You're thinking like a Bayesian - I avoided using those terms, but that's essentially the latter approach. The prior knowledge is built into the model with our choice of the functions and the weights that we assign those functions.

(It should be noted that the Bayesian approach doesn't guarantee a defined answer - we can easily have a divide-by-zero/undefined value in that school of thought as well. However, the frequentist approach always leads to an undefined outcome in this question, because it follows from the definition of probability itself, whereas in the Bayesian approach, it follows from insufficient information in the model specification.)



Mentioning the pigeonhole principle by name might have been useful. It almost got defined here...


Well, I guess it basically goes to show how hard it is to determine the actual Kolmogorov complexity of a string (even after specifying a particular encoding of a Turing Machine), but the following is a new upper bound on the complexity (in Python) of the second string.

print '00011101001000101101001000101111010100000100111101'

print bin(128141569638717)[2:]

In general, any string of 1s and 0s will be compressible using this method in Python once the binary number is greater than 2^12 (you break even around 2^11). However, again, I only claim this to be an upper bound ;). (Also assuming that invoking built-ins isn't cheating.)
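For what it's worth, a quick character count of the two programs above (just my own sanity check of the break-even arithmetic, counting source length in characters):

  literal = "print '00011101001000101101001000101111010100000100111101'"
  numeric = "print bin(128141569638717)[2:]"
  print len(literal), len(numeric)   # 58 vs 30 characters for this 50-bit string

The decimal form needs roughly 0.3 characters per bit plus a fixed overhead for the bin(...)[2:] wrapper, so for short strings the overhead dominates and the savings only kick in around the scale the parent mentions.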


In the context of Turing machines, the output is really a _binary_ string, not a string of arbitrary characters. So in a sense, those two programs are the same thing. But of course, the Kolmogorov complexity of any number n is bounded by log(n) + c.


print "000"+bin(0x748b48bd413d)[2:]


ah... good catch :)


With the given definition I assume that it would be impossible to have a Kolmogorov complexity higher than the size of the smallest PRNG program (+seed)? If so it might be of limited usefulness for larger strings.


PRNGs are not interchangeable; they produce different subsets of all possible strings (let's define the result of a PRNG+seed as all the bits of the generated numbers concatenated, until the PRNG starts to repeat).

And any program returning something can be seen as a PRNG (at worst it won't accept a seed, so it will only produce one string).


True. What you say must imply that no PRNG-generated sequence has a particularly high K complexity (> len(PRNG)). What still interests me in this scenario is that a true random source can still generate the same sequence as a PRNG. In that case the complexity of the string becomes small. So for a given true random source you can still get strings with low K complexity. Does this affect cryptography in any way?


No, because cryptography just wants to ensure that, given all past bits of the generator's output, you can't predict the next bit with probability different from 0.5.

And besides, the overwhelming majority of strings are random once you get to strings that are long enough. The probability that a true random generator will produce a string that is not Kolmogorov random is essentially 0.
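The counting argument behind that claim, sketched (my own back-of-the-envelope, not from the article): there are fewer than 2^(n-k) programs shorter than n-k bits, so at most a 2^(-k) fraction of the 2^n strings of length n can be compressed by k or more bits.

  # fraction of n-bit strings compressible by at least k bits is below 2**-k, regardless of n
  for k in (1, 8, 20, 64):
      print k, 2.0 ** -k

So demanding even 20 bits of compression already rules out all but roughly one string in a million.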


I'm no expert, but information theory answers the question of "randomness" more simply, intuitively and rigorously.

This notion of Kolmogorov Complexity doesn't seem to add any additional value.



