Algorithmic randomness
Rodney G. Downey and Jan Reimann (2007), Scholarpedia, 2(10):2574. doi:10.4249/scholarpedia.2574
Algorithmic randomness is the study of random individual elements in sample spaces, mostly the set of all infinite binary sequences. An algorithmically random element passes all effectively devised tests for randomness.
Overview
The theory of algorithmic randomness tries to clarify what it means for an individual element of a sample space, e.g. a sequence of coin tosses, represented as a binary string, to be random. While Kolmogorov's formalization of classical probability theory assigns probabilities to sets of outcomes and determines how to calculate with such probabilities, it does not distinguish between individual random and non-random elements. For example, under a uniform distribution, the outcome "000...0" (n zeros) has the same probability as any other outcome of n coin tosses, namely \(2^{-n}\). However, there is an intuitive feeling that a sequence of all zeros is not very random. This is even more so when looking at infinite sequences. It seems desirable to clarify what we mean when we speak of a random object. The modern view of algorithmic randomness proposes three paradigms to distinguish random from non-random elements.
- Unpredictability: It should be impossible to win against a random sequence in a fair betting game when using a feasible betting strategy.
- Incompressibility: It should be impossible to feasibly compress a random sequence.
- Measure theoretical typicalness: Random sequences pass every feasible statistical test.
It is the characteristic feature of algorithmic randomness that it interprets "feasible" as "algorithmically feasible".
Computability theoretic background
The theory of algorithmic randomness rests on the understanding of effectiveness as given by computability theory. The basic notion is that of a (partial) computable function \(f: \mathbb{N} \to \mathbb{N}\). \(f\) is defined to be computable (or recursive) if it can be computed by a Turing machine. The widely accepted Church-Turing Thesis asserts that "every function which would naturally be regarded as computable can be computed by a Turing machine."
Computability can be defined on domains other than the natural numbers, too, via effective codings. For instance, we can identify natural numbers with their binary expansions and define computable functions on the set \(2^{<\omega} = \{0,1\}^*\) of all finite binary sequences. Using a pairing function (an effective injection of \(\mathbb{N}\times\mathbb{N}\) into \(\mathbb{N}\)), one can define computability on the rational numbers \(\mathbb{Q}\). The following are the most common notions of computability used in algorithmic randomness.
- A subset \(X \subseteq \mathbb{N}\) of the natural numbers is computable if its characteristic function \(\chi_X: \mathbb{N} \to \{0,1\}\) is.
- A computable real number is a real number with a computable binary expansion.
- A function \(g: \mathbb{N} \to \mathbb{R}\) is computable if there exists a computable function \(\gamma: \mathbb{N}\times \mathbb{N} \to \mathbb{Q}\) such that for all n and k,
\[ |\gamma(k,n) - g(k)| \leq 2^{-n}. \] Hence, for a real function, being computable means being effectively approximable to an arbitrary degree of precision (see the sketch after this list).
- A set \(X \subseteq \mathbb{N}\) is computably enumerable (c.e.) if it is the domain of a partial computable function.
- A real number α is left-c.e. if there is a uniformly computable, increasing sequence \((\gamma_n)_{n\in\mathbb{N}}\) of rational numbers such that \(\lim_n \gamma_n = \alpha\). This is equivalent to its left-cut \(\{\gamma\in\mathbb{Q}: \gamma<\alpha\}\) being a c.e. set of rationals.
- A function \(g: \mathbb{N} \to \mathbb{R}\) is left-c.e. or enumerable from below if
\[ \{(\gamma,k) \in \mathbb{Q}\times \mathbb{N}: \gamma < g(k) \} \] is computably enumerable.
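To make the approximation scheme of the third item concrete, here is a minimal Python sketch, assuming the illustrative example \(g(k) = \sqrt{k}\); the binary-search strategy is one possible choice of \(\gamma\), not part of the definition:

```python
# A minimal sketch of a computable real function, assuming g(k) = sqrt(k):
# gamma(k, n) returns a rational within 2^{-n} of g(k), via exact binary search.
from fractions import Fraction

def gamma(k, n):
    """Rational approximation of sqrt(k) with error at most 2^{-n}."""
    lo, hi = Fraction(0), Fraction(max(k, 1))  # sqrt(k) lies in [0, max(k, 1)]
    while hi - lo > Fraction(1, 2 ** n):
        mid = (lo + hi) / 2
        if mid * mid <= k:
            lo = mid
        else:
            hi = mid
    return lo

print(gamma(2, 10), float(gamma(2, 10)))  # a rational within 2^{-10} of sqrt(2)
```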
The development of algorithmic randomness in the 20th century
The systematic study of algorithmic randomness started in the early 20th century with the work of von Mises (1919). He proposed an approach that later became known as stochasticity. Aiming for a frequency interpretation of probability, he defined random sequences as members of so-called Kollektivs. It was von Mises' idea that members of Kollektivs should pass all relevant statistical tests. A binary sequence is an element of a Kollektiv if
- it has a limit frequency, and
- the limit frequency is preserved for any subsequence that is selected via an admissible selection rule.
Von Mises did not clarify what should be understood as an admissible selection rule. Only years later, with the development of computability theory, Church (1940) and others argued that "admissible" should be interpreted as "computationally feasible". This led to the notion of von Mises-Wald-Church stochasticity. A fundamental objection to von Mises' concept was raised by Ville (1939). He showed that for any countable collection of selection rules there exist elements of the resulting Kollektiv that do not satisfy the law of the iterated logarithm.
Eventually, Martin-Löf (1966) overcame these difficulties by noting that every effective statistical test could be coded as an effective nullset.
Martin-Löf randomness
The set of all infinite binary sequences is called the Cantor space, denoted by \(2^{\omega}\). \(2^{\omega}\) can be interpreted as the complete infinite binary tree. It bears a canonical topology generated by the cylinder sets \([w]\), which consist of all infinite sequences extending the finite string \(w\). By identifying real numbers with their infinite binary expansions, we can interpret elements of Cantor space as reals, too.
Lebesgue measure on Cantor space is induced by the function \[ \lambda[w] = 2^{-|w|}, \] where \(|w|\) denotes the length of \(w\).
The basic paradigm of probability theory is that the typical outcome of a random experiment with distribution λ (e.g. infinite sequences of fair coin tosses) forms a set of Lebesgue measure one. However, in the classical framework it is impossible to define an individual random sequence, since every single sequence α, seen as a singleton subset of Cantor space, is a set of measure zero. It was Martin-Löf's idea to pick a countable family of nullsets, the effective tests for randomness, and define a sequence to be random if it is not contained in any of these nullsets.
Measure theoretical typicalness - Martin-Löf tests
A Martin-Löf test is a computably enumerable set \(W \subseteq \mathbb{N}\times 2^{<\omega}\) such that, with \(W_n := \{w \in 2^{<\omega}: (n,w) \in W \}\), it holds that for all n, \[ \lambda(W_n) = \sum_{w\in W_n} 2^{-|w|} \leq 2^{-n}. \] The sets \(W_n\) generate a shrinking sequence of open sets whose intersection is a set of measure zero. \(W\) can hence be seen as an effective nullset.
A sequence \(\alpha \in 2^{\omega}\) passes the test \(W\) if \[ \alpha \not\in \bigcap_n \bigcup_{w\in W_n} [w]. \] In other words, α is not contained in the nullset given by \(W\). α is Martin-Löf random if it passes all Martin-Löf tests.
There are only countably many Martin-Löf tests, hence the union of all nullsets given by tests is a set of measure zero. Therefore, the set of all Martin-Löf random sequences has Lebesgue measure one. In particular, Martin-Löf random sequences exist.
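As an illustration, here is a minimal Python sketch of a (deliberately simple) Martin-Löf test, assuming the example \(W_n = \{0^n\}\); the function names are illustrative, not from the article:

```python
# A toy Martin-Löf test with W_n = { 0^n }: the n-th component consists of
# the single string of n zeros, so lambda(W_n) = 2^{-n}.
def W(n):
    """n-th component of the test: a (here finite) set of strings."""
    return {"0" * n}

def measure(strings):
    """Lebesgue measure of the open set generated by a set of strings."""
    return sum(2.0 ** -len(w) for w in strings)

def caught_by_level(alpha_prefix, n):
    """alpha falls into level n if some w in W_n is a prefix of alpha."""
    return any(alpha_prefix.startswith(w) for w in W(n))

# The measure bound of a Martin-Löf test holds (here with equality), and
# the all-zero sequence is caught by every level, hence it is not random:
assert all(measure(W(n)) <= 2 ** -n for n in range(1, 20))
assert all(caught_by_level("0" * 100, n) for n in range(1, 20))
```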
Incompressibility - Kolmogorov complexity
An alternative approach to randomness is via incompressibility. A random sequence should have algorithmically incompressible initial segments. Unfortunately, with respect to plain Kolmogorov complexity C, such sequences do not exist (Martin-Löf, 1966). Informally speaking, the reason for this is that a string \(w\) carries the information of both its content and its length \(|w|\); exploiting this, one can show that for every sequence the plain complexity of its initial segments infinitely often drops logarithmically far below their length. Independently, Levin and Chaitin proposed a variant of plain complexity, prefix-free Kolmogorov complexity K, which circumvents this problem.
A sequence α is algorithmically random if there exists a constant c such that for all n, \[ K(\alpha\upharpoonright n) \geq n - c, \] where \(\alpha\upharpoonright n\) denotes the first n bits of α.
Schnorr showed that a sequence is Martin-Löf random if and only if it is algorithmically random. Schnorr and Levin also characterized Martin-Löf randomness in terms of a related complexity notion called monotone complexity. Characterizations in terms of plain complexity were given by Gács (1980) and Miller and Yu (2006).
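Kolmogorov complexity itself is uncomputable, but the incompressibility intuition can be illustrated heuristically; the following Python sketch uses zlib's compressed length as a crude, computable stand-in for K (an assumption for illustration only, not a statement about K):

```python
# A heuristic illustration only: Kolmogorov complexity is uncomputable, so
# we use zlib's compressed length as a rough proxy for descriptive complexity.
import os
import zlib

n = 4096
zeros = b"\x00" * n
noise = os.urandom(n)  # "random" in the practical, not algorithmic, sense

print(len(zlib.compress(zeros)))  # tiny: the all-zeros prefix is highly compressible
print(len(zlib.compress(noise)))  # close to n: essentially incompressible
```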
Unpredictability - effective martingales
The third approach to algorithmic randomness is based on the paradigm of an excluded feasible betting system.
A martingale in Cantor space is a function \(M: 2^{<\omega} \to \mathbb{R}^{\geq 0}\) such that for all strings \(w\), \[ 2 M(w) = M(w0) + M(w1). \] M can be thought of as a fair betting game, where M models the capital function. \(M(w)\) represents the capital after the bits \(w(0)\dots w(|w|-1)\) have been revealed. The fairness of the game is reflected by the fact that the expected value of the gambler's capital after the next round, \([M(w0) + M(w1)]/2\), equals the current capital. A martingale is computably enumerable (c.e.) if it is left-c.e. as a function.
A martingale M succeeds on a sequence α, if \[ \limsup_n M(\alpha\upharpoonright n) = \infty. \]
Schnorr (1971) showed that a sequence α is Martin-Löf random if and only if there does not exist a c.e. martingale M that succeeds on α.
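For concreteness, here is a minimal Python sketch of a computable martingale, assuming the illustrative strategy "always wager half the current capital on the next bit being 0":

```python
# A minimal computable martingale: always bet half the capital on '0'.
def M(w, initial=1.0):
    """Capital after the bits of the string w have been revealed."""
    capital = initial
    for bit in w:
        stake = capital / 2  # half the capital rides on '0'
        capital = capital + stake if bit == "0" else capital - stake
    return capital

# Fairness 2*M(w) = M(w0) + M(w1) holds by construction:
for w in ["", "0", "01", "110"]:
    assert abs(2 * M(w) - (M(w + "0") + M(w + "1"))) < 1e-9

# M succeeds on the all-zero sequence: the capital grows like (3/2)^n.
print([M("0" * n) for n in range(6)])
```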
Chaitin's Ω and c.e. random reals
It is an intrinsic feature of algorithmic randomness that it is difficult to exhibit an individual random sequence. For instance, it easily follows from the definition of randomness that no computable sequence can be Martin-Löf random.
Chaitin (1976) defined the real number \(\Omega\), the halting probability of a universal prefix-free Turing machine. Let \(U\) be such a machine, and let \(\mathrm{dom}(U) = \{w : U(w)\downarrow \}\) be the set of all inputs on which \(U\) halts. Define \[ \Omega = \sum_{w \in \mathrm{dom}(U)} 2^{-|w|}. \] Chaitin was able to show that \(\Omega\) is Martin-Löf random. (Recall that we identify real numbers with their infinite binary expansions.)
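The following Python sketch shows how such a halting probability is approximated from below, using a hypothetical toy prefix-free machine in place of a genuine universal one (whose halting probability would be an actual, uncomputable Ω):

```python
# Approximating a halting probability from below, for a toy prefix-free
# machine whose domain is { 0^n 1 : n even }; its halting probability is 2/3.
from fractions import Fraction

def halts(w):
    """Toy prefix-free domain: exactly the strings 0^n 1 with n even."""
    return w.endswith("1") and w.count("1") == 1 and (len(w) - 1) % 2 == 0

def omega_stage(max_len):
    """Lower bound: total mass 2^{-|w|} of halting programs of length <= max_len."""
    total = Fraction(0)
    for n in range(1, max_len + 1):
        for i in range(2 ** n):
            w = format(i, "0{}b".format(n))
            if halts(w):
                total += Fraction(1, 2 ** n)
    return total

print([omega_stage(m) for m in (1, 3, 5, 7)])  # increases towards 2/3
```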
\(\Omega\) is a computably enumerable real. It is Turing complete in the sense that it is Turing equivalent to the halting problem \(\emptyset'\). Hence, from a computational point of view, the bits of Ω are as difficult to decide as the question whether a universal Turing machine stops on a given input.
\(\Omega\) is not the only random real with this property; the same holds for every c.e. random real. In fact, there is a very strong homogeneity between such reals: for any c.e. real α, the following are equivalent (Kucera and Slaman, 2001, and others).
- α is Martin-Löf random.
- For all c.e. reals β there exists a constant c such that for all n, \(K(\beta\upharpoonright n) \leq K(\alpha\upharpoonright n) + c\).
- α is the halting probability of some universal prefix-free machine.
Relativization, lowness, and the computational power of randomness
Relative randomness
Relativization is a central notion of computability theory. If one endows a Turing machine with an oracle tape that holds an infinite binary sequence whose bits can be read during a computation, one can define the notion of being computable relative to a sequence α. Accordingly, one can relativize other concepts, like "computably enumerable".
As regards the theory of algorithmic randomness, this allows for a relativization of Martin-Löf randomness, by relativizing the effectiveness conditions on a test. Hence one can speak of Martin-Löf randomness relative to a sequence β.
Given two sequences α, β, their join α ⊕ β denotes the sequence obtained by putting α on the even positions and β on the odd positions, i.e. (α ⊕ β)(2n) = α(n) and (α ⊕ β)(2n+1) = β(n).
- Van Lambalgen's Theorem: α ⊕ β is Martin-Löf random if and only if α is Martin-Löf random and β is Martin-Löf random relative to α.
Van Lambalgen's Theorem implies that if one effectively splits a random sequence into two halves, the two halves will be algorithmically independent. In particular, neither can compute the other.
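For concreteness, a minimal Python sketch of the join, written for finite prefixes of equal length (the function name is illustrative):

```python
# The join of two sequences, on finite prefixes: alpha occupies the even
# positions, beta the odd positions.
def join(alpha, beta):
    """Interleave two strings of equal length bit by bit."""
    return "".join(a + b for a, b in zip(alpha, beta))

print(join("0000", "1111"))  # 01010101
```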
Lowness
While for many oracles (such as the halting problem) the set of relatively random sequences is smaller than the set of (unrelativized) Martin-Löf random sequences, there are oracles for which this is not true. Such oracles can be seen as reals of low information content: they do not help in testing for randomness. Consequently, such sequences are called low for (Martin-Löf) random. Obviously, every computable sequence is low for random. Kucera and Terwijn (1999) showed that there exist non-computable c.e. sequences that are low for random.
Like the notion of a test, the notion of Kolmogorov complexity can be relativized, too. An alternative approach to defining low information content, due to Muchnik, is to say that such reals do not help in compressing. A sequence α is low for K if there exists a constant c such that for all strings \(w\), \( K^{\alpha}(w) \geq K(w) - c. \)
A third possibility to define low information content is to say such sequences have initial segment complexity as low as possible. A sequence α is K-trivial if there exists a constant c such that for all n, \( K(\alpha\upharpoonright n) \leq K(0^n) + c \) (Solovay, 1975, Zambella, 1990, and others).
Work by Nies (2005) and others showed that all three notions coincide.
- A sequence is low for random if and only if it is low for K if and only if it is K-trivial.
The computational power of randomness
The Kucera-Terwijn construction of a c.e. low for random (and hence K-trivial) sequence yields a solution to Post's problem by providing an example of a Turing incomplete c.e. set. Downey, Hirschfeldt, Nies, and Stephan (2002) gave an easy, injury-free construction of a (non-computable) K-trivial c.e. set.
Martin-Löf random sequences, on the other hand, can be computationally very powerful. Not only are left-c.e. random sequences Turing complete; Kucera (1985) and Gács (1986) independently showed that every sequence is Turing reducible to a Martin-Löf random sequence.
However, this situation is somewhat atypical. Given a left-c.e. random sequence, almost every other Martin-Löf random sequence is random relative to it. Furthermore, the Kucera-Gács Theorem no longer holds if one replaces Martin-Löf randomness by n-randomness (see definition below) for n ≥ 2. In fact, work by Miller, Nies, and others has shown that n-randomness for n ≥ 2 comes with weak computational power.
Other randomness concepts
A number of alternatives to Martin-Löf randomness have been studied.
Arithmetical randomness
One can make Martin-Löf tests more powerful by requiring only that the test set W is c.e. relative to some oracle. A sequence is n-random, \(n \geq 1\), if it passes every test that is c.e. in \(\emptyset^{(n-1)}\), the (n-1)st Turing jump of the empty set. Hence, Martin-Löf randomness coincides with 1-randomness. There is a strict hierarchy among these randomness notions, in the sense that the set of (n+1)-random sequences is strictly contained in the set of n-random sequences.
Computable and Schnorr randomness
Schnorr (1971) criticized Martin-Löf randomness as not being sufficiently algorithmic, since Martin-Löf tests are only computably enumerable. He proposed two alternatives, both based on computable rather than computably enumerable test notions.
- A sequence is computably random if no computable martingale succeeds on it.
- A Schnorr test is a Martin-Löf test in which the measure of every set \(W_n\) in the test is a uniformly computable real number. A sequence is Schnorr random if it passes every Schnorr test.
Kurtz randomness
Kurtz (1981) proposed to define randomness by avoiding all effectively closed sets of measure 0. A Kurtz test is an effectively closed set F (represented by the infinite paths through a computable tree) such that \(\lambda(F)=0\). A sequence is Kurtz random if it is not contained in any Kurtz test.
The following strict implications hold between the different randomness concepts:
- n-randomness (n ≥ 2) \(\Rightarrow\) Martin-Löf randomness \(\Rightarrow\) computable randomness \(\Rightarrow\) Schnorr randomness \(\Rightarrow\) Kurtz randomness.
Resource-bounded randomness
Besides computability, numerous other efficiency bounds can be applied to the principle of an excluded betting game. This way one can obtain notions like \(\mathrm{DTIME}(n^c)\)-randomness or randomness for exponential time. This approach was introduced by Lutz and has applications in complexity theory.
Kolmogorov-Loveland randomness
As an attempt to overcome Ville's criticism of stochasticity, Kolmogorov and Loveland independently proposed to admit selection rules that select positions along a sequence non-monotonically. The same concept can be applied to martingales, if one interprets them as betting along an infinite sequence. In a non-monotonic betting game the player can choose the next position to bet on in an arbitrary order.
- A sequence is Kolmogorov-Loveland random if no computable non-monotonic betting game succeeds on it.
Kolmogorov-Loveland randomness is stronger than computable randomness and at most as strong as Martin-Löf randomness.
Effective fractal dimensions
It is straightforward to define Martin-Löf tests, and subsequently randomness, for computable measures other than Lebesgue measure. An important family of measures are the Hausdorff measures \(H^s\). Roughly speaking, in determining the (outer) \(H^s\)-measure of a set, the diameter of each set used in a covering is weighted by the exponent s. (Hence the measure \(H^1\) corresponds to Lebesgue measure.) The Hausdorff dimension of a set is defined as the infimum over all nonnegative s such that the set is \(H^s\)-null. By looking only at effective \(H^s\)-nullsets (for rational s), one can define an effective counterpart of Hausdorff dimension for individual sequences: \[ \dim_H^1 \alpha = \inf \{ s \in \mathbb{Q}^+: \alpha \text{ is not } H^s\text{-random} \}. \]
\(dim_H^1\) is also known as constructive Hausdorff dimension, cdim.
An analogous notion, effective or constructive packing dimension, \(\dim_P^1\) or cDim, respectively, can be defined via packing measures and packing dimension.
The effective dimension concepts can be characterized via Kolmogorov complexity: \[ dim_H^1 \alpha = \liminf_{n \to \infty} \frac{K(\alpha\upharpoonright n)}{n} \quad \mbox{ and } \quad dim_P^1 \alpha = \limsup_{n \to \infty} \frac{K(\alpha\upharpoonright n)}{n} \] The effective Hausdorff and packing dimension of a sequence hence correspond to its lower and upper algorithmic density, respectively.
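As a worked example (a standard construction in the literature, not from the original article): if β is Martin-Löf random and α is obtained from β by inserting a 0 after every bit, i.e. \(\alpha = \beta(0)\,0\,\beta(1)\,0\,\beta(2)\,0\dots\), then, since \(\alpha\upharpoonright 2n\) and \(\beta\upharpoonright n\) are computable from one another, \[ K(\alpha\upharpoonright 2n) = K(\beta\upharpoonright n) + O(1) \geq n - O(1), \] while clearly \(K(\alpha\upharpoonright 2n) \leq n + O(\log n)\), and hence \[ \dim_H^1 \alpha = \liminf_{n \to \infty} \frac{K(\alpha\upharpoonright n)}{n} = \frac{1}{2}. \]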
Using modified martingales called termgales, Lutz (2003) has defined the notion of dimension for individual finite strings.
References
R. Downey and D. R. Hirschfeldt. Algorithmic Randomness and Complexity. Springer, Berlin, 2010.
M. Li and P. M. B. Vitanyi. An Introduction to Kolmogorov Complexity and its Applications. Springer, New York, 2nd edition, 1997, and 3rd edition, 2008.
P. Martin-Löf. The definition of random sequences. Information and Control, 9:602–619, 1966.
A. Nies. Computability and randomness. Oxford University Press, 2009.
C.-P. Schnorr. Zufälligkeit und Wahrscheinlichkeit. Eine algorithmische Begründung der Wahrscheinlichkeitstheorie. Springer, Berlin, 1971.