# Turing machine

Paul M.B. Vitanyi (2009), Scholarpedia, 4(3):6240. doi:10.4249/scholarpedia.6240, revision #91897

A **Turing machine** is a hypothetical machine proposed by Alan M. Turing (1912--1954) in 1936 whose computations are intended to give an operational and formal definition of the intuitive notion of computability in the discrete domain. It is a digital device, sufficiently simple to be amenable to theoretical analysis and sufficiently powerful to embrace everything in the discrete domain that is intuitively computable. As if that were not enough, in the theory of computation many major complexity classes can be easily characterized by an appropriately restricted Turing machine; notably the important classes P and NP and consequently the major question whether P equals NP.

Turing gave a brilliant demonstration that everything
that can be reasonably said to be computed by a human computer using
a fixed procedure can be computed by such a machine.
As Turing claimed,
any process that can be naturally called an effective procedure
is realized by a Turing machine.
This is known as **Turing's thesis**. Enter Alonzo Church (1903--1995).
Over the years, all serious attempts to give precise yet intuitively satisfactory definitions of a notion of effective procedure (what Church called
effectively calculable function) in the widest possible sense have turned
out to be equivalent---to define essentially the same class of processes. In
his original paper, Turing established the equivalence of his notion of effective procedure with his automatic machine (a-machine) now called a Turing
machine. Turing then showed the formal equivalence of Turing machines
with λ-definable functions, the formalism in which Church and Kleene first
worked from 1931–1934, and the formalism in which Church first stated his
thesis in 1934 privately and informally to Gödel.
The **Church-Turing thesis**
states that a function on the positive
integers is effectively calculable if and only if it is computable. An informal accumulation of the tradition in S. C. Kleene (1952) has transformed it to
the **Computability thesis**: there is an objective notion
of effective computability
independent of a particular formalization.
The informal arguments Turing sets forth in his 1936 paper
are as lucid and convincing now as they were then.
To us it seems that it is the best introduction to the subject, and
we refer the reader to this superior piece of expository writing.


## Formal definition of Turing machine

We formalize Turing's description as follows: A *Turing machine* consists of a finite program, called the *finite control*, capable of manipulating a linear list of *cells*, called the *tape*, using one access pointer, called the *head*. We refer to the two directions on the tape as `right' and `left.' The finite control can be in any one of a finite set of states \(Q\ ,\) and each tape cell can contain a 0, a 1, or a `blank' \(B\ .\) Time is discrete and the time instants are ordered \(0,1,2, \ldots ,\) with 0 the time at which the machine starts its computation. At any time, the head is positioned over a particular cell, which it is said to `scan.' At time 0 the head is situated on a distinguished cell on the tape called the `start cell,' and the finite control is in a distinguished state \(q_0\ .\) At time 0 all cells contain \(B\)'s, except for a contiguous finite sequence of cells, extending from the start cell to the right, which contain 0's and 1's. This binary sequence is called the `input.' The device can perform the following basic operations:

- it can write an element from \(A= \{ 0,1,B \}\) in the cell it scans; and
- it can shift the head one cell left or right.

When the device is active it executes these operations at the rate of one operation per time unit (a `step'). At the conclusion of each step, the finite control takes on a state from \(Q\ .\) The device is constructed so that it behaves according to a finite list of rules. These rules determine, from the current state of the finite control and the symbol contained in the cell under scan, the operation to be performed next and the state to enter at the end of the next operation execution.

The rules have format \((p,s,a,q)\ :\) \(p\) is the
current state of the finite control; \(s\) is the symbol
under scan; \(a\) is the next operation to be executed
of type (1) or (2)
designated in the obvious sense by an element from
\(S = \{ 0,1,B,L,R \}\ ;\) and \(q\) is the state of the finite control
to be entered at the end of this step.
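The quadruple format is concrete enough to simulate directly. The following Python sketch (all names are ours, not from the article) represents the rule list as a mapping from (state, scanned symbol) to (operation, next state), with the tape stored sparsely so that absent cells hold the blank \(B\ .\)

```python
def run_tm(rules, tape_input, start_state="q0", max_steps=10_000):
    """Simulate a quadruple-format Turing machine.

    rules maps (state, scanned symbol) -> (operation, next state), where the
    operation is '0', '1' or 'B' (write that symbol) or 'L'/'R' (shift head).
    The tape is a dict from cell index to symbol; missing cells are blank.
    """
    tape = {i: c for i, c in enumerate(tape_input)}
    state, head = start_state, 0
    for _ in range(max_steps):
        scanned = tape.get(head, "B")
        if (state, scanned) not in rules:  # no matching rule: the machine halts
            return tape, state, head
        op, state = rules[(state, scanned)]
        if op == "L":
            head -= 1
        elif op == "R":
            head += 1
        else:  # write an element of A = {0, 1, B}
            tape[head] = op
    raise RuntimeError("no halt within step bound")

# Example machine: complement every input bit, halt on the first blank.
flip = {
    ("q0", "0"): ("1", "m"),  # write the complemented bit...
    ("q0", "1"): ("0", "m"),
    ("m", "0"): ("R", "q0"),  # ...then shift right and continue
    ("m", "1"): ("R", "q0"),
}
```

Running `run_tm(flip, "0110")` halts scanning the first blank with tape contents 1001; a missing (state, symbol) pair is exactly the `no operation' convention of the text.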

For now, we assume that there are no two distinct quadruples that have their first two elements identical; such a device is *deterministic*. Not every possible combination of the first two elements has to be in the set; in this way we permit the device to perform `no' operation. In this case we say that the device *halts*. Hence, we can define a Turing machine by a mapping from a finite subset of \(Q \times A\) into \(S \times Q\ .\) Given a Turing machine and an input, the Turing machine carries out a uniquely determined succession of operations, which may or may not terminate in a finite number of steps.

Strings and natural numbers are occasionally identified according to the pairing
\[
(\epsilon,0), (0,1), (1,2), (00,3), (01,4), (10,5), (11,6), \ldots ,
\]
where \(\epsilon\) denotes the empty string (with no bits).
In the following we need the notion of a 'self-delimiting' code of a binary string. If \(x=x_1 \ldots x_n\) is a string of
\(n\) bits, then its *self-delimiting code* is \(\bar{x}=1^n0x\ .\) Clearly, the length \(|\bar{x}| = 2|x|+1\ .\) Encoding a binary string self-delimitingly enables a machine to determine where the string ends while reading it from left to right in a single pass, without reading past the last bit of the code.
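A sketch of this code in Python (the function names `bar` and `unbar` are ours): the decoder reads the leading 1s to learn the length, consumes the separating 0, then reads exactly that many bits, never looking past the end of the code.

```python
def bar(x: str) -> str:
    """Self-delimiting code of binary string x: 1^|x| 0 x."""
    return "1" * len(x) + "0" + x

def unbar(stream: str):
    """Read one self-delimiting code off the front of stream, left to right,
    in a single pass; return (decoded string, unread remainder)."""
    n = 0
    while stream[n] == "1":  # count the leading 1s: this is |x|
        n += 1
    x = stream[n + 1 : n + 1 + n]  # skip the 0, take the next n bits
    return x, stream[2 * n + 1 :]
```

For example, `bar("101")` is `"1110101"`, of length 2·3+1 = 7, and `unbar` recovers `"101"` while leaving any following bits untouched.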

### Computable functions

We can associate a partial function with each Turing
machine in the following way: The input to the Turing
machine is presented as an \(n\)-tuple \((x_1 , \ldots , x_n )\)
consisting of self-delimiting versions of the \(x_i\)'s.
The integer represented by the
maximal binary string (bordered by blanks) of which some bit is scanned,
or 0 if a blank is scanned,
by the time the machine halts,
is called the `output' of the computation.
Under this convention for inputs and outputs,
each Turing machine defines a partial function
from \(n\)-tuples of integers into the integers, \(n \geq 1\ .\)
We call such a function
**partial computable**.
If
the Turing machine halts for all inputs,
then the function computed is defined
for all arguments and
we call it
**total computable**. (Instead of `computable' the more ambiguous
`recursive' has also been used.)
We call a function
with range \( \{ 0,1 \} \) a
`predicate',
with the interpretation that the predicate of an
\(n\)-tuple of values is `true' if the corresponding
function assumes value 1 for that \(n\)-tuple of
values for its arguments and is `false' or `undefined' otherwise.
Hence, we can talk about `partial
(total) computable predicates'.

### Examples of computable functions

Consider \(x\) as a binary string. It is easy to see that the functions \(|x|\ ,\) \(f(x)= \bar x\ ,\) \(g( \bar x y)=x\ ,\) and \(h( \bar x y)=y\) are partial computable. Functions \(g\) and \(h\) are not total since the value for input \(1111\) is not defined. The function \(g'( \bar x y)\) defined as 1 if \(x = y\) and as 0 if \(x \neq y\) is a computable predicate. Consider \(x\) as an integer. The following functions are basic \(n\)-place total computable functions: the `successor' function \(\gamma^{(1)} (x) = x+1\ ,\) the `zero' function \(\zeta^{(n)} (x_1 , \ldots , x_n ) = 0\ ,\) and the `projection' function \(\pi_m^{(n)} (x_1 , \ldots , x_n ) = x_m\) (\(1 \leq m \leq n\)). The function \(\langle x, y\rangle = \bar x y\) is a total computable one-to-one mapping from pairs of natural numbers into the natural numbers. We can easily extend this scheme to obtain a total computable one-to-one mapping from \(k\)-tuples of integers into the integers, for each fixed \(k\ .\) Define \(\langle n_1 , n_2 , \ldots ,n_k \rangle \) \(= \langle n_1 , \langle n_2 , \ldots , n_k \rangle \rangle\ .\) Another total recursive one-to-one mapping from \(k\)-tuples of integers into the integers is \(\langle n_1 , n_2 , \ldots ,n_k \rangle = \bar n_1 \ldots \bar n_{k-1} \bar n_k\ .\)
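The identification of strings with numbers, and the pairing \(\langle x, y\rangle = \bar x y\ ,\) can be sketched as follows (Python; names are ours). The \(n\)-th binary string is the binary expansion of \(n+1\) with its leading 1 removed.

```python
def num_to_str(n: int) -> str:
    """The n-th binary string in the order eps, 0, 1, 00, 01, 10, 11, ..."""
    return bin(n + 1)[3:]  # bin(n+1) is '0b1...'; drop the '0b1' prefix

def str_to_num(x: str) -> int:
    """Inverse of num_to_str."""
    return int("1" + x, 2) - 1

def pair(x: str, y: str) -> str:
    """The one-to-one pairing <x, y> = bar(x) y = 1^|x| 0 x y."""
    return "1" * len(x) + "0" + x + y
```

The pairing is one-to-one because the self-delimiting prefix \(\bar x\) tells a decoder exactly where \(x\) stops and \(y\) begins.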

### Computability thesis

The class of algorithmically computable numerical functions (in the intuitive sense) coincides with the class of partial computable functions. Originally intended as a proposal to henceforth supply intuitive terms such as `computable' and `effective procedure' with a precise meaning, the Computability thesis has come into use as shorthand for a claim that from a given description of a procedure in terms of an informal set of instructions we can derive a formal one in terms of Turing machines.

## Universal Turing machine

It is possible to give an effective (computable) one-to-one pairing
between natural numbers and Turing machines. This is
called an
*effective enumeration*.
One way
to do this is to encode the table of rules of
each Turing machine in binary, in a canonical way.

The only thing we have to do for every Turing machine is to encode the defining mapping \(T\) from \(Q \times A\) into \(S \times Q\ .\) Giving each element of \(Q \cup S\) a unique binary code requires \(s\) bits for each such element, with \(s = \lceil \log (|Q|+5) \rceil\ .\) Denote the encoding function by \(e\ .\) Then the quadruple \((p,0,B,q)\) is encoded as \(e(p)e(0)e(B)e(q)\ .\) If the number of rules is \(r\ ,\) then \(r \leq 3|Q|\ .\) We agree to consider the state of the first rule as the start state. The entire list of quadruples, \[ T = ( p_1 ,t_1 ,s_1, q_1 ) , (p_2 , t_2 ,s_2 , q_2 ), \ldots , (p_r , t_r , s_r , q_r ), \] is encoded as \[ E(T) = \bar s \bar r e( p_1 ) e( t_1 ) e( s_1 ) e( q_1 ) \ldots e(p_r ) e( t_r ) e( s_r ) e ( q_r ) . \] Note that \(|E(T)| \leq 4rs + 2 \log rs + 4\ .\) (Moreover, \(E\) is self-delimiting, which is convenient in situations in which we want to recognize the substring \(E(T)\) as a prefix of a larger string.)

We order the resulting binary strings lexicographically (according to increasing length). We assign an index, or Gödel number, \(n(T)\) to each Turing machine \(T\) by defining \(n(T)=i\) if \(E(T)\) is the \(i\)th element in the lexicographic order of Turing machine codes. This yields a sequence of Turing machines \(T_1 ,T_2 , \ldots \) that constitutes the effective enumeration. One can construct a Turing machine to decide whether a given binary string \(x\) encodes a Turing machine, by checking whether it can be decoded according to the scheme above, that the tuple elements belong to \(Q \times A \times S \times Q\ ,\) followed by a check whether any two different rules start with the same two elements. This observation enables us to construct `universal' Turing machines.
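The encoding \(E(T)\) can be sketched concretely (Python; the function names and the exact ordering of the symbols within the scheme are our choices): each element of \(Q \cup S\) gets a fixed-width \(s\)-bit code, and \(s\) and \(r\) are prepended self-delimitingly.

```python
import math

def encode_table(rules, states):
    """Encode a quadruple list as E(T) = bar(s) bar(r) e(p1)e(t1)e(s1)e(q1)...

    rules: list of quadruples (p, t, a, q) with p, q in states,
    t in {'0','1','B'} and a in {'0','1','B','L','R'}.
    """
    sbits = math.ceil(math.log2(len(states) + 5))
    alphabet = ["0", "1", "B", "L", "R"] + list(states)
    e = {sym: format(i, f"0{sbits}b") for i, sym in enumerate(alphabet)}
    sd = lambda n: "1" * len(bin(n)[2:]) + "0" + bin(n)[2:]  # self-delimiting
    body = "".join(e[p] + e[t] + e[a] + e[q] for (p, t, a, q) in rules)
    return sd(sbits) + sd(len(rules)) + body
```

With 2 states, \(s = \lceil \log 7 \rceil = 3\ ,\) so a 4-rule machine encodes into \(4 \cdot 4 \cdot 3 = 48\) body bits plus the two self-delimiting length prefixes.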

A
*universal*
Turing machine \(U\) is a Turing machine that
can imitate the behavior of any other Turing machine
\(T\ .\) It is a fundamental result that such machines exist
and can be constructed effectively.
Only a suitable description of \(T\)'s finite
program and input
needs to be entered on \(U\)'s tape initially.
To execute the consecutive actions that \(T\) would
perform on its
own
tape,
\(U\) uses \(T\)'s description
to simulate \(T\)'s actions
on a representation of \(T\)'s tape contents.
Such a machine \(U\) is also called `computation universal.'
In fact, there are infinitely many such \(U\)'s.

We focus on a universal Turing machine \(U\) that uses the encoding above. It is not difficult, but tedious, to define a Turing machine in quadruple format that expects inputs of the format \(E(T)p\) and is undefined for inputs not of that form. The machine \(U\) starts to execute the successive operations of \(T\) using \(p\) as input and the description \(E(T)\) of \(T\) it has found so that \(U(E(T)p)=T(p)\) for every \(T\) and \(p\ .\) We omit the explicit construction of \(U\ .\)

For the contemporary reader there should be nothing mysterious in the concept of a general-purpose computer which can perform any computation when supplied with an appropriate program. The surprising thing is that a general-purpose computer can be very simple: Marvin Minsky has shown that four tape symbols and seven states suffice easily in the above scheme. This machine can be changed to, in the sense of being simulated by, our format using tape symbols \( \{ 0, 1, B \}\) at the cost of an increase in the number of states. The last reference contains an excellent discussion of Turing machines, their computations, and related machines. The effective enumeration of Turing machines \(T_1 ,T_2 , \ldots \) determines an effective enumeration of partial computable functions \(\phi_1 , \phi_2 , \ldots \) such that \(\phi_i\) is the function computed by \(T_i\ ,\) for all \(i\ .\) It is important to distinguish between a function \(\psi\) and a name for \(\psi\ .\) A name for \(\psi\) can be an algorithm that computes \(\psi\ ,\) in the form of a Turing machine \(T\ .\) It can also be a natural number \(i\) such that \(\psi\) equals \(\phi_i\) in the above list. We call \(i\) an index for \(\psi\ .\) Thus, each partial computable \(\psi\) occurs many times in the given effective enumeration, that is, it has many indices.

### Universal partial computable function

The partial computable function \(\nu^{(2)} (i, x)\) computed
by the universal Turing machine \(U\) is called the
*universal partial computable function*.
The generalization to \(n\)-place functions is straightforward.
A partial computable function \(\nu^{(n+1)} (i, x_1 , \ldots ,x_n )\) is *universal* for all \(n\)-place partial computable functions if for each partial computable function \(\phi^{(n)} (x_1 , \ldots ,x_n )\) there exists an \(i\) such that the mapping \(\nu^{(n+1)}\) with the first argument fixed to \(i\) is identical to the mapping \(\phi^{(n)}\ .\) Here \(i\) is an index of \(\phi^{(n)}\) with respect to \(\nu^{(n+1)}\ .\) For each \(n\ ,\) we fix a partial computable \((n+1)\)-place function that is universal for all \(n\)-place partial computable functions.

**Enumeration theorem for \(n\)-place partial computable functions.** Here \(z\) is the index of the universal function.
For each \(n\) there exists an index \(z\) such that for all \(i\) and
\( x_1 , \ldots ,x_n\ ,\)
if
\(\phi_i^{(n)} (x_1 , \ldots ,x_n )\) is defined, then
\(\phi_z^{(n+1)} (i, x_1 , \ldots ,x_n ) = \)
\(\phi_i^{(n)} (x_1 , \ldots ,x_n )\ ,\)
and \(\phi_z^{(n+1)} (i, x_1 , \ldots ,x_n )\) is undefined otherwise.
(\(\phi_z^{(n+1)}\) is a universal partial computable function
that enumerates the partial computable functions of \(n\)
variables.)

### Computably enumerable sets

A set \(A\) is
*computably enumerable*
if it is empty or the range
of some total computable function \(f\ .\)
We say that \(f\)
`enumerates'
\(A\ .\)
The intuition behind this definition is that there
is a Turing machine for listing
the elements of \(A\) in some arbitrary order with
repetitions allowed. An equivalent definition is that \(A\) is computably
enumerable if it is accepted by a Turing machine. That is,
for each element in \(A\ ,\) the Turing machine halts
in a distinguished accepting state, and for each element not in \(A\)
the machine either halts in a nonaccepting state or computes forever.

A set \(A\) is
*computable*
if it possesses a computable
characteristic function.
That is, \(A\) is computable iff
there exists a computable function \(f\) such that
for all \(x\ ,\) if \(x \in A\ ,\) then \(f(x) = 1\ ,\) and
if \(x \in \bar A\ ,\) then \(f(x) = 0\) (\(\bar A\) is the
complement of \(A\)). An equivalent definition is that
\(A\) is computable if \(A\) is accepted by a Turing machine
that always halts.
Obviously, all computable sets are computably enumerable.

The following sets are computable: (i) the set of odd integers; (ii) the set of natural numbers; (iii) the empty set; (iv) the set of primes; (v) every finite set; (vi) every set with a finite complement. The following sets are computably enumerable: (i) every computable set; (ii) the set of indices \(i\) such that the range of \(\phi_i\) is nonempty; (iii) the set \(\{ x:\) a run of at least \(x\) consecutive 0's occurs in \(\pi \}\ ,\) where \(\pi = 3.1415 \ldots .\)

**Theorem**

- A set \(A\) is computable iff both \(A\) and its complement \(\bar A\) are computably enumerable.
- An infinite set \(A\) is computable iff it is computably enumerable in increasing order. (Here we have postulated a total order on the elements of \(A\ .\) For instance, if \(A\) is a set of natural numbers with the usual order, then \(\phi\) enumerates \(A\) in increasing order if \(\phi (i) < \phi (i+1)\) for all \(i\ .\))
- Every infinite computably enumerable set contains an infinite computable subset.

The equivalent statements hold for computable and computably enumerable sets of \(n\)-tuples.
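The first claim of the theorem is effectively a dovetailing argument, which can be sketched in Python (names ours): run the two enumerations alternately until \(x\) turns up on one side, which it must, since \(x\) belongs to exactly one of \(A\) and \(\bar A\ .\)

```python
from itertools import count

def decide(x, enum_A, enum_complement):
    """Decide membership of x in A, given enumerators (generator functions)
    of A and of its complement: alternate the two enumerations until x
    appears in one of them.  Termination is guaranteed because x occurs
    in exactly one of the two lists."""
    a, b = enum_A(), enum_complement()
    while True:
        if next(a) == x:
            return True
        if next(b) == x:
            return False

# Toy instance: A = even naturals, complement = odd naturals.
evens = lambda: (2 * n for n in count())
odds = lambda: (2 * n + 1 for n in count())
```

Note that neither enumeration need be in increasing order or free of repetitions; the alternation alone suffices.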

### Undecidability of the Halting Problem

Turing's paper,
and more so Kurt Gödel's paper, where such a result
first appeared, are celebrated for showing that certain
well-defined questions in the mathematical domain
cannot be settled by
**any**
effective procedure for answering questions.
The following 'machine form' of this undecidability result
is due to Turing and Church: `which machine computations
eventually terminate with a definite result, and which
machine computations go on forever without a definite conclusion?'
This is sometimes called the
*halting problem*.

Since all machines can be simulated by the universal Turing machine \(U\ ,\)
this question cannot be decided in the case of the single machine \(U\ ,\)
or more generally for
any other
individual universal machine.
The following theorem
due to Turing in 1936,
formalizes this discussion.
Let \(\phi_1 , \phi_2 , \ldots\) be the standard enumeration of
partial computable
functions and write \(\phi(x) < \infty\) if \(\phi(x)\) is defined and write \(\phi(x) = \infty\) otherwise.
Define
\(K_0 = \{ \langle x, y\rangle : \phi_x (y) < \infty \}\) as the
*halting set*.

**Theorem**.
The halting set \(K_0\) is
not computable.

The trick used in the proof is called *diagonalization*: it shows that there is no computable function \(g\) such that for all \(x,y\ ,\) we have \(g(x,y) = 1\) if \(\phi_x (y)\) is defined, and \(g(x,y) = 0\) otherwise.
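The diagonal argument can be mimicked in Python (purely illustrative; `halts` is a hypothetical total predicate, which the theorem shows cannot exist): from any claimed halting decider one builds a program that does the opposite of whatever the decider predicts about it on its own description.

```python
def make_paradox(halts):
    """Given a claimed total halting predicate halts(program, input),
    return a program that contradicts it on its own description."""
    def paradox(x):
        if halts(x, x):   # if the decider says x halts on itself...
            while True:   # ...run forever
                pass
        return 0          # ...otherwise halt immediately
    return paradox

# Any concrete `halts` must be wrong somewhere.  For instance, the decider
# that always answers `no' makes paradox halt on every input, so it answers
# incorrectly about paradox applied to its own description.
paradox = make_paradox(lambda program, x: False)
```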

It is easy to see that \(K_0\)
is computably enumerable. The halting set is so ubiquitous
that it merits the standard notation \(K_0\ .\)
We shall also use
the
*diagonal halting set*
\(K = \{ x: \phi_x (x) < \infty \}\ .\)
Just like \(K_0\ ,\) the diagonal halting set is computably enumerable;
and \(K\) is also not a computable set.

The theorem of Turing on the incomputability of the halting set was preceded by (and was intended as an alternative way to show) the famous (first) incompleteness theorem of Kurt Gödel in 1931. Recall that a formal theory \(T\) consists of a set of well-formed formulas, formulas for short. For convenience these formulas are taken to be finite binary strings. Invariably, the formulas are specified in such a way that an effective procedure exists that decides which strings are formulas and which strings are not.

The formulas are the objects of interest of the
theory and constitute the meaningful
statements. With each theory we associate a set of *true* formulas and a set of *provable* formulas. The set of true formulas is `true' according to some (often nonconstructive) criterion of truth. The set of provable formulas is `provable' according to some (usually effective) syntactic notion of proof.

A theory
\(T\) is simply any set of
formulas. A theory is
*axiomatizable*
if
it can be effectively enumerated. For instance,
its axioms (initial formulas) can be effectively enumerated and
there is an effective procedure that enumerates all proofs
for formulas in \(T\) from the axioms. A theory is *decidable* if it is a computable set. A theory \(T\) is *consistent* if not both formula \(x\) and its negation \( \neg x\) are in \(T\ .\) A theory \(T\) is *sound* if each formula \(x\) in \(T\) is true (with respect to the standard model of the natural numbers). Hence, soundness implies consistency.
A particularly important example of an axiomatizable theory
is *Peano arithmetic*, which
axiomatizes the standard elementary theory of
the natural numbers.

**Theorem**
There is a computably enumerable set, say the set \(K_0\) defined above,
such that for every axiomatizable
theory \(T\) that is sound and extends
Peano arithmetic,
there is a number \(n\)
such that the formula `\(n \not\in K_0\)' is true but not provable
in \(T\ .\)

In his original proof, Gödel uses diagonalization to prove the incompleteness of any sufficiently rich logical theory \(T\) with a computably enumerable axiom system, such as Peano arithmetic. By his technique he exhibits for such a theory an explicit construction of an undecidable statement \(y\) that says of itself `I am unprovable in \(T\ .\)' The formulation in terms of computable function theory is due to A. Church and S.C. Kleene.

Turing's idea was to give a formal meaning to the notion of `giving a proof.' Intuitively, a proof is a sort of computation where every step follows (and follows logically) from the previous one, starting from the input. To put everything as broadly as possible, Turing analyses the notion of `computation' from an `input' to an `output' and uses this to give an alternative proof of Gödel's theorem.

### Semicomputable functions

The notion of computable functions can be extended from integer functions to real valued functions of rational arguments, and to weaker types of computability, using the framework explained above. We consider partial computable functions \(g( \langle \langle y, z \rangle,k \rangle ) = \langle p, q \rangle\) and write \(g(y/z,k) = p/q\ ,\) with \(y,z,p,q,k\) nonnegative integers. The extension to negative arguments and values is straightforward. The interpretation is that \(g\) is a rational-valued function of a rational argument and a nonnegative integer argument.

A partial function \(f\) from the rational numbers to the real numbers is
*upper semicomputable*
if it is defined by a rational-valued partial computable
function \(\phi (x,k)\) with \(x\) a rational number
and \(k\) a nonnegative integer
such that \(\phi(x,k+1) \leq \phi(x,k)\) for every \(k\) and
\(\lim_{k \rightarrow \infty} \phi (x,k)=f(x)\ .\)
This means
that \(f\) can be computably approximated from above.
A function \(f\) is *lower semicomputable* if \(-f\) is upper semicomputable. A function is called *semicomputable* if it is either upper semicomputable or lower semicomputable or both. If a function \(f\) is both upper semicomputable and lower semicomputable on its domain, then we call \(f\) *computable* (equivalent to computable if the domain is integer or rational). The total function versions are defined similarly.

Thus, a total real function \(f\) is
computable
iff
there is a total computable function \(g(x, k)\) such that
\(|f(x) - g(x, k)| < 1/k\ .\)
In this way, we extend the
notion of integer
computable functions to real-valued computable functions
with rational arguments, and to real-valued semicomputable functions
with rational arguments.
The idea is that a semicomputable function can be approximated
from one side by a computable function with rational values,
but we may never know how close we are
to the real value. A computable real function
can be approximated
to any degree of precision by a computable function with rational values.

A function \(f\) is lower semicomputable iff the set \(\{ (x, r): r \leq f(x), \; r \;\; {\rm rational} \}\) is computably enumerable. Therefore, a lower semicomputable function is `computably enumerable from below,' and an upper semicomputable function is `computably enumerable from above.'

Here is an example of a lower semicomputable function that is not computable. Let \(K = \{ x: \phi_x (x) < \infty \}\) be the diagonal halting set. Define \(f(x) = 1\) if \(x \in K\ ,\) and \(f(x) = 0\) otherwise. We first show that \(f(x)\) is lower semicomputable. Define \(g(x, k) = 1\) if the Turing machine computing \(\phi_x\) halts in at most \(k\) steps on input \(x\ ,\) and \(g(x, k) = 0\) otherwise. Obviously, \(g\) is a rational-valued computable function. Moreover, for all \(x\) and \(k\) we have \(g(x, k + 1) \geq g(x, k)\ ,\) and \(\lim_{{k} \to \infty} g(x, k) = f(x)\ .\) Hence, \(f\) is lower semicomputable. However, if \(f(x)\) were computable, then the set \(\{ x: f(x) = 1 \} \ ,\) that is, the diagonal halting set \(K\ ,\) would be computable. But we have seen above that it is not.
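The approximation \(g(x,k)\) can be sketched with Turing machines replaced by step-counted toy programs (Python; the representation of programs as generators is our illustration, one `yield` per step).

```python
def g(program, x, k):
    """1 if `program` halts on input x within k steps, else 0.
    A program is modelled as a generator function: each yield is one step."""
    run = program(x)
    for _ in range(k):
        try:
            next(run)
        except StopIteration:  # the program halted within the step budget
            return 1
    return 0

def halts_after_x(x):   # a program that halts after x steps
    for _ in range(x):
        yield

def loops_forever(x):   # a program that never halts
    while True:
        yield
```

For any fixed \(k\) the function is computable, and it is nondecreasing in \(k\) with limit \(f\); but no single \(k\) ever settles the non-halting cases, which is why the limit itself escapes computability.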

Prominent examples are the Kolmogorov complexity function that is upper semicomputable but not computable (and hence not lower semicomputable), and the universal algorithmic probability function that is lower semicomputable but not computable (and hence not upper semicomputable). These are the fundamental notions in M. Li and P. Vitanyi (2008) and, among others, A. Nies (2009).

## Theory of computation

Theoretically, every intuitively computable (effectively calculable) function is computable by a personal computer or by a Turing machine. But a computation that takes \(2^n\) steps on an input of length \(n\) would not be regarded as practical or feasible. No computer would ever finish such a computation in the lifetime of the universe even with \(n\) merely \(1000\ .\) For example, if we have \(10^9\) processors each taking \(10^9\) steps/second, then we can execute \(3.1 \times 10^{25} < 2^{100}\) steps/year. Computational complexity theory tries to identify problems that are feasibly computable.
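The arithmetic of the paragraph above checks out directly (Python):

```python
# 10^9 processors, each executing 10^9 steps per second, for one year:
steps_per_year = 10**9 * 10**9 * 3600 * 24 * 365

# About 3.15e25 steps, comfortably below 2^100 (approx. 1.27e30), so a
# 2^1000-step computation is out of reach by an astronomical margin.
years_needed = 2**1000 // steps_per_year
```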

In computational complexity theory, we are often concerned with languages. A
*language* over a finite
alphabet \(\Sigma\) is simply a subset of \(\Sigma^*\ .\)
We say that a Turing machine *accepts* a language \(L\) if it outputs 1 when the input is a member of \(L\) and outputs 0 otherwise. That is, the Turing machine computes a predicate.

Let \(T\) be a Turing machine.
For each input of length \(n\ ,\) if \(T\) makes at most \(t(n)\) moves
before it stops, then we say that \(T\) runs in time \(t(n)\ ,\) or has
*time complexity* \(t(n)\ .\)
If \(T\) uses at most \(s(n)\) tape cells in the above computation,
then we say that \(T\) uses \(s(n)\) space, or has *space complexity* \(s(n)\ .\)

For convenience, we often give the Turing machine in the figure above a few more work tapes and designate one tape as a read-only input tape. Thus, each transition rule will be of the form \((p,{\bar s},a,q)\ ,\) where \(\bar s\) contains the scanned symbols on all the tapes, and \(p,a,q\) are as above, except that an operation may now involve moving more than one head.

We sometimes also make a Turing machine *nondeterministic* by allowing two distinct transition rules to have identical first two components. That is, a nondeterministic Turing machine may have different alternative moves at each step. Such a machine *accepts* if at least one sequence of choices leads to acceptance. Turing machines are deterministic unless it is explicitly stated otherwise.

### Complexity classes

It is a fundamental and easy result that any
\(k\)-tape Turing machine running in
\(t(n)\) time can be simulated by a Turing machine with just one work
tape running in \(t^2(n)\) time.
(As an aside, it is more difficult to show that some multitape Turing machines require \(t^2(n)\) time when they are simulated by a one-work-tape Turing machine, which is a result by M. Li, W. Maass, and P.M.B. Vitanyi (1985, 1988). Every multitape Turing machine can be simulated by a Turing machine with two work tapes in time \(O(t(n) \log t(n))\) by a result of F.C. Hennie and R.E. Stearns (1966) and this is optimal. It is also possible to put multiple heads on the same work tape. By a result of T. Jiang, J. Seiferas and P.M.B. Vitanyi, two heads on the same work tape are more powerful than two work tapes with one head per tape.)
Every Turing machine using
\(s(n)\) space can be simulated by a Turing machine with just one work
tape using \(s(n)\) space. For each \(k\ ,\) if a language is
accepted by a \(k\)-tape Turing machine running in
time \(t(n)\) (space \(s(n)\)), then it can also be accepted
by another \(k\)-tape Turing machine running in time \(ct(n)\)
(space \(cs(n)\)), for every constant \(c>0\) (by just increasing the number of tape symbols, so this is not a `deep' theorem but a trivial one).
This leads to the following definitions of *complexity classes* (see M.R. Garey and D.S. Johnson):
- \({\rm DTIME}[t(n)]\) is the set of languages accepted by multitape deterministic Turing machines in time \(O(t(n))\ ;\)
- \({\rm NTIME}[t(n)]\) is the set of languages accepted by multitape nondeterministic Turing machines in time \(O(t(n))\ ;\)
- \({\rm DSPACE}[s(n)]\) is the set of languages accepted by multitape deterministic Turing machines in \(O(s(n))\) space;
- \({\rm NSPACE}[s(n)]\) is the set of languages accepted by multitape nondeterministic Turing machines in \(O(s(n))\) space.

With \(c\) running through the natural numbers:

- P is the complexity class \(\bigcup_{c } {\rm DTIME}[n^c]\ ;\)
- NP is the complexity class \(\bigcup_{c } {\rm NTIME}[n^c]\ ;\)
- PSPACE is the complexity class \( \bigcup_{c } {\rm DSPACE}[n^c]\ .\)

Languages in P, that is, languages acceptable in polynomial time, are considered feasibly computable. The nondeterministic version of PSPACE turns out to be identical to PSPACE by Savitch's Theorem (due to W.J. Savitch in 1970), which states that \({\rm NSPACE}[s(n)] \subseteq {\rm DSPACE}[(s(n))^2]\ .\) The following relationships hold trivially: \({\rm P} \subseteq {\rm NP} \subseteq {\rm PSPACE}\ .\) It is one of the most fundamental open questions in computer science and mathematics to prove whether either of the above inclusions is proper. Research in computational complexity theory focuses on these questions. In order to solve these problems, one can identify the hardest problems in NP or PSPACE.

### P versus NP

A Turing machine \(T\) with an *oracle*
\(A\ ,\) where \(A\) is a language over \(T\)'s work tape
alphabet, is denoted by \(T^A\ .\)
Such a machine operates as a
normal Turing machine with the exception that after it has
computed a finite string \(x\) it can
enter a special oracle state and ask whether \(x \in A\ .\)
The machine \(T^A\) gets the correct `yes/no' answer in one step.
An oracle machine can use this feature one or more times during
each computation.

A language \(A\) is called *polynomial-time Turing-reducible* to a
language
\(B\ ,\) denoted by \(A \leq_T^P B\ ,\)
if given \(B\) as an oracle, there is a
deterministic Turing machine that
accepts \(A\) in polynomial time. That is, we can accept \(A\) in
polynomial time given answers to membership in \(B\) for free.

A language \(A\) is called *polynomial-time many-to-one reducible* to a
language
\(B\ ,\) denoted by \(A {\leq}_m^P B\ ,\) if there is a function \(r\) that is
polynomial-time computable, and for every \(a\ ,\) \(a \in A\) iff \(r(a) \in B\ .\)
In both cases, if \(B \in {\rm P}\ ,\) then so is \(A\ .\)
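A toy illustration of a many-to-one reduction (the two languages and the map \(r\) are our example, not from the article): let \(A\) be the binary strings with an even number of 1s and \(B\) the even natural numbers; then \(r(a) = \) the number of 1s in \(a\) is polynomial-time computable, and \(a \in A\) iff \(r(a) \in B\ .\)

```python
def r(a: str) -> int:
    """The reduction: map a string to its number of 1s."""
    return a.count("1")

def in_B(n: int) -> bool:
    """A decider for B, the even natural numbers."""
    return n % 2 == 0

def in_A(a: str) -> bool:
    """Deciding A via the reduction, as in the text: a in A iff r(a) in B."""
    return in_B(r(a))
```

A polynomial-time decider for \(B\) thus yields one for \(A\) at the cost of computing \(r\) once per query.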

A language \(A\) is * NP-hard* if all languages in NP
are Turing polynomial-time (equivalently, many-to-one
polynomial-time) reducible to \(A\ .\)
Consequently, if any NP-hard language is in P, then \({\rm P = NP}\ .\)
If \(A\) is NP-hard and \(A \in {\rm NP}\ ,\) then we say that \(A\) is *NP-complete*.

"NP is the set of problems for which it is easy to show (give a certificate) that the answer is `yes,' and P is the set of `yes/no' problems for which it is easy to find the answer. The technical sense of `easy' is `doable by a deterministic Turing machine in polynomial time.' The `P versus NP' question can be understood as whether problems for which it is easy to certify the answer are the same problems for which it is easy to find the answer. The relevance is this:

Normally, we do not ask questions unless we can recognize easily in a certain sense when we have been handed the correct answer. We are not normally interested in questions for which it would take a lifetime of work to check whether you got the answer you wanted. NP is about those questions that we are likely to want answers to."

This excellent explanation was given by one of the inventors of the notions P and NP, J.E. Edmonds in an interview in FAUW Forum, University of Waterloo, January 1993.

### SAT

The most famous example of an NP-complete problem is the following.
A Boolean formula is in conjunctive normal form
if it is a conjunction of disjunctions. For example,
\[
f(x_1, x_2, x_3) = (x_1 + {\bar x}_2 + x_3)({\bar x}_2 + x_3)(x_1 + x_3 )
\]
is in conjunctive normal form, and \(x_1 x_2 + x_2 {\bar x}_3\) is
not in conjunctive normal form. A Boolean formula
\(f(x_1, \ldots , x_n)\) is *satisfiable* if
there is a Boolean-valued truth assignment \(a_1, \ldots , a_n\)
such that \(f(a_1, \ldots ,a_n)=1\ .\)
Let SAT be the set of satisfiable Boolean formulas in conjunctive
normal form. The **SAT problem** is to decide whether a given Boolean formula is in SAT.

This problem was the first natural problem shown to be NP-complete. Many practical issues seem to depend on fast solutions to this problem. Given a Boolean formula, a nondeterministic Turing machine can guess a correct truth assignment and verify it; this takes only linear time. However, if we have to deterministically search for a satisfying truth assignment, there are \(2^n\) Boolean vectors to test.

Intuitively, and as far as is known now, a deterministic Turing machine cannot do much better than search through these Boolean vectors one by one, using an exponential amount of time. The Bible of this area is M.R. Garey and D.S. Johnson (1979).

## Importance of the Turing machine

In the last three-quarters of a century the Turing machine model has proven to be of priceless value for the development of the science of data processing. All theory development reaches back to this format. The model has become so dominant that new models that are not polynomial-time reducible to Turing machines are viewed as not realistic (the so-called *polynomial-time computability thesis*). Without explaining terms, the `random access machine' (RAM), which is a more precise model for current computers, is viewed as realistic, while the `parallel random access machine' (PRAM) is not so viewed. New notions, such as the randomized computations of R. Motwani and P. Raghavan (like the fast primality tests used in Internet cryptographic protocols), are analysed using `probabilistic' Turing machines. In 1980 the Nobelist Richard Feynman proposed a `quantum computer', in effect an `analogue' version of a quantum system. Contrary to digital computers (classical, quantum, or otherwise), an analogue computer works with continuous variables and simulates the system we want to solve directly: for example, a wind tunnel with a model aircraft simulates the airflow, and in particular the nonlaminar turbulence, of the intended actual aircraft. In practice, analogue computers have worked only for special problems. In contrast, the digital computer, where everything is expressed in bits, has proven to be universally applicable. Feynman's innovative idea was without issue until D. Deutsch put the proposal in the form of a quantum Turing machine, that is, a **digital** quantum computer. This digital development exploded the area of `quantum computing', both theoretically and in applications.

## References

- D. Deutsch, Quantum theory, the Church-Turing principle and the universal quantum computer, *Proceedings of the Royal Society of London A*, 400(1985), 97-117.
- M.R. Garey and D.S. Johnson, *Computers and Intractability: A Guide to the Theory of NP-Completeness*, W.H. Freeman, 1979.
- K. Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I, *Monatshefte für Mathematik und Physik*, 38(1931), 173-198.
- S.C. Kleene, *Introduction to Metamathematics*, Van Nostrand, New York, 1952.
- M. Li and P. Vitanyi, *An Introduction to Kolmogorov Complexity and Its Applications*, Springer-Verlag, New York, Third edition, 2008.
- M. Minsky, *Computation: Finite and Infinite Machines*, Prentice-Hall, Englewood Cliffs, NJ, 1967.
- A. Nies, *Computability and Randomness*, Oxford Univ. Press, USA, 2009.
- R. Motwani and P. Raghavan, *Randomized Algorithms*, Cambridge Univ. Press, 1995.
- A.M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem, *Proceedings of the London Mathematical Society*, 2, 42: 230-265, 1936, "Correction", 43: 544-546, 1937.

**Internal references**

- Marcus Hutter, Shane Legg, Paul M.B. Vitanyi (2007) Algorithmic probability. Scholarpedia, 2(8):2572.

- Olaf Sporns (2007) Complexity. Scholarpedia, 2(10):1623.

- James Murdock (2006) Normal forms. Scholarpedia, 1(10):1902.