# Symbolic dynamics

Post-publication activity

Curator: Susan Williams

Symbolic Dynamics is the study of shift spaces, which consist of infinite or bi-infinite sequences defined by a shift-invariant constraint on the finite-length sub-words. Mappings between two such spaces can be regarded as codes or encodings. Shift spaces are classified, up to various kinds of invertible encodings, by combinatorial, algebraic, topological and measure-theoretic invariants. The subject is intimately related to dynamical systems, ergodic theory, automata theory and information theory.

## Origins

In broad terms, an invertible dynamical system is a set $$X\ ,$$ together with an invertible mapping $$T:X \rightarrow X\ .$$ The orbit of $$x \in X$$ is the trajectory $$\{ \ldots, T^{-2}(x),T^{-1}(x), x, T(x), T^2(x), \ldots \}$$ (here, $$T^2(x) = T \circ T(x)\ ,$$ etc.). Symbolic dynamics evolved as a tool for analyzing dynamical systems by discretizing space. Imagine a point following a trajectory in a space. Partition the space into finitely many pieces, each labeled by a different symbol. A symbolic orbit is obtained by writing down the sequence of symbols corresponding to the successive partition elements visited by the point in its orbit. One can learn much about the dynamics of the system by studying its symbolic orbits.

For instance, consider the dynamical system in Figure 1. Here, $$X$$ is the unit square, and $$T$$ is a transformation of $$X\ .$$ A portion of the orbit of a point $$x \in X$$ is drawn. $$X$$ is partitioned into two cells: the left half, labelled 0', and the right half, labelled 1'. Then the point $$x$$ is represented by the bi-infinite sequence $$s(x) = \ldots s_{-2} s_{-1}.s_0 s_1 s_2\ldots$$ where $$s_n$$ is the label of the cell to which $$T^n(x)$$ belongs; the decimal point separates coordinates $$s_i, i < 0$$ from $$s_i, i \ge 0\ .$$ For $$x$$ as given in Figure 1, $$s_0=0$$ because $$x$$ belongs to the left half of the square, $$s_2 =1$$ because $$T^2(x)$$ belongs to the right half of the square, and so on: $s(x)= \ldots 1 1. 0 0 1 \ldots$ Now, if $$x$$ is represented by the sequence $$s(x)\ ,$$ then $$T(x)$$ is represented by the left coordinate shift of $$s(x)\ :$$ $s(T(x))= \ldots 1 1 0.0 1 \ldots$ Thus $$T$$ is represented symbolically' as the shift transformation on a space of sequences.

The picture above depicts part of an orbit of the Baker's transformation: $T(x,y) = \left(2x \mbox { mod } 1, \frac{y + \lfloor x/2 \rfloor}{2}\right).$ It is convenient to consider the cells as closed sets that overlap on a common boundary, so that a point that lands on the boundary has non-unique symbolic orbit; in this example, the cells overlap on a central horizontal line (so, the cells may not literally form a partition). With this understanding, all binary sequences are obtained as symbolic orbits. In general, given $$T\ ,$$ we seek a partition for which the set of symbolic orbits is constrained by some explicit rule. For example, Markov partitions for hyperbolic toral automorphisms give rise to shifts of finite type, defined below (Adler and Weiss )).

For invertible dynamical systems, as above, both forward and backward iterations are considered. Symbolic dynamics applies to non-invertible dynamical systems as well, although in this case one considers only forward iterations, and the symbolic trajectories are one-sided infinite sequences. For the mapping $$x \mapsto \lambda x \mbox{ mod }1$$ on the unit interval, the set of one-sided sequences so obtained depends heavily on $$\lambda$$ and the partition chosen. For instance, if $$\lambda =2$$ and the partition is $$\{[0,1/2], [1/2,1]\}\ ,$$ then one obtains all one-sided binary sequences; if $$\lambda$$ is the golden mean and the partition is $$\{[0,1/\lambda], [1/\lambda,1]\}\ ,$$ then one obtains all one-sided sequences that do not contain two consecutive 1's.

In this way, one uses symbolic dynamics to study dynamical systems. Properties of orbits of the original dynamical system are reflected in properties of the resulting sequences. For instance, a point whose orbit is periodic gives a periodic sequence, and the distribution of the orbit of a point $$x$$ in $$X$$ is reflected in the distribution of finite contiguous substrings of $$s(x)\ .$$ Symbolic dynamics can also be used in classifying dynamical systems; here, the problem of determining when one dynamical system is "equivalent" to another becomes, via symbolic dynamics, a coding problem.

Hadamard  is generally credited with the first use of symbolic dynamics techniques in his analysis of geodesic flows on surfaces of negative curvature. Forty years later the subject received its first systematic studies, and its name, in the foundational papers of Morse and Hedlund [1938, 1940] and Hedlund . Here for the first time symbolic systems are treated in the abstract, as objects in their own right. This abstract study was motivated both by the intrinsic mathematical interest of symbolic systems and the need to better understand them in order to apply symbolic techniques to continuous systems. Further impetus was given by the emergence of information theory and the mathematical theory of communication pioneered by Shannon , and the global theory of dynamical systems (Smale ). Symbolic dynamics has continued to find application to an ever-widening array of areas within dynamical systems (such as hyperbolic and partially hyperbolic diffeomorphisms and flows, maps of the interval, billiards, and complex dynamics) and outside of dynamical systems (such as information theory, automata theory and matrix theory).

## Full shifts and shift spaces

The (two-sided) full $$\mathcal A$$-shift over a finite alphabet $$\mathcal A$$ is the dynamical system consisting of the set of all bi-infinite symbol sequences, together with the shift map $$\sigma$$ that shifts all sequences to the left. More formally, the full shift is ${\mathcal A}^{ Z}=\{x=(x_i)_{i\in{ Z}}: x_i\in {\mathcal A} \mbox{ for all } i\in { Z} \}$

and the map $$\sigma:{\mathcal A}^{ Z}\to {\mathcal A}^{ Z}$$ is defined by $$(\sigma x)_i=x_{i+1}\ .$$ If $${\mathcal A}=\{0,1,\dots ,r-1\}\ ,$$ $${\mathcal A}^{ Z}$$ is called the full $$r$$-shift. A finite string (also called word or block) of symbols $$x_m \ldots x_n$$ is often denoted $$x_{[m,n]}\ .$$

The advantage of working with bi-infinite sequences is that the shift map is invertible. However, one may also consider the one-sided $$\mathcal A$$-shift $${\mathcal A}^{ N}\ .$$ For simplicity, concepts in this article are described in the two-sided context. Most of the definitions adapt naturally to the one-sided setting, although there are some significant differences in results.

The full shift is endowed with the product topology on $${\mathcal A}^{ Z}\ .$$ In this topology, two points $$x,y \in {\mathcal A}^{ Z}$$ are regarded as "close" if they agree on a large central block: i.e, $$x_{[-n,n]}= y_{[-n,n]}$$ for some large $$n\ .$$ The specific choice of metric is not important, so long as it reflects this notion of "closeness"". Note that the shift map $$\sigma$$ and its inverse are continuous: if $$x$$ and $$y$$ agree on their central $$2n+1$$ coordinates, then $$\sigma x$$ and $$\sigma y$$ agree at least on their central $$2n-1$$ coordinates.

A shift space (or shift or subshift ) is a closed, shift-invariant subset of a full shift. Equivalently: let $$\mathcal F$$ be any set (finite or infinite) of blocks; the set $$X = X_\mathcal F$$ of sequences that do not contain any element of $$\mathcal F$$ is a shift space, and any shift space can be expressed in this form. Below are some simple examples.

 golden mean shift
$$X$$ is the set of all binary sequences with no two consecutive 1's. Here $$X=X_{\mathcal F}\ ,$$ where $$\mathcal F=\{11\}\ .$$

 even shift
$$X$$ is the set of all binary sequences so that between any two 1's there are an even number of 0's. One can take for $$\mathcal F$$ the collection
$$\{10^{2n+1}1:n\ge0\}.$$

 prime shift
$$X$$ is the set of all binary sequences such that between any two successive 1's,
the number of 0's is prime. One can take for $$\mathcal F$$ the collection
$$\{10^n1:n \mbox{ is not prime } \}.$$


Let $$X$$ be a subset of a full shift, and let $${\mathcal B}_N(X)$$ denote the set of all blocks of length $$N$$ that occur in elements of $$X\ .$$ The language of $$X$$ is $${\mathcal B}(X)= \cup_N {\mathcal B}_N(X) \ .$$ It can be shown that the language of a shift space determines the shift space uniquely, and so one can equally well describe a shift space by specifying the "occurring" or "allowed" blocks, rather than the forbidden blocks. For example, the golden mean shift is specified by the language of blocks in which 1's are isolated. Another prominent example is:

 Morse shift
Let $$A_0 =0$$ and inductively define blocks
$$A_{n+1} = A_n \overline{A_n}\ ,$$ where $$\overline{A_n}$$ denotes bitwise
complement. The shift space whose allowed blocks are those blocks contained
in some $$A_n$$ is called the  Morse shift.


Given a shift space $$X$$ and a positive integer $$N\ ,$$ one can construct two associated shift spaces defined over the alphabet $${\mathcal B}_N(X)\ .$$ The higher power shift $$X^N$$ is the shift space over $${\mathcal B}_N(X)$$ obtained by regarding each element of $$X$$ as a sequence of

 non-overlapping  blocks of length $$N\ :$$


$X^{N} = \{ \ldots x_{[-N, -1]} x_{[0, N-1]} x_{[N, 2N-1]} \ldots :~ x \in X \}$

The higher block shift $$X^{[N]}$$ is the shift space over $${\mathcal B}_N(X)$$ obtained by regarding each element of $$X$$ as a sequence of overlapping blocks: $X^{[N]} = \{ \ldots x_{[-1, N-2]} x_{[0, N-1]} x_{[1, N]} \ldots :~ x \in X \}$

## Coding and conjugacy

Let $$x=\dots x_{-1}.x_0x_1\dots \in X\ ,$$ a shift space over $$\mathcal A \ .$$ One can transform $$x$$ into a new sequence $$y=\dots y_{-1}.y_0y_1\dots$$ over another alphabet $$\mathcal C$$ as follows. Fix integers $$m$$ and $$n$$ with $$-m\le n\ .$$ To compute the $$i$$th coordinate $$y_i$$ of the transformed sequence, use a function $$\Phi$$ that depends on the "window" of coordinates of $$x$$ from position $$i-m$$ to position $$i+n\ .$$ Here $$\Phi \colon {\mathcal B}_{m+n+1}(X)\to \mathcal C$$ is a fixed block map, from allowed $$(m+n+1)$$-blocks in $$X$$ to symbols in $$\mathcal C\ ,$$ and so $\tag{1} y_i= \Phi (x_{i-m} x_{i-m+1} \ldots x_{i+n}) = \Phi ( x_{< i-m,i+n >} ).$

The map $$\phi\colon X\to {\mathcal C}^Z$$ defined by $$y=\phi(x)\ ,$$ with $$y_i$$ given by $$\Phi$$ above, is called the sliding block code with memory $$m$$ and anticipation $$n$$ induced by $$\Phi\ .$$ This is illustrated in Figure 2, with $$m=1$$ and $$n=2\ .$$

If $$Y$$ is a shift space contained in $$\mathcal C^{Z}$$ and $$\phi(X)\subseteq Y\ ,$$ one may also write $$\phi\colon X\to Y\ .$$

Any sliding block code can be "recoded" to a 1-block code (i.e., window length is 1) by replacing the domain shift $$X$$ with its higher block shift $$X^{[n+m+1]}\ .$$

In analogy with the characterization of shift spaces as closed shift-invariant sets, sliding block codes can be characterized in a topological manner: namely, as the maps between shift spaces that are continuous and commute with the shift. This result is known as the Curtis-Hedlund-Lyndon Theorem (Hedlund ).

The shift map and its inverse are the simplest examples of sliding block codes. Below are some other simple examples.

2 to 1
Let $$\mathcal A =\{0,1\}={\mathcal C}^Z\ ,$$ $$X=\mathcal A^{'''Z'''}\ ,$$ $$m=0\ ,$$ $$n=1\ ,$$
and $$\Phi(a_0a_1)=a_0+a_1 \pmod2\ .$$ The induced sliding block code $$\phi$$ is a two-to-one map from $$X$$ to itself.

golden to even
The sliding block code induced by $$\Phi(00) =1, \Phi(01) = 0 = \Phi(10)$$ (with $$m+n=1$$) maps the golden mean shift onto the even shift.

trivial
There is a trivial sliding block code from the full 2-shift into the full 3-shift, induced by $$\Phi(0) =0, \Phi(1) = 1$$ with $$m=n=0\ .$$


If a sliding block code $$\phi\colon X\to Y$$ is onto, then it is called a factor code or factor map, and $$Y$$ is a factor of $$X\ .$$ If $$\phi$$ is one-to-one, then it is called an embedding of $$X$$ into $$Y\ .$$ If it is one-to-one and onto, then it is called a (topological) conjugacy (this is equivalent to the condition that it has an inverse which is a sliding block code). Example trivial is an embedding but not a factor code, while Examples 2 to 1 and golden to even are factor codes, but not embeddings. Any power of the shift map is a conjugacy from a shift space to itself, but a typical conjugacy is much more complicated.

Given shift spaces $$X$$ and $$Y\ ,$$ one would like to know whether $$X$$ can be factored onto $$Y\ ,$$ or embedded into $$Y\ ,$$ or is conjugate to $$Y\ .$$ These problems, in particular for SFT's and sofic shifts (introduced in the next section), have been central to symbolic dynamics.

## Shifts of finite type and sofic shifts

A shift of finite type (SFT) is a shift space that can be described by a finite list of forbidden blocks. Equivalently, an SFT can be described in terms of allowed blocks as follows: for some integer $$M\ ,$$ whenever $$u$$ is an allowed block of length at least $$M\ ,$$ $$u'$$ is the suffix of $$u$$ of length $$M$$ and $$a$$ is a symbol, then $$ua$$ is allowed if and only if $$u'a$$ is allowed. In other words, in order to tell whether a symbol can be allowably concatenated to the end of an allowed word $$u\ ,$$ one need only look at the last $$M$$ symbols of $$u\ .$$ Such an SFT is called an $$M$$-step SFT, in analogy with the "finite memory" property of $$M$$-step Markov chains. One-step SFT's are also called topological Markov chains.

The golden mean shift $$X$$ is a 1-step SFT, since it is only the last symbol of an allowed block that determines whether a given symbol can be concatenated at the end. In contrast, the even shift is not an SFT: for any $$M\ ,$$ the symbol 1 can be concatenated to the end of exactly one of the (allowed) blocks $$10^M$$ and $$10^{M+1}\ .$$ Neither the prime shift nor the Morse shift is an SFT.

Any $$M$$-step SFT $$X$$ is conjugate to a 1-step SFT. To describe a conjugacy, we take our new alphabet $$\mathcal C$$ to be the set of allowed $$M$$-blocks $$u$$ of $$X\ ,$$ and let $$\Phi$$ be the block map that takes an $$M$$-block of $$X$$ to the corresponding symbol of $${\mathcal C}\ .$$ It is not hard to see that the induced sliding block code is a one-to-one factor map, and that its image is a 1-step SFT.

A 1-step SFT can be described concretely by a finite directed graph (or simply graph), which consists of sets $${\mathcal V}$$ of vertices (or states) and $${\mathcal E}$$ of edges; each edge has an initial vertex and terminal vertex. It is usually assumed that every vertex has at least one outgoing and at least one incoming edge. This assumption results in no loss of generality for the purposes of symbolic dynamics. To represent a given SFT $$X\ ,$$ we let $${\mathcal V}$$ be the alphabet of $$X\ ;$$ for each allowed 2-block $$ab$$ of $$X$$ there is an edge with initial vertex $$a$$ and terminal vertex $$b\ .$$

For example, the golden mean shift is the set of all sequences obtained by reading off the vertex sequence in paths that traverse the graph shown in Figure 3. More generally, given any graph, the set of all bi-infinite vertex sequences obtained by following valid paths forms an SFT. Such an SFT is called a vertex shift. %Vertex shifts are really the same as the 1-step SFT's, and up to conjugacy, all SFT's can be realized as vertex shifts.

It is sometimes useful to use the edges, rather than vertices, as symbols; the sequences are sequences of edges obtained by following valid paths. Such an SFT is called an edge shift. The use of edges rather than vertices often permits the use of smaller graphs, at the expense of allowing multiple edges all with the same starting vertex and same ending vertex. If $$X$$ and $$Y$$ are the vertex and edge shifts based on a graph $$G$$ with at most one edge from any vertex to any other vertex, then the sliding block code $$\phi: X \rightarrow Y$$ defined by the map $$\Phi(IJ)=e\ ,$$ where $$e$$ is an edge from $$I$$ to $$J\ ,$$ is a conjugacy.

A shift space $$X$$ is {\bf irreducible} if whenever $$u$$ and $$w$$ are allowed blocks, there is a `connecting block $$v$$ such that $$uvw$$ is allowed. Equivalently, there is point $$x\in X$$ whose forward orbit $$\{x, \sigma(x), \sigma^2(x), \ldots, \}$$ is dense in $$X\ .$$ Irreducible SFT's are easier to understand than general SFT's, and one can study a general SFT by studying its maximal irreducible subshifts (called irreducible components).

A sofic shift is a shift space that is a factor of an SFT (Weiss ). Equivalently, given a graph $$G$$ with each edge (or vertex) labelled by a symbol from an alphabet, the set of bi-infinite label sequences of paths in $$G$$ is a sofic shift, and every sofic shift is conjugate to one presented in this way. While the even shift is not an SFT, it is a sofic shift, as presented in Figure 4. Note that here distinct edges (or vertices) are allowed to have the same label.

Sofic shifts can be regarded as "finite-state systems": in a labelled graph, vertices can be viewed as state information which connects sequences in the past with sequences in the future. SFT's are special kinds of sofic shifts; any $$M$$-step SFT can be presented by an edge-labelled graph whose states are the allowed $$M$$-blocks. A version of the pumping lemma from automata theory shows that the prime shift is not sofic (Aho, Hopcroft, and Ullman ). And it is not hard to see that the Morse shift is not sofic.

A labelled graph that presents a sofic shift is called a presentation. It is often easier to work with presentations which are right resolving, meaning that at any given state, all outgoing edges have distinct labels (as inFigure 4). In fact, every sofic shift can be presented in such a way, and every irreducible sofic shift has a unique minimal right resolving presentation. These results are closely connected to results in automata theory (Béal , Berstel and Perrin ).

SFT's and sofic shifts are useful in modeling dynamical systems and applications to data recording. However, these classes barely scratch the surface of the range of behaviours exhibited by general shift spaces. Of particular interest are minimal shift spaces, those in which every point has a dense forward orbit. This is typified by the Morse shift. Except for trivial cases, SFT's and sofic shifts are not minimal.

## Invariants of conjugacy and variants of the conjugacy problem

The conjugacy problem is the problem of determining when two given shift spaces are conjugate. An invariant of conjugacy is any property or object associated to a shift space that is preserved under conjugacy. Irreducibility is an invariant, as are the properties of being a shift of finite type or sofic shift.

The conjugacy problem for SFT's reduces to the conjugacy problem for vertex shifts (or edge shifts) since every SFT is conjugate to such a shift. Invariants for SFT's and sofic shifts are naturally described via graphs. For instance, a vertex shift or edge shift is irreducible if and only if its underlying graph is irreducible, meaning that for every ordered pair of vertices $$I$$ and $$J$$ there is a path in~$$G$$ from $$I$$ to $$J\ .$$

Given a graph $$G\ ,$$ the adjacency matrix $$A=A(G)$$ is a matrix indexed by the vertices $${\mathcal V}\ ;$$ the entry $$A_{IJ}$$ is the number of edges in $$G$$ from $$I$$ to $$J\ .$$ Since a graph and its adjacency matrix essentially determine the same information, it is customary to pass freely back and forth between a graph and its adjacency matrix when specifying a vertex shift or edge shift.

Among the most important invariants is topological entropy, defined as: $h(X)=\lim_{N\to\infty} \frac{\log|{\mathcal B}_N(X)|}{N};$ here, $$|\cdot|$$ denotes cardinality.

It should be evident that $$h(X)$$ is a measure of the "size" or "complexity" of $$X\ ,$$ as it is simply the asymptotic growth rate of the number of blocks that occur in $$X\ .$$ Roughly speaking, $$h(X)$$ is the number $$h$$ such that $$|{\mathcal B}_N(X)| \sim 2^{Nh}\ ,$$ if the $$\log$$ is taken to base 2. Topological entropy for continuous dynamical systems, in particular for shift spaces, was introduced by Adler, Konheim and McAndrew . Entropy is an invariant of conjugacy, and it cannot increase under factors and cannot decrease under embeddings.

In many cases, one can explicitly compute entropy. For the full $$r$$-shift $$X\ ,$$ $$|{\mathcal B}_N(X)|=r^N\ ,$$ and so $$h(X)=\log r\ .$$ If $$X$$ is a vertex shift or edge shift based on a graph $$G\ ,$$ $$h(X) = \log\lambda_{A(G)}\ ,$$ where $$\lambda_{A(G)}$$ is the largest eigenvalue of $$A(G)\ .$$ This result relies on Perron-Frobenius Theory (Seneta ).

From Figure 3, one sees that the golden mean shift is the vertex shift of the graph with adjacency matrix: $A = \left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right]$ A simple computation shows that $$\lambda_A$$ is the golden mean, and so the entropy of the golden mean shift is the $$\log$$ of the golden mean.

Given a right resolving presentation based on a graph $$G$$ of a sofic shift $$X\ ,$$ it can be shown that $$h(X)$$ is the same as the entropy of the vertex shift or edge shift of $$G\ .$$ This gives a means of computing the entropy of a sofic shift. From Figure 4, one sees that the entropy of the even shift is also the $$\log$$ of the golden mean.

One cannot overstate the importance of entropy as an invariant. Yet it is somewhat crude; it is perhaps not surprising that a single numerical invariant would not suffice to completely capture the many intricacies of shift spaces, even of sofic shifts or SFT's. Another invariant, which is finer in many cases, is based on periodic sequences, as follows.

For a shift space ~$$X\ ,$$ let $$p_n(X)$$ denote the number of points in $$X$$ of period $$n$$ (i.e., the number of $$x \in X$$ such that $$\sigma^n(x) = x$$). It is straightforward to show that each $$p_n(X)$$ is an invariant. Since distinct periodic sequences define distinct blocks, it follows that $\limsup_{n\to\infty}\,\frac1n\log p_n(X)\le h(X).$ The inequality can be strict. In fact, there are shift spaces with positive entropy but no periodic points at all. (An example is the Cartesian product of the Morse shift with the full 2-shift, which can be regarded as a shift with alphabet $$\{0,1\}^2\ .$$) However, for SFT's and sofic shifts, the entropy $$h(X)$$ can be recovered from the sequence $$p_n(X)\ :$$ $\limsup_{n\to\infty}\frac1n\log p_n(X)=h(X).$

The periodic point information can be conveniently combined into a single invariant, known as the zeta function. For a shift space $$X\ ,$$ $\zeta_X(t) =\exp\Bigl(\sum_{n=1}^\infty \frac{p_n(X)}n \,t^n\Bigr).$ For an edge shift or vertex shift based on a graph with adjacency matrix $$A\ ,$$ the zeta function is the reciprocal of a polynomial: $\zeta_{X}(t)=\frac1{t^r\,\chi_A(t^{-1})} =\frac1{\det(I-tA)},$ (here, $$\chi_A$$ is the characteristic polynomial) which is completely determined by the non-zero eigenvalues (with multiplicity) of $$A$$ (Bowen and Lanford ). As an example, the zeta function of the golden mean shift is $$\frac1{1 - t - t^2}\ .$$ For a sofic shift, the zeta function turns out to be a rational function, i.e., quotient of two polynomials. This can be shown by analyzing properties of a right resolving presentation of the sofic shift. From this it turns out that the zeta function of the even shift is $$\frac{1-t}{1 - t - t^2}\ .$$ The technique for computing zeta functions of sofic shifts was developed by Manning  and Bowen . So, for sofic shifts (in particular, SFT's) all of the periodic point information is determined by a finite collection of complex numbers, namely the zeros and poles of the zeta function.

Two nonnegative square integral matrices, $$A$$ and $$B\ ,$$ are said to be shift equivalent if there is a nonnegative integer $$\ell$$ and a pair $$(R,S)$$ of rectangular nonnegative integral matrices satisfying: $AR=RB, SA=BS, A^\ell=RS, B^\ell=SR.$ R. F. Williams  introduced shift equivalence and showed that the shift equivalence class is an invariant for SFT's: if two vertex shifts are conjugate then the associated adjacency matrices are shift equivalent. While this is not a complete invariant (Kim and Roush , Wagoner ), it encapsulates many very strong invariants that go well beyond entropy and the zeta function. It also provides a means of constructing conjugacies via a concrete graph-theoretic operation called state splitting (seeFigure 5, where the graph on the right is obtained by splitting states of the graph on the left). However, the conjugacy problem for SFT's and sofic shifts remains very much open.

There has been much progress on variants of the conjugacy problem. For example, SFT's $$X$$ and $$Y$$ are said to be eventually conjugate if for sufficiently large $$N$$ and all $$n \ge N\ ,$$ the higher power shifts $$X^n$$ and $$Y^n$$ are conjugate. Irreducible SFT's are completely classified by shift equivalence up to eventual conjugacy (Kim and Roush [1979, 1988]). Two SFT's $$X$$ and $$Y$$ are said to be almost conjugate if there is a third SFT $$Z\ ,$$ and factor codes from $$Z$$ to $$X$$ and from $$Z$$ to $$Y$$ that are almost one-to-one, that is, one-to-one on an open dense set. Entropy, together with a very mild condition on periodic points, completely classifies irreducible SFT's up to almost topological conjugacy (Adler and Marcus ).

There has also been much progress on the factor and embedding problems for SFT's and sofic shifts. Some of these are described in Boyle, Marcus and Trow . Of particular interest is the Embedding Theorem (Krieger ) which solved the problem for proper embeddings of irreducible SFT's and later was applied to solve a problem in matrix theory, namely characterizing the sets of nonzero eigenvalues of nonnegative real matrices (Boyle and Handelman ; see also Kim, Ormes and Roush ).

## Other directions

Symbolic dynamics is intimately tied to ergodic theory, which is the study of the dynamics of measure-preserving transformations, and in particular stationary stochastic processes (where the measure space is a space of sequences and the transformation is the shift map). For an introduction to ergodic theory, see one of the textbooks, such as Petersen , Rudolph , or Walters . The support of a stationary process is a shift space, and one can view the language of a shift space as the set of possible outcomes of such a process. In ergodic theory, there is a notion of measure-theoretic entropy, and the topological entropy of a shift space is the supremum of the measure-theoretic entropies of stationary processes with this support. For irreducible SFT's and sofic shifts, the supremum is achieved uniquely by an explicit process called the {\bf Parry measure} (this measure was introduced by Shannon, but Parry  proved uniqueness). A conjugacy between irreducible SFT's $$X$$ and $$Y$$ will preserve the Parry measure and so may be regarded as a very strong and concrete measure-theoretic isomorphism between $$X$$ and $$Y\ .$$ A vast array of notions of isomorphism in between conjugacy and measure-theoretic isomorphism continue to be studied (e.g., Parry and Tuncel , Marcus and Tuncel ).

Both measure-theoretic and topological entropy are descendants of information-theoretic entropy introduced by Shannon . Information theory studies noisy transmission channels and recording channels and seeks to find the optimal rate of reliable transmission and efficient, reliable encoding/decoding schemes (see Cover and Thomas  for an introduction to information theory). For some channels, in order to achieve reliable transmission, it is necessary to encode arbitrary messages to sequences that satisfy certain constraints, such as a bound on the number of consecutive zeros between two ones. Symbolic dynamics has played an important role in providing a framework for constructing such encoders. Adler, Coppersmith and Hassner  constructed a class of encoders using results developed for the purpose of solving the conjugacy and factor problems for SFT's and sofic shifts. This was the beginning of a rigorous theory of constrained coding (Marcus, Roth and Siegel ).

The major concepts of symbolic dynamics naturally generalize to higher dimensions, i.e. to spaces of (multi-dimensional) arrays rather than spaces of (1-dimensional) sequences. In this setting, symbolic dynamics is closely related to tiling systems and statistical mechanics. However, the problems discussed in this article are even more difficult in this setting. Even the seemingly innocuous problem of determining if a two-dimensional SFT, defined by a finite list of forbidden patterns of arrays, is nonempty turns out to be undecidable. Topological entropy has been computed for only a handful of two-dimensional SFT's. Early work on these difficult issues can be found in Berger , Kastelyn , and R.M. Robinson . Lind  gives an expository treatment of multi-dimensional symbolic dynamics and E. A. Robinson  does the same for tiling systems. One notable recent advance is the characterization of entropies of higher dimensional SFT's due to Hochman and Meyerovitch (2007).

Other important directions include one-sided shift spaces, countable state symbolic systems, orbit equivalence, flow equivalence, the automorphism group of a shift space, cellular automata, substitution systems and connections with knot theory.