Talk:Mutual information
Explanation in Figure 1
Hello,
I believe the explanation that accompanies figure 1 is very misleading (only the explanation; the figure itself is valid to some extent).
The explanation says that decoding can (theoretically) be done without errors because each message is enclosed by one and only one circle. This is wrong. We have to look at the actual decoding process: we receive some garbled word r, and if r lies inside the overlapping region between two circles whose centers are two possible messages (which the explanation does not exclude), then there is simply no way to identify which message was sent. The key to decoding here is the overlapping region, not how many messages there are in one circle.
The whole miracle in Shannon's result is due to the very large codelength n (i.e. n approaches infinity). Translated into a picture, Shannon's view of the coding problem lives in n-dimensional space (which is not Euclidean space, by the way). It can be shown that the volume of the overlapping region of each pair of circles in (b) of figure 1 simply "vanishes" as n goes to infinity, while this is not the case in (a). This result is provable for the discrete memoryless channel. By "vanish," I mean that the fraction of the space taken up by the overlapping regions goes to 0; this does not mean that the overlapping regions themselves are infinitesimal. In absolute terms they can be huge. This is the key insight in the pictorial interpretation of Shannon's theorem. Even though the overlapping regions can be vast, the fraction of the space they take up is so small that the probability of the garbled word r landing in any overlapping region also vanishes as n approaches infinity. Shannon's theorem and Shannon's framework (or Shannon's model, as some coding theorists of the Hamming school call it) rest entirely on probabilistic arguments, and on the law of large numbers in particular, which is where the very large n comes in. None of these essential points is touched on in the explanation.
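To make the "vanishing fraction" point concrete, here is a rough numerical sketch. Everything specific in it is my own assumption for illustration and is not taken from the figure: a binary symmetric channel with flip probability p, "circles" taken to be Hamming balls of radius n(p + eps), and a code with 2^(n*rate) codewords. It computes the absolute size of one ball, an upper bound on the fraction of the whole space covered by all balls together (and therefore by all overlaps), and the probability that r falls outside the ball of the word actually sent.

```python
from math import comb


def ball_volume(n, radius):
    """Number of binary words of length n within Hamming distance `radius` of a fixed word."""
    return sum(comb(n, k) for k in range(radius + 1))


def covered_fraction_bound(n, rate, radius):
    """Upper bound on the fraction of {0,1}^n covered by the balls around all codewords.

    Assumes a code with 2**int(n * rate) codewords (hypothetical parameter).
    Any overlap between balls lies inside this covered region, so the
    overlaps take up at most this fraction of the space.
    """
    num_codewords = 2 ** int(n * rate)
    return min(1.0, num_codewords * ball_volume(n, radius) / 2 ** n)


def miss_probability(n, p, radius):
    """Probability that more than `radius` of the n bits are flipped on a BSC(p),
    i.e. that r lands outside the ball around the word that was sent."""
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(radius + 1, n + 1)
    )


def summarize(n, p=0.1, eps=0.05, rate=0.2):
    """Return (ball size, covered-fraction bound, miss probability) for codelength n."""
    radius = int(n * (p + eps))
    return (
        ball_volume(n, radius),
        covered_fraction_bound(n, rate, radius),
        miss_probability(n, p, radius),
    )


if __name__ == "__main__":
    for n in (50, 100, 200, 400):
        volume, fraction, miss = summarize(n)
        print(f"n={n:4d}  ball size={volume:.3e}  "
              f"covered fraction<={fraction:.3e}  P(miss)={miss:.3e}")
```

With these assumed parameters the ball size grows enormously with n while the covered fraction and the miss probability both shrink, which is exactly the "huge but negligible" behaviour described above. The rate 0.2 was chosen to sit below 1 - H(p + eps); at a rate above that the bound is not expected to shrink.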
This also explains why in practice there is always some error in decoding: one simply cannot provide enough resources to reach an infinite codelength n.
As a side note, a minor imperfection is that the circles in the figure are never explained.