MATHEMATICAL THEORY OF COMMUNICATION 






These results are the main justification for the definition of C and will 

 now be proved. 



Theorem 11. Let a discrete channel have the capacity C and a discrete
source the entropy per second H. If $H < C$ there exists a coding system
such that the output of the source can be transmitted over the channel with
an arbitrarily small frequency of errors (or an arbitrarily small equivocation).
If $H > C$ it is possible to encode the source so that the equivocation is less
than $H - C + \epsilon$ where $\epsilon$ is arbitrarily small. There is no method of
encoding which gives an equivocation less than $H - C$.
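The two cases of the theorem can be summarized by a single lower limit on the equivocation. The following is a minimal sketch; the function name `equivocation_floor` and its arguments are illustrative choices, not notation from the paper. Both quantities are assumed to be in bits per second.

```python
def equivocation_floor(source_entropy, capacity):
    """Limit that Theorem 11 places on the equivocation.

    For H < C the equivocation can be made arbitrarily small (floor 0).
    For H > C it can be brought within any epsilon of H - C, but never below.
    """
    return max(0.0, source_entropy - capacity)


# Hypothetical numbers: a source of 3 bits/s over a channel of capacity 2 bits/s.
floor_above = equivocation_floor(3.0, 2.0)
# The same channel fed by a slower source of 1 bit/s.
floor_below = equivocation_floor(1.0, 2.0)
```

The theorem states that this limit can be approached, not that any particular code attains it.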



The method of proving the first part of this theorem is not by exhibiting
a coding method having the desired properties, but by showing that such a
code must exist in a certain group of codes. In fact we will average the
frequency of errors over this group and show that this average can be made
less than $\epsilon$. If the average of a set of numbers is less than $\epsilon$ there must
exist at least one in the set which is less than $\epsilon$. This will establish the
desired result.
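The averaging step can be illustrated numerically. The sketch below is not the construction used in the proof: it assumes a binary symmetric channel, draws a small group of codes at random, and decodes by nearest Hamming distance. All names and parameters (`random_code`, `flip_prob`, the block length, the number of trials) are illustrative assumptions. It only shows that the best code in the group has an error frequency no larger than the group average.

```python
import random


def random_code(num_messages, block_len, rng):
    """Assign each message a random binary codeword."""
    return [tuple(rng.randint(0, 1) for _ in range(block_len))
            for _ in range(num_messages)]


def transmit(word, flip_prob, rng):
    """Send a codeword through a binary symmetric channel (assumed model)."""
    return tuple(bit ^ (rng.random() < flip_prob) for bit in word)


def decode(received, code):
    """Pick the message whose codeword is nearest in Hamming distance."""
    def distance(m):
        return sum(a != b for a, b in zip(received, code[m]))
    return min(range(len(code)), key=distance)


def error_frequency(code, flip_prob, trials, rng):
    """Estimate the fraction of messages decoded incorrectly."""
    errors = 0
    for _ in range(trials):
        message = rng.randrange(len(code))
        if decode(transmit(code[message], flip_prob, rng), code) != message:
            errors += 1
    return errors / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
freqs = [error_frequency(random_code(4, 15, rng), 0.05, 200, rng)
         for _ in range(20)]
average = sum(freqs) / len(freqs)
best = min(freqs)
print(f"average error frequency {average:.4f}, best code {best:.4f}")
```

Whatever the averaged value turns out to be, `best <= average` always holds, which is the only property of averages the argument relies on.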



Fig. 9. The equivocation possible for a given input entropy to a channel (axis label: H(x)).



The capacity C of a noisy channel has been defined as 

$$C = \max \bigl( H(x) - H_y(x) \bigr)$$



where x is the input and y the output. The maximization is over all sources 

 which might be used as input to the channel. 
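As a concrete instance of this maximization, the sketch below approximates C for a binary symmetric channel by a grid search over input distributions. Several things here are assumptions for illustration: the channel matrix `bsc`, the helper names, the restriction to memoryless binary inputs, and the use of bits per symbol rather than bits per second (equivalent if one symbol is sent per second).

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def rate(px, channel):
    """H(x) - H_y(x) for input distribution px.

    channel[i][j] is the probability of receiving j when i is sent.
    """
    n_in, n_out = len(px), len(channel[0])
    joint = [[px[i] * channel[i][j] for j in range(n_out)] for i in range(n_in)]
    py = [sum(joint[i][j] for i in range(n_in)) for j in range(n_out)]
    # Equivocation: H_y(x) = H(x, y) - H(y)
    equivocation = entropy([p for row in joint for p in row]) - entropy(py)
    return entropy(px) - equivocation


def capacity_binary_input(channel, steps=1000):
    """Approximate the maximum of rate() over P(x = 0) on a uniform grid."""
    return max(rate([k / steps, 1 - k / steps], channel)
               for k in range(steps + 1))


bsc = [[0.9, 0.1],
       [0.1, 0.9]]  # assumed crossover probability 0.1
c = capacity_binary_input(bsc)
print(f"approximate capacity: {c:.4f} bits per symbol")
```

For this symmetric channel the maximum falls at the uniform input, which lies on the grid, so the search should agree with the closed form 1 - H(0.1). For a general channel a grid search gives only an approximation.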



Let $S_0$ be a source which achieves the maximum capacity C. If this
maximum is not actually achieved by any source let $S_0$ be a source which
approximates to giving the maximum rate. Suppose $S_0$ is used as input to
the channel. We consider the possible transmitted and received sequences
of a long duration T. The following will be true:



1. The transmitted sequences fall into two classes, a high probability group
 with about $2^{TH(x)}$ members and the remaining sequences of small total
 probability.



2. Similarly the received sequences have a high probability set of about
 $2^{TH(y)}$ members and a low probability set of remaining sequences.



3. Each high probability output could be produced by about $2^{TH_y(x)}$
 inputs. The probability of all other cases has a small total probability.
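The first of these properties can be checked numerically for a simple case. The sketch below assumes a memoryless binary source emitting a 1 with probability `q`, one symbol per unit time, and calls a sequence "high probability" when its information per symbol lies within `delta` of the entropy. The names `typical_set`, `q` and `delta` are illustrative, and the tolerance is an arbitrary choice.

```python
import math


def binary_entropy(q):
    """Entropy in bits per symbol of a binary source with P(1) = q."""
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def typical_set(q, T, delta):
    """Count the length-T sequences whose information per symbol is within
    delta of the entropy, and sum their probabilities."""
    h = binary_entropy(q)
    count = 0
    prob = 0.0
    for k in range(T + 1):  # k is the number of ones in the sequence
        info = -(k * math.log2(q) + (T - k) * math.log2(1 - q)) / T
        if abs(info - h) <= delta:
            n_seq = math.comb(T, k)  # sequences with exactly k ones
            count += n_seq
            # Each such sequence has probability 2 ** (-info * T).
            prob += 2.0 ** (math.log2(n_seq) - info * T)
    return count, prob


q, T, delta = 0.1, 1000, 0.05
count, prob = typical_set(q, T, delta)
exponent = math.log2(count) / T
print(f"H = {binary_entropy(q):.3f}, log2(count)/T = {exponent:.3f}, "
      f"total probability = {prob:.3f}")
```

In this example the high probability group carries most of the total probability, and its size has an exponent close to T·H, far below the 2^T sequences that are possible in all. The same counting applies to the received sequences with H(y) in place of H(x).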



