


$$\log p \doteq N \sum_i p_i \log p_i$$

$$\log p \doteq -NH$$

$$H \doteq \frac{\log 1/p}{N}$$



H is thus approximately the logarithm of the reciprocal probability of a 

 typical long sequence divided by the number of symbols in the sequence. 

 The same result holds for any source. Stated more precisely we have (see 

 Appendix III): 



Theorem 3: Given any $\epsilon > 0$ and $\delta > 0$, we can find an $N_0$ such that the sequences of any length $N \geq N_0$ fall into two classes:



1. A set whose total probability is less than $\epsilon$.



2. The remainder, all of whose members have probabilities satisfying the 

 inequality 



$$\left| \frac{\log p^{-1}}{N} - H \right| < \delta$$

In other words we are almost certain to have $\frac{\log p^{-1}}{N}$ very close to $H$ when $N$ is large.
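This concentration is easy to check numerically. The sketch below is illustrative only; the four-symbol alphabet and its probabilities are assumptions, not taken from the paper. It samples long i.i.d. sequences and prints $\log p^{-1}/N$ for each, which should cluster tightly around $H$:

    # Illustrative check of Theorem 3: sample i.i.d. sequences from an
    # assumed four-symbol source and verify that log(1/p)/N stays near H.
    import random
    from math import log2

    probs = {'A': 0.5, 'B': 0.25, 'C': 0.125, 'D': 0.125}   # assumed source
    H = -sum(p * log2(p) for p in probs.values())           # = 1.75 bits per symbol

    N = 10_000
    symbols, weights = zip(*probs.items())
    for trial in range(5):
        seq = random.choices(symbols, weights=weights, k=N)
        log_inv_p = -sum(log2(probs[s]) for s in seq)       # log(1/p) of the sequence
        print(f"log(1/p)/N = {log_inv_p / N:.4f}   (H = {H:.4f})")

For this source each printed value typically falls within a few hundredths of $H = 1.75$, which is the content of the theorem with small $\epsilon$ and $\delta$.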



A closely related result deals with the number of sequences of various 

 probabilities. Consider again the sequences of length N and let them be 

arranged in order of decreasing probability. We define $n(q)$ to be the

 number we must take from this set starting with the most probable one in 

 order to accumulate a total probability q for those taken. 

 Theorem 4: 



$$\lim_{N \to \infty} \frac{\log n(q)}{N} = H$$

when $q$ does not equal 0 or 1.



We may interpret $\log n(q)$ as the number of bits required to specify the

 sequence when we consider only the most probable sequences with a total 



probability $q$. Then $\frac{\log n(q)}{N}$ is the number of bits per symbol for the



specification. The theorem says that for large N this will be independent of 

 q and equal to H. The rate of growth of the logarithm of the number of 

 reasonably probable sequences is given by H, regardless of our interpreta- 

tion of "reasonably probable." Due to these results, which are proved in
Appendix III, it is possible for most purposes to treat the long sequences as
though there were just $2^{HN}$ of them, each with a probability $2^{-HN}$.
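As a numerical check of Theorem 4 and of this closing remark, consider a binary source with $P(1) = p$ (the value $p = 0.1$ and the helper name n_of_q below are our assumptions, not the paper's). All length-$N$ sequences with exactly $k$ ones share the probability $p^k (1-p)^{N-k}$, and there are $\binom{N}{k}$ of them, so $n(q)$ can be computed exactly by accumulating these classes in order of decreasing probability:

    # Illustrative computation of n(q) for an assumed binary source.
    from math import comb, log2

    def n_of_q(p, N, q):
        """Number of most-probable length-N sequences needed to reach total probability q."""
        # Class k holds comb(N, k) sequences, each of probability p**k * (1-p)**(N-k).
        order = sorted(range(N + 1),
                       key=lambda k: p**k * (1 - p)**(N - k),
                       reverse=True)
        total, count = 0.0, 0
        for k in order:
            prob_each = p**k * (1 - p)**(N - k)
            size = comb(N, k)
            if total + size * prob_each < q:
                total += size * prob_each
                count += size
            else:
                # Take just enough sequences from the class that crosses q.
                return count + int((q - total) / prob_each) + 1
        return count

    p = 0.1
    H = -p * log2(p) - (1 - p) * log2(1 - p)   # about 0.469 bits per symbol
    for N in (100, 500, 1000):
        print(N, round(log2(n_of_q(p, N, 0.5)) / N, 4), "vs H =", round(H, 4))

The printed ratio $\log n(q)/N$ approaches $H \approx 0.469$ as $N$ grows, and the same limit results for any fixed $q$ strictly between 0 and 1: the precise sense in which the long sequences behave like $2^{HN}$ equiprobable ones.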



