730 THE BELL SYSTEM TECHNICAL JOURNAL, JULY 1952 



If the successive samples are not independent, the message source will 

 pass through a sequence of states which are determined by the past of 

 the message*. In each state there will be a set of conditional probabilities 

 describing the choice of the next symbol. If the state is i and the condi- 

 tional probability (in this state) of the next symbol being the j-th is p_i(j), 

 then the information produced by this selection is 



H_i = -\sum_j p_i(j) \log p_i(j). \qquad (6)



The average rate of the source is then found by averaging (6) over all 

 states with the proper weighting; thus 



H = \sum_i p(i) H_i = -\sum_i p(i) \sum_j p_i(j) \log p_i(j). \qquad (7)
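As a concrete illustration (not part of the original paper), the two averages in (6) and (7) can be evaluated numerically. The two-state transition matrix and its stationary state probabilities below are hypothetical values chosen purely for this sketch:

```python
import math

def state_entropy(p_next):
    # H_i = -sum_j p_i(j) log2 p_i(j), equation (6), in bits.
    return -sum(p * math.log2(p) for p in p_next if p > 0)

def source_rate(p_state, transitions):
    # H = sum_i p(i) H_i, equation (7): weight each state's
    # entropy by the probability of being in that state.
    return sum(p_state[i] * state_entropy(row)
               for i, row in enumerate(transitions))

# Hypothetical two-state source: in state 0 the next symbol is
# most likely 0 again, and similarly for state 1.
P = [[0.9, 0.1],
     [0.2, 0.8]]
# Stationary state probabilities p(i), found by solving pP = p.
p = [2/3, 1/3]

H = source_rate(p, P)   # roughly 0.553 bits per symbol
```

Because each row of P is strongly peaked, H comes out well below the one bit per symbol that two equally likely values would give, in line with the remark that follows.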



The greater the correlation between successive symbols or samples of a 

 message, the more peaked the distributions p_i(j) become on the average, 

 and this results in a lower value for H. As Shannon points out, the in- 

 formation rate of a source, as given by (7), is simply the average un- 

 certainty as to the next symbol when all the past is known. But in a 

 properly operating communication channel the past of the message is 

 available at both ends, so that it should be possible to signal over the 

 channel at the rate H bits/message symbol, rather than H_0 as we now 

 do. In present day communication systems we ignore the past and 

 pretend each sample is a complete surprise. 
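The gap between H and H_0 is easy to exhibit numerically. The sketch below uses a hypothetical binary source whose state is simply the previous symbol and which repeats that symbol with probability 0.9; by symmetry the two symbols are equally likely overall, so H_0 is one bit:

```python
import math

def h2(p):
    # Binary entropy in bits.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

stay = 0.9            # hypothetical probability a symbol repeats its predecessor
H = h2(stay)          # conditional entropy given the past, as in (7)
H0 = h2(0.5)          # entropy ignoring the past: symbols equally likely
ratio = H / H0        # factor by which required capacity could shrink
```

Here the ratio comes out near 0.47, i.e. a coder that exploits the one-symbol memory would need less than half the channel capacity of one that treats every sample as a complete surprise.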



By completely efficient statistical coding it should be possible to re- 

 duce the required channel capacity by the factor H/H_0. Whether or not 

 this improvement can be actually reached in practice depends upon the 

 amount of past required to uniquely specify the state of the message 

 source. If long range statistical influences exist, then long segments of 

 the past must be remembered. If there are m symbols in the past which 

 determine the present state and each symbol has l possible values, 
 there will be l^m states possible (although only 2^{mH} of these are at all prob- 
 able for large m). If m is large the number of possible states becomes fan-



* In a philosophical sense the state of a message source may be dependent on 

 many other factors besides the past of the message. If the source is a human being, 

 for example, the state will depend on a large number of intangibles. If these could 

 really be taken into account the resulting H for the message might be quite low. 

 If the universe is strictly deterministic one might say that H is "really" always 

 zero. When we describe the drawing of balls from the urn in terms of probabilities, 

 we admit our ignorance as to the exact detail of the mixing operation which has 

 occurred in the urn. Likewise the information rate of a source is a measure of our 

 ignorance of the exact state of the source. From a communication engineering 

 standpoint, the knowledge of the state of the source is confined to that given by 

 the past of the message. 



