392 BELL SYSTEM TECHNICAL JOURNAL 



Physically the situation represented is this: There are several different
sources $L_1, L_2, L_3, \ldots$ which are each of homogeneous statistical structure
(i.e., they are ergodic). We do not know a priori which is to be used, but
once the sequence starts in a given pure component $L_i$ it continues indefinitely
according to the statistical structure of that component.



As an example one may take two of the processes defined above and
assume $p_1 = .2$ and $p_2 = .8$. A sequence from the mixed source

$$L = .2 L_1 + .8 L_2$$

would be obtained by choosing first $L_1$ or $L_2$ with probabilities .2 and .8
and after this choice generating a sequence from whichever was chosen.
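
The two-stage construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two pure components here are hypothetical independent-letter sources over the letters A and B (they are not the processes the text refers to), and the names `make_iid_source` and `sample_mixed` are invented for this sketch. The point it shows is that the choice of component is made once, before any symbol is produced.

```python
import random

def make_iid_source(probs):
    """Return a generator function that emits letters independently
    with the given probabilities (a hypothetical pure component)."""
    letters = list(probs)
    weights = [probs[x] for x in letters]

    def generate(rng, length):
        return "".join(rng.choices(letters, weights=weights, k=length))

    return generate

# Hypothetical pure components (assumptions, not taken from the text).
L1 = make_iid_source({"A": 0.9, "B": 0.1})
L2 = make_iid_source({"A": 0.1, "B": 0.9})

def sample_mixed(rng, length, components=(L1, L2), mix=(0.2, 0.8)):
    """Choose ONE component with the mixing probabilities, then generate
    the entire sequence from that component alone."""
    index = rng.choices(list(range(len(components))), weights=mix, k=1)[0]
    return index, components[index](rng, length)

rng = random.Random(0)
draws = [sample_mixed(rng, 200) for _ in range(2000)]

# Fraction of sequences that came from the first component.
share_L1 = sum(1 for index, _ in draws if index == 0) / len(draws)
```

Each individual sequence produced this way shows the statistics of the one component it came from, not a blend of the two, which is why such a mixed source is not itself ergodic.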



Except when the contrary is stated we shall assume a source to be ergodic.
This assumption enables one to identify averages along a sequence with
averages over the ensemble of possible sequences (the probability of a discrepancy
being zero). For example the relative frequency of the letter A
in a particular infinite sequence will be, with probability one, equal to its
relative frequency in the ensemble of sequences.
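
A finite simulation can only suggest this equality, but it may help to see the two kinds of average side by side. The sketch below assumes a hypothetical two-letter ergodic chain (the `TRANSITIONS` table and the helper names are invented for illustration) and compares the relative frequency of A along one long sequence with its relative frequency at a fixed late position across many independent sequences.

```python
import random

# Hypothetical two-state ergodic chain over the letters A and B
# (an assumption for illustration, not a process from the text).
TRANSITIONS = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.5, "B": 0.5}}

def step(rng, state):
    """Draw the next letter given the current one."""
    row = TRANSITIONS[state]
    return rng.choices(list(row), weights=list(row.values()), k=1)[0]

def run(rng, start, length):
    """Generate `length` letters following `start` (start is not included)."""
    out = []
    state = start
    for _ in range(length):
        state = step(rng, state)
        out.append(state)
    return out

rng = random.Random(1)

# Average along a single long sequence.
long_seq = run(rng, "B", 100_000)
time_avg = long_seq.count("A") / len(long_seq)

# Average over an ensemble: many independent sequences, each inspected
# at one late position so the starting letter has negligible influence.
ensemble = [run(rng, "B", 50)[-1] for _ in range(5_000)]
ensemble_avg = ensemble.count("A") / len(ensemble)
```

For this particular chain both averages should come out close to 5/6, the long-run probability of the letter A; the agreement is approximate here only because the sequences are finite.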



If $P_i$ is the probability of state $i$ and $p_i(j)$ the transition probability to
state $j$, then for the process to be stationary it is clear that the $P_i$ must
satisfy equilibrium conditions:

$$P_j = \sum_i P_i \, p_i(j).$$



In the ergodic case it can be shown that with any starting conditions the
probabilities $P_j(N)$ of being in state $j$ after $N$ symbols approach the
equilibrium values as $N \to \infty$.
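
As a numerical illustration of the equilibrium conditions and of this convergence, the following sketch repeatedly applies the update $P_j \leftarrow \sum_i P_i \, p_i(j)$ from two different starting states. The three-state transition table `p` is a hypothetical example chosen for this sketch, and the function names are invented; nothing here is taken from the text beyond the update rule itself.

```python
# Hypothetical 3-state transition probabilities, p[i][j] = p_i(j).
# Each row sums to 1. (An assumption for illustration.)
p = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]

def advance(dist, p):
    """Advance one symbol: new P_j = sum over i of P_i * p_i(j)."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]

def distribution_after(start, p, steps):
    """Return the state probabilities P_j(N) after N = steps symbols."""
    dist = list(start)
    for _ in range(steps):
        dist = advance(dist, p)
    return dist

# Two different starting conditions: certainly in state 0, or in state 2.
from_state_0 = distribution_after([1.0, 0.0, 0.0], p, 100)
from_state_2 = distribution_after([0.0, 0.0, 1.0], p, 100)

# Applying the update once more should leave the limit unchanged,
# which is exactly the equilibrium condition.
fixed_point = advance(from_state_0, p)
```

For this table both starting conditions lead to the same values, (0.25, 0.5, 0.25), which can be checked by hand to satisfy the equilibrium conditions.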



6. Choice, Uncertainty and Entropy 



We have represented a discrete information source as a Markoff process.
Can we define a quantity which will measure, in some sense, how much information
is "produced" by such a process, or better, at what rate information
is produced?



Suppose we have a set of possible events whose probabilities of occurrence
are $p_1, p_2, \ldots, p_n$. These probabilities are known but that is all we know
concerning which event will occur. Can we find a measure of how much
"choice" is involved in the selection of the event or of how uncertain we are
of the outcome?



If there is such a measure, say $H(p_1, p_2, \ldots, p_n)$, it is reasonable to
require of it the following properties:



1. $H$ should be continuous in the $p_i$.



2. If all the $p_i$ are equal, $p_i = \frac{1}{n}$, then $H$ should be a monotonic increasing



