PREDICTION AND ENTROPY OF PRINTED ENGLISH



also \sum_{i=1}^{27} q_i = \sum_{i=1}^{27} r_i (this is true here since the total probability in each
case is 1), then the first set, q_i, is said to majorize the second set, r_i. It is

 known that the majorizing property is equivalent to either of the following 

 properties: 



1. The r_i can be obtained from the q_i by a finite series of "flows." By a
flow is understood a transfer of probability from a larger q_i to a smaller
one, as heat flows from hotter to cooler bodies but not in the reverse
direction.



2. The r_i can be obtained from the q_i by a generalized "averaging"
operation. There exists a set of non-negative real numbers, a_{ij}, with
\sum_i a_{ij} = \sum_j a_{ij} = 1 and such that

    r_i = \sum_j a_{ij} q_j.    (16)
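The two characterizations of majorization above can be illustrated numerically. The sketch below uses made-up probabilities (not Shannon's measured frequencies): it applies a single "flow" and then a doubly stochastic "averaging" matrix as in (16), checking in each case that the sorted partial sums of the q_i dominate those of the result, and that the entropy does not decrease.

```python
import numpy as np

def entropy(p):
    """Entropy in bits, ignoring zero entries."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def majorizes(q, r, tol=1e-12):
    """True if the sorted partial sums of q dominate those of r."""
    qs, rs = np.sort(q)[::-1], np.sort(r)[::-1]
    return all(qs[:k].sum() >= rs[:k].sum() - tol
               for k in range(1, len(q) + 1))

# Illustrative distribution (NOT data from the paper).
q = np.array([0.5, 0.3, 0.2])

# 1. A single "flow": transfer 0.1 of probability from the largest
#    q_i to the smallest, as heat flows from hotter to cooler.
r_flow = np.array([0.4, 0.3, 0.3])
assert majorizes(q, r_flow)
assert entropy(r_flow) >= entropy(q)

# 2. "Averaging" as in (16): r_i = sum_j a_ij q_j, with a_ij
#    non-negative and every row and column summing to 1.
a = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.6, 0.3],
              [0.1, 0.3, 0.6]])
r_avg = a @ q
assert majorizes(q, r_avg)
assert entropy(r_avg) >= entropy(q)
```

Both operations spread probability toward uniformity, which is why the majorized distribution always has at least as much entropy.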






5. Entropy Bounds from Prediction Frequencies 



If we know the frequencies of symbols in the reduced text with the ideal
N-gram predictor, q_i^N, it is possible to set both upper and lower bounds to
the N-gram entropy, F_N, of the original language. These bounds are as
follows:



    \sum_{i=1}^{27} i(q_i^N - q_{i+1}^N) \log i \le F_N \le -\sum_{i=1}^{27} q_i^N \log q_i^N.    (17)



The upper bound follows immediately from the fact that the maximum
possible entropy in a language with letter frequencies q_i^N is -\sum q_i^N \log q_i^N.
Thus the entropy per symbol of the reduced text is not greater than this.
The N-gram entropy of the reduced text is equal to that for the original
language, as may be seen by an inspection of the definition (1) of F_N. The
sums involved will contain precisely the same terms although, perhaps, in a
different order. This upper bound is clearly valid, whether or not the
prediction is ideal.
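As a concrete sketch of the bounds in (17), the fragment below computes both sides for a hypothetical set of reduced-text frequencies q_i^N. The values are invented for illustration (Shannon's alphabet has 27 symbols; the short list here just keeps the arithmetic visible), with the convention that q beyond the last symbol is zero.

```python
import numpy as np

# Hypothetical reduced-text frequencies q_i^N, in decreasing order.
# These are NOT Shannon's measured values.
q = np.array([0.6, 0.2, 0.1, 0.1])

# Lower bound of (17): sum_i i (q_i^N - q_{i+1}^N) log i,
# taking the frequency after the last symbol as 0.
q_ext = np.append(q, 0.0)
i = np.arange(1, len(q) + 1)
lower = (i * (q_ext[:-1] - q_ext[1:]) * np.log2(i)).sum()

# Upper bound of (17): the ordinary entropy -sum_i q_i^N log q_i^N.
upper = -(q * np.log2(q)).sum()

print(lower, upper)  # lower bound, then upper bound, in bits
assert lower <= upper
```

With these numbers the lower bound comes out to exactly 1 bit (only the i = 2 and i = 4 terms are nonzero), while the upper bound is the entropy of the four frequencies, about 1.57 bits.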



The lower bound is more difficult to establish. It is necessary to show that
with any selection of N-gram probabilities p(i_1, i_2, \ldots, i_N), we will have



    \sum_{i=1}^{27} i(q_i^N - q_{i+1}^N) \log i \le -\sum p(i_1, \ldots, i_N) \log p_{i_1, \ldots, i_{N-1}}(i_N)    (18)



The left-hand member of the inequality can be interpreted as follows:
Imagine the q_i^N arranged as a sequence of lines of decreasing height (Fig. 3).
The actual q_i^N can be considered as the sum of a set of rectangular distribu-
tions as shown. The left member of (18) is the entropy of this set of distribu-
tions. Thus, the i-th rectangular distribution has a total probability of



