


THE BELL SYSTEM TECHNICAL JOURNAL, JANUARY 1951 



$i(q_i^N - q_{i+1}^N)$. The entropy of the distribution is $\log i$. The total entropy is

 then

$$\sum_{i=1}^{27} i\,(q_i^N - q_{i+1}^N) \log i.$$
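
As a rough numerical illustration of this sum, the sketch below splits a monotonic decreasing distribution into its rectangular components and evaluates the total entropy. The function names and the four sample values are assumptions chosen for illustration, and logarithms are taken to base 2 so the result is in bits.

```python
import math

def rectangular_weights(q):
    """Total probability i*(q_i - q_{i+1}) carried by each rectangular component.

    q is assumed to be in non-increasing order; q_{n+1} is taken as 0.
    """
    padded = list(q) + [0.0]
    return [(i + 1) * (padded[i] - padded[i + 1]) for i in range(len(q))]

def rectangular_entropy(q):
    """Sum over i of i*(q_i - q_{i+1}) * log2(i)."""
    weights = rectangular_weights(q)
    return sum(w * math.log2(i + 1) for i, w in enumerate(weights))

# Hypothetical monotonic distribution, not taken from the paper.
q = [0.5, 0.25, 0.125, 0.125]
weights = rectangular_weights(q)
bound = rectangular_entropy(q)
print(weights, bound)
```

The component weights add up to the total probability of the original distribution, which is a quick check that the decomposition has been formed correctly.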



The problem, then, is to show that any system of probabilities $p(i_1, \ldots, i_N)$,

 with best prediction frequencies $q_i$, has an entropy $F_N$ greater than or

 equal to that of this rectangular system, derived from the same set of $q_i$.
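
The inequality can be checked numerically on a small case. The following is a minimal sketch, assuming $F_N$ is the conditional entropy of the last symbol given the preceding ones and that the $q_i$ are formed by sorting each row of the probability table in decreasing order and adding the columns. The two-row, three-symbol table and all function names are hypothetical, and logarithms are to base 2.

```python
import math

def prediction_frequencies(table):
    """q_i: sort each row in decreasing order, then add the columns."""
    ordered = [sorted(row, reverse=True) for row in table]
    return [sum(col) for col in zip(*ordered)]

def conditional_entropy(table):
    """Entropy in bits of the column symbol given the row (the context)."""
    total = 0.0
    for row in table:
        row_prob = sum(row)
        for p in row:
            if p > 0:
                total -= p * math.log2(p / row_prob)
    return total

def rectangular_entropy(q):
    """Sum over i of i*(q_i - q_{i+1}) * log2(i), with q_{n+1} = 0."""
    padded = list(q) + [0.0]
    return sum((i + 1) * (padded[i] - padded[i + 1]) * math.log2(i + 1)
               for i in range(len(q)))

# Hypothetical joint probabilities p(context, symbol); the entries sum to 1.
table = [[0.30, 0.15, 0.05],
         [0.10, 0.20, 0.20]]

q = prediction_frequencies(table)
f_n = conditional_entropy(table)
lower = rectangular_entropy(q)
print(f"F_N = {f_n:.4f} bits, rectangular entropy = {lower:.4f} bits")
```

A single example of this kind only illustrates the claim; it does not replace the general argument.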



Fig. 3 — Rectangular decomposition of a monotonic distribution. 



The $q_i$, as we have said, are obtained from the $p(i_1, \ldots, i_N)$ by arranging

 each row of the table in decreasing order of magnitude and adding vertically.

 Thus the $q_i$ are the sum of a set of monotonic decreasing distributions. Replace

 each of these distributions by its rectangular decomposition. Each one

 is replaced then (in general) by 27 rectangular distributions; the $q_i$ are the

 sum of $27 \times 27^N$ rectangular distributions, of from 1 to 27 elements, and all

 starting at the left column. The entropy for this set is less than or equal to 

 that of the original set of distributions since a termwise addition of two or 

 more distributions always increases entropy. This is actually an application 



