EFFICIENT CODING






[Fig. 2 here: three code trees, panels (a), (b), and (c).]

Fig. 2 — Shannon-Fano codes for three different distributions. The successive bisections are indicated by the dashed lines and the number gives the step at which that bisection took place.



If the successive message segments are independent, the code will gen- 

 erate a random sequence of zeros and ones. Fig. 2b shows the code which 

 results with another distribution. Here the termination of each code 

 group is more complicated but the non-duplicative property exists so 

 the receiver can still identify the groups. Fig. 2c shows the code which 

results when all the p(B_i) are equal. It is the ordinary binary code.
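The successive-bisection procedure of Fig. 2 can be sketched as follows; this is a minimal illustration, and the function name `shannon_fano` and its argument format are assumptions of the sketch, not anything from the text.

```python
# A minimal sketch of Shannon-Fano coding, assuming symbols are given as
# (symbol, probability) pairs sorted by descending probability.
def shannon_fano(symbols):
    """Return a dict mapping each symbol to its binary code string."""
    codes = {s: "" for s, _ in symbols}

    def bisect(group):
        # A group of one symbol needs no further digits.
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        # Choose the split that makes the two halves as nearly equal in
        # probability as possible (the dashed lines of Fig. 2).
        running, best_i, best_diff = 0.0, 1, float("inf")
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_diff, best_i = diff, i
        upper, lower = group[:best_i], group[best_i:]
        # Append 0 to the upper half, 1 to the lower, then bisect again.
        for s, _ in upper:
            codes[s] += "0"
        for s, _ in lower:
            codes[s] += "1"
        bisect(upper)
        bisect(lower)

    bisect(list(symbols))
    return codes

# A dyadic distribution bisects exactly, so each code group has length
# log 1/p(B_i); with equal probabilities the result is the ordinary
# binary code.
print(shannon_fano([("A", 0.5), ("B", 0.25), ("C", 0.125), ("D", 0.125)]))
# → {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
```

Because each bisection appends one digit to every member of a subgroup, no code group is a prefix of another, which is the non-duplicative property that lets the receiver identify the groups.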



The length of each code group is equal to log 1/p(B_i), for the cases

 shown in the figures. This is true in general so long as it is possible to 

 divide the list into subgroups which are of exactly equal probability. 



When this is not possible, some code groups may be one digit longer 

 as Shannon shows. The average number of digits per message symbol 

 using this code is therefore given by 



-(1/k) Σ p(B_i^k) log p(B_i^k) ≤ H_c < -(1/k) Σ p(B_i^k) [-1 + log p(B_i^k)]

G_k ≤ H_c < G_k + 1/k.



For large k, H_c → G_k → H and the efficiency approaches unity. With small k, H_c increases both because the smaller list of messages cannot be so accurately divided repeatedly into equal-probability subsets (so-called "granularity" trouble), and also because more statistics are ignored between the shorter blocks.
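The bounds above can be checked numerically. The sketch below uses an assumed setup, an independent binary source with p(0) = 0.9 and p(1) = 0.1, and assigns each k-symbol block a code of ⌈log 1/p⌉ binary digits, i.e. at most one digit longer than log 1/p(B_i^k).

```python
import math
from itertools import product

# Assumed source: independent binary symbols with p(0)=0.9, p(1)=0.1.
p = {"0": 0.9, "1": 0.1}
H = -sum(q * math.log2(q) for q in p.values())  # entropy per symbol

for k in (1, 2, 4, 8):
    # Probabilities of all 2^k blocks of k successive symbols.
    probs = [math.prod(p[c] for c in block) for block in product(p, repeat=k)]
    # H_c: average digits per message symbol with ceil(log 1/p) lengths.
    Hc = sum(q * math.ceil(math.log2(1 / q)) for q in probs) / k
    # G_k equals H here because successive symbols are independent.
    Gk = -sum(q * math.log2(q) for q in probs) / k
    assert Gk <= Hc <= Gk + 1 / k + 1e-12  # G_k <= H_c <= G_k + 1/k
    print(f"k={k}: G_k={Gk:.3f} <= H_c={Hc:.3f} <= G_k + 1/k={Gk + 1/k:.3f}")
```

As k grows the granularity penalty, at most 1/k digit per symbol, shrinks, so H_c falls toward H and the efficiency H/H_c approaches unity.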



The ordinary binary code provides a statistical match between message source and channel only if the various message blocks B_i have equal probability p(B_i) = 1/2^n, and are mutually independent. With k = 1, p(B_i) = p(j) and the "blocks" are merely the successive symbols.



