


even for very complicated picture material, and n = 250,000 for a single frame.

As given by (11), p also represents the fraction of the possible signals on a channel of b levels which are likely ever to be used by messages of length n without statistical encoding.



STATISTICALLY MATCHED CODES 



Since a sequence of binary digits can be remapped by a non-statistical process into a channel with b quantizing levels, or indeed into a wide variety of other signalling alphabets, it suffices to consider statistical coding processes and codes which reduce the message to a sequence of binary digits. An efficient code is then one for which the average number of binary digits, H_c, per message symbol lies between H_0 and H. As the efficiency increases H/H_c → 1, so this ratio may be taken as an efficiency index. With highly efficient processes, the sequences of binary digits produced will have little residual correlation, i.e., they will be nearly random sequences. Since the encoding process must be reversible, the receiver must be able to recognize the beginnings and ends of code groups. Since we have at our disposal only zeros and ones, the divisions between code groups must either be marked by a special code group reserved for this purpose, or else the code must have the property that no short code group is duplicated as the beginning of a longer group.
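
A minimal sketch in Python of the two requirements just stated: the efficiency index H/H_c, and the condition that no short code group be duplicated as the beginning of a longer group. The function names and the four-symbol source used here are illustrative assumptions, not drawn from the paper.

    from math import log2

    def entropy(probs):
        # Entropy H of the source, in binary digits per message symbol.
        return -sum(p * log2(p) for p in probs if p > 0)

    def average_length(probs, code_groups):
        # Average number of binary digits per symbol, H_c, for a given code.
        return sum(p * len(c) for p, c in zip(probs, code_groups))

    def is_prefix_free(code_groups):
        # True if no code group is duplicated as the beginning of a longer
        # group, so the receiver can recognize where each group ends.
        return not any(a != b and b.startswith(a)
                       for a in code_groups for b in code_groups)

    # Hypothetical four-symbol source and code, chosen only for illustration.
    probs = [0.5, 0.25, 0.125, 0.125]
    code  = ["0", "10", "110", "111"]

    H  = entropy(probs)               # H   = 1.75 binary digits per symbol
    Hc = average_length(probs, code)  # H_c = 1.75 binary digits per symbol
    print(is_prefix_free(code))       # True: the groups are self-punctuating
    print(H / Hc)                     # efficiency index, -> 1 for an efficient code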



A code which satisfies this latter requirement and which is capable of unity efficiency is the so-called Shannon-Fano code, developed independently by C. E. Shannon of Bell Telephone Laboratories and R. M. Fano of the Massachusetts Institute of Technology. This code is constructed as follows: One writes down all the possible message sequences of length k in order of decreasing probability. This list is then divided into two groups of as nearly equal probability as possible. One then writes zero as the first digit of the code for all messages in the top half, one as the first digit for all messages in the bottom half. Each of these groups is again divided into two subsets of nearly equal probability, and a zero is written as the second digit if the message is in the top subset, a one if it is in the bottom. The process is continued until there is only one message in each subset. Fig. 2a shows the code which results when this process is applied to a particularly simple probability distribution p(B_i) = (1/2)^i. Here each code group is a series of ones followed by a zero. The receiver knows a code group is finished as soon as a zero appears. Although the longer groups contain mostly ones, their probability is less and on the average as many zeros are sent as ones.
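
The construction just described lends itself to a short recursive routine. The following Python fragment is a sketch rather than a transcription of Fig. 2a: the function name, the truncation of the distribution at i = 8, and the tallying at the end are illustrative assumptions. Applied to p(B_i) = (1/2)^i it reproduces the code groups 0, 10, 110, ... and shows that the expected numbers of zeros and ones per code group are nearly equal (each is exactly 1 for the untruncated distribution).

    def shannon_fano(symbols):
        # symbols: list of (name, probability) pairs, sorted by decreasing
        # probability.  Returns a dict mapping name -> code group.
        if len(symbols) == 1:
            return {symbols[0][0]: ""}
        total, running = sum(p for _, p in symbols), 0.0
        # Split the list into two groups of as nearly equal probability as possible.
        best_i, best_diff = 1, float("inf")
        for i in range(1, len(symbols)):
            running += symbols[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_i, best_diff = i, diff
        top, bottom = symbols[:best_i], symbols[best_i:]
        # Zero as the next digit for the top group, one for the bottom group.
        code = {}
        for name, tail in shannon_fano(top).items():
            code[name] = "0" + tail
        for name, tail in shannon_fano(bottom).items():
            code[name] = "1" + tail
        return code

    # The simple distribution p(B_i) = (1/2)^i, truncated at i = 8 for illustration.
    dist = [(i, 0.5 ** i) for i in range(1, 9)]
    code = shannon_fano(dist)
    print(code)  # 1 -> '0', 2 -> '10', 3 -> '110', ...

    # Expected zeros and ones per code group are nearly equal; the truncation
    # keeps them from being exactly 1 each, as they are for the full distribution.
    zeros = sum(p * code[i].count("0") for i, p in dist)
    ones  = sum(p * code[i].count("1") for i, p in dist)
    print(zeros, ones)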



