MATHEMATICAL THEORY OF COMMUNICATION 389 



words can easily be placed in sentences without unusual or strained constructions. The particular sequence of ten words "attack on an English writer that the character of this" is not at all unreasonable. It appears then that a sufficiently complex stochastic process will give a satisfactory representation of a discrete source.



The first two samples were constructed by the use of a book of random numbers in conjunction with (for example 2) a table of letter frequencies. This method might have been continued for (3), (4), and (5), since digram, trigram, and word frequency tables are available, but a simpler equivalent method was used. To construct (3) for example, one opens a book at random and selects a letter at random on the page. This letter is recorded. The book is then opened to another page and one reads until this letter is encountered. The succeeding letter is then recorded. Turning to another page, this second letter is searched for and the succeeding letter recorded, etc. A similar process was used for (4), (5), and (6). It would be interesting if further approximations could be constructed, but the labor involved becomes enormous at the next stage.
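The book-opening procedure described for sample (3) can be imitated mechanically. The following is only a sketch under stated assumptions: a string `corpus` stands in for the book, a jump to a random position stands in for opening to another page, and reading wraps around at the end of the string. The function name `digram_sample` is hypothetical and does not come from the text.

```python
import random

def digram_sample(corpus, length, rng):
    """Imitate the book-opening method for a second-order approximation.

    Assumptions of this sketch: `corpus` plays the role of the book,
    and reading wraps around from the end of the string to its start.
    """
    n = len(corpus)
    # Select a first letter at random and record it.
    current = corpus[rng.randrange(n)]
    out = [current]
    while len(out) < length:
        # "Open the book to another page": jump to a random position.
        start = rng.randrange(n)
        # Read on until the current letter is encountered.
        for offset in range(n):
            i = (start + offset) % n
            if corpus[i] == current:
                # Record the succeeding letter; it is searched for next.
                current = corpus[(i + 1) % n]
                out.append(current)
                break
    return "".join(out)
```

Calling `digram_sample(text, 40, random.Random(0))` on any reasonably long `text` yields a string in which every adjacent pair of letters also occurs as an adjacent pair in `text`, which is the property the manual procedure relies on.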



4. Graphical Representation of a Markoff Process 



Stochastic processes of the type described above are known mathematically as discrete Markoff processes and have been extensively studied in the literature.^ The general case can be described as follows: There exist a finite number of possible "states" of a system: $S_1, S_2, \ldots, S_n$. In addition there is a set of transition probabilities: $p_i(j)$, the probability that if the system is in state $S_i$ it will next go to state $S_j$. To make this Markoff process into an information source we need only assume that a letter is produced for each transition from one state to another. The states will correspond to the "residue of influence" from preceding letters.
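As a minimal sketch of this construction, consider a hypothetical two-state source. The states, letters, and probabilities below are invented purely for illustration and are not taken from the text. Each state maps to a list of triples (probability of the transition, next state, letter produced), and one letter is emitted per transition.

```python
import random

# Hypothetical two-state source; all numbers are illustrative assumptions.
# TRANSITIONS[state] = list of (probability, next_state, letter_produced)
TRANSITIONS = {
    "S1": [(0.8, "S1", "A"), (0.2, "S2", "B")],
    "S2": [(0.5, "S1", "A"), (0.5, "S2", "B")],
}

def generate(transitions, start, length, rng):
    """Walk the chain from `start`, producing one letter per transition."""
    state = start
    letters = []
    for _ in range(length):
        options = transitions[state]
        weights = [p for p, _, _ in options]
        # Pick one outgoing transition according to its probability.
        _, state, letter = rng.choices(options, weights=weights, k=1)[0]
        letters.append(letter)
    return "".join(letters)
```

In this toy source the state reached after each transition is determined by the letter just produced, so the state carries exactly one letter of "residue of influence"; for example, `generate(TRANSITIONS, "S1", 20, random.Random(1))` returns a 20-letter string over A and B.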



The situation can be represented graphically as shown in Figs. 3, 4 and 5. The "states" are the junction points in the graph and the probabilities and letters produced for a transition are given beside the corresponding line. Figure 3 is for the example B in Section 2, while Fig. 4 corresponds to the example C. In Fig. 3 there is only one state since successive letters are independent. In Fig. 4 there are as many states as letters. If a trigram example were constructed there would be at most $n^2$ states corresponding to the possible pairs of letters preceding the one being chosen. Figure 5 is a graph for the case of word structure in example D. Here S corresponds to the "space" symbol.
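The state counts mentioned above can be checked with a small sketch. It assumes that a source whose next letter depends on the preceding k letters uses those k letters as its state; the function name `context_states` and the sample string are hypothetical. With an alphabet of n letters there are n**k conceivable contexts, and the bound is "at most" because some contexts may never occur.

```python
from itertools import product

def context_states(text, k):
    """Return the k-letter contexts (candidate states) that occur in `text`."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

sample = "ABACABADABACABA"          # hypothetical sample text
alphabet = sorted(set(sample))      # the n distinct letters
n = len(alphabet)

independent = context_states(sample, 0)  # no memory: a single state
digram = context_states(sample, 1)       # one state per letter
trigram = context_states(sample, 2)      # pairs of preceding letters
# Every conceivable pair of letters: n**2 of them.
all_pairs = {"".join(p) for p in product(alphabet, repeat=2)}
```

For this sample, the pairs that occur form a proper subset of `all_pairs`, so a trigram source built from it would need fewer than n**2 states.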



^ For a detailed treatment see M. Fréchet, "Méthode des fonctions arbitraires. Théorie des événements en chaîne dans le cas d'un nombre fini d'états possibles." Paris, Gauthier-Villars, 1938.



