MATHEMATICAL THEORY OF COMMUNICATION 391 



ess is the same in statistical properties. Thus the letter frequencies, 

 digram frequencies, etc., obtained from particular sequences will, as the 

 lengths of the sequences increase, approach definite limits independent of 

 the particular sequence. Actually this is not true of every sequence but the 

 set for which it is false has probability zero. Roughly the ergodic property 

 means statistical homogeneity. 



All the examples of artificial languages given above are ergodic. This 

 property is related to the structure of the corresponding graph. If the graph 

 has the following two properties^ the corresponding process will be ergodic:



1. The graph does not consist of two isolated parts A and B such that it is 

 impossible to go from junction points in part A to junction points in 

 part B along lines of the graph in the direction of arrows and also

 impossible to go from junctions in part B to junctions in part A.



2. A closed series of lines in the graph with all arrows on the lines pointing

 in the same orientation will be called a "circuit." The "length" of a

 circuit is the number of lines in it. Thus in Fig. 5 the series BEBES 

 is a circuit of length 5. The second property required is that the 

 greatest common divisor of the lengths of all circuits in the graph be 

 one. 
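The two conditions can be checked mechanically. The following is a minimal sketch, not a procedure given in the text: it assumes the graph is stored as a hypothetical adjacency dictionary mapping each junction point to the junction points its arrows lead to, and it tests strong connectivity, which is a stronger requirement than condition 1 as literally worded. For a strongly connected graph, the greatest common divisor of all circuit lengths equals the gcd of dist(u) + 1 - dist(v) taken over every line u -> v, where dist is the breadth-first distance from any fixed junction point. The function names and the two example graphs are illustrative assumptions.

```python
from collections import deque
from math import gcd


def _reachable(adj, start):
    """Set of junction points reachable from start by following arrows."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def is_strongly_connected(graph):
    """True if every junction point can reach every other one along the arrows."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    if not nodes:
        return False
    # Build the graph with every arrow reversed.
    reverse = {u: [] for u in nodes}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)
    start = next(iter(nodes))
    return _reachable(graph, start) == nodes and _reachable(reverse, start) == nodes


def circuit_gcd(graph):
    """GCD of all circuit lengths; assumes graph is non-empty and strongly connected."""
    start = next(iter(graph))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    d = 0
    for u in dist:
        for v in graph.get(u, ()):
            # Each line u -> v contributes dist[u] + 1 - dist[v] (never negative).
            d = gcd(d, dist[u] + 1 - dist[v])
    return d


# Hypothetical example graphs: junction points are labelled by letters.
periodic = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}   # every circuit has length 2
aperiodic = {"a": ["a", "b"], "b": ["a"]}              # contains a circuit of length 1

for name, g in [("periodic", periodic), ("aperiodic", aperiodic)]:
    print(name, is_strongly_connected(g), circuit_gcd(g))
```

A graph passes both checks when `is_strongly_connected` returns `True` and `circuit_gcd` returns 1.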



If the first condition is satisfied but the second one violated by having the 

 greatest common divisor equal to d > 1, the sequences have a certain type 

 of periodic structure. The various sequences fall into d different classes

 which are statistically the same apart from a shift of the origin (i.e., which

 letter in the sequence is called letter 1). By a shift of from 0 up to d - 1

 any sequence can be made statistically equivalent to any other. A simple

 example with d = 2 is the following: There are three possible letters a, b, c.

 Letter a is followed with either b or c with probabilities 1/3 and 2/3

 respectively. Either b or c is always followed by letter a. Thus a typical sequence

 is 



a b a c a c a c a b a c a b a b a c a c



This type of situation is not of much importance for our work. 
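The periodic structure of the three-letter example can be seen in a short simulation. This is an illustrative sketch only: the function names, the fixed seed, and the choice to start every sequence with the letter a are assumptions, not part of the text. It generates a sequence by the stated rule and measures how often a occurs at even and at odd positions separately.

```python
import random


def generate(n, seed=0):
    """Generate n letters: a is followed by b (prob. 1/3) or c (prob. 2/3);
    b or c is always followed by a. Starting with 'a' is an assumption."""
    rng = random.Random(seed)
    seq = ["a"]
    while len(seq) < n:
        if seq[-1] == "a":
            seq.append("b" if rng.random() < 1 / 3 else "c")
        else:
            seq.append("a")
    return seq


def frequency_of_a(seq, offset):
    """Fraction of 'a' among positions offset, offset + 2, offset + 4, ..."""
    picked = seq[offset::2]
    return picked.count("a") / len(picked)


seq = generate(10_000)
print(frequency_of_a(seq, 0))  # positions 0, 2, 4, ...
print(frequency_of_a(seq, 1))  # positions 1, 3, 5, ...

# Shifting the origin by one letter exchanges the two classes.
shifted = seq[1:]
print(frequency_of_a(shifted, 0))
print(frequency_of_a(shifted, 1))
```

With d = 2 the letter a occupies every second position, so the position-wise statistics of a sequence and of the same sequence shifted by one letter differ, while a shift by two letters leaves them unchanged.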



If the first condition is violated the graph may be separated into a set of 

 subgraphs each of which satisfies the first condition. We will assume that 

 the second condition is also satisfied for each subgraph. We have in this 

 case what may be called a "mixed" source made up of a number of pure 

 components. The components correspond to the various subgraphs. 

 If L_1, L_2, L_3, ... are the component sources we may write



L = p_1 L_1 + p_2 L_2 + p_3 L_3 + ...



where p_i is the probability of the component source L_i.
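A mixed source can be sketched in a few lines, reading the formula as follows: one component is selected with probability p_i, and the entire sequence is then produced by that component. Everything concrete below is an assumption made for illustration: the two components are hypothetical sources emitting independent letters a and b, and the weights and letter probabilities are arbitrary.

```python
import random


def pure_source(p_a, n, rng):
    """Hypothetical pure component: independent letters, 'a' with probability p_a, else 'b'."""
    return ["a" if rng.random() < p_a else "b" for _ in range(n)]


def mixed_source(weights, components, n, rng):
    """Select component i with probability weights[i] once, then emit n letters from it."""
    i = rng.choices(range(len(components)), weights=weights)[0]
    return i, pure_source(components[i], n, rng)


rng = random.Random(1)
weights = [0.5, 0.5]      # p_1, p_2 (assumed values)
components = [0.2, 0.8]   # probability of 'a' in L_1 and in L_2 (assumed values)

for _ in range(4):
    i, seq = mixed_source(weights, components, 20_000, rng)
    # Print the selected component (1-based) and the observed frequency of 'a'.
    print(i + 1, round(seq.count("a") / len(seq), 3))
```

Each long sequence shows the letter frequencies of whichever component produced it, so the limits depend on the particular sequence. This is the sense in which the mixed source, unlike its pure components, lacks the ergodic property.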



^ These are restatements in terms of the graph of conditions given in Fréchet.



