


of the channel, by the use of proper encoding of the information. In
telegraphy, for example, the messages to be transmitted consist of sequences
of letters. These sequences, however, are not completely random. In
general, they form sentences and have the statistical structure of, say,
English. The letter E occurs more frequently than Q, the sequence TH more
frequently than XP, etc. The existence of this structure allows one to
make a saving in time (or channel capacity) by properly encoding the
message sequences into signal sequences. This is already done to a limited
extent in telegraphy by using the shortest channel symbol, a dot, for the most
common English letter E; while the infrequent letters, Q, X, Z are
represented by longer sequences of dots and dashes. This idea is carried still
further in certain commercial codes where common words and phrases are
represented by four- or five-letter code groups with a considerable saving in
average time. The standardized greeting and anniversary telegrams now
in use extend this to the point of encoding a sentence or two into a relatively
short sequence of numbers.
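The saving obtained by giving short code words to frequent symbols can be illustrated numerically. The following is a minimal sketch, not a telegraph code: the four-symbol alphabet, its probabilities, and both code tables are assumptions chosen for illustration, and the helper names `average_length` and `encode` are hypothetical.

```python
# Hypothetical four-symbol source. The probabilities are illustrative
# assumptions, not measured letter frequencies.
probabilities = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Fixed-length code: every symbol costs two binary digits.
fixed_code = {"A": "00", "B": "01", "C": "10", "D": "11"}

# Variable-length prefix code: the more probable a symbol,
# the shorter its code word.
variable_code = {"A": "0", "B": "10", "C": "110", "D": "111"}


def average_length(code, probs):
    """Expected number of code digits per source symbol."""
    return sum(probs[symbol] * len(code[symbol]) for symbol in probs)


def encode(message, code):
    """Concatenate the code words for each symbol of the message."""
    return "".join(code[symbol] for symbol in message)


fixed_avg = average_length(fixed_code, probabilities)
variable_avg = average_length(variable_code, probabilities)
print(fixed_avg, variable_avg)  # 2.0 1.75
```

Under these assumed probabilities the variable-length code uses 1.75 digits per symbol on the average against 2 for the fixed-length code, which is the kind of saving in time the telegraph codes exploit.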



We can think of a discrete source as generating the message, symbol by
symbol. It will choose successive symbols according to certain probabilities
depending, in general, on preceding choices as well as the particular symbols
in question. A physical system, or a mathematical model of a system which
produces such a sequence of symbols governed by a set of probabilities, is
known as a stochastic process.³ We may consider a discrete source,
therefore, to be represented by a stochastic process. Conversely, any stochastic
process which produces a discrete sequence of symbols chosen from a finite
set may be considered a discrete source. This will include such cases as:



1. Natural written languages such as English, German, Chinese. 



2. Continuous information sources that have been rendered discrete by some 

 quantizing process. For example, the quantized speech from a PCM 

 transmitter, or a quantized television signal. 



3. Mathematical cases where we merely define abstractly a stochastic
 process which generates a sequence of symbols. The following are
 examples of this last type of source.



(A) Suppose we have five letters A, B, C, D, E which are chosen each
 with probability .2, successive choices being independent. This
 would lead to a sequence of which the following is a typical example.

 BDCBCECCCADCBDDAAECEEA
 ABBDAEECACEEBAEECBCEAD

 This was constructed with the use of a table of random numbers.⁴
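A source of type (A) can be imitated with a pseudo-random number generator in place of a table of random numbers. The following is a minimal sketch under that substitution; the function name `generate_source_a` and the seed are assumptions for illustration, and the sequence it prints will not reproduce the example above.

```python
import random

# The five letters of example (A), each chosen with probability .2.
ALPHABET = "ABCDE"


def generate_source_a(length, seed=None):
    """Draw `length` independent, equiprobable letters from ALPHABET.

    A seeded pseudo-random generator stands in for the table of
    random numbers; this is an assumption of the sketch.
    """
    rng = random.Random(seed)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


# Two rows of 22 letters, laid out like the example in the text.
sample = generate_source_a(44, seed=0)
print(sample[:22])
print(sample[22:])
```

Because successive choices are independent, the generator keeps no memory of the letters already produced; a source whose probabilities depend on preceding choices would need to carry that state from one draw to the next.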



3 See, for example, S. Chandrasekhar, "Stochastic Problems in Physics and Astronomy,"
 Reviews of Modern Physics, v. 15, No. 1, January 1943, p. 1.



4 Kendall and Smith, "Tables of Random Sampling Numbers," Cambridge, 1939.



