PREDICTION AND ENTROPY OF PRINTED ENGLISH 






tables, a table of the frequencies of initial letters in words, a list of the
frequencies of common words and a dictionary. The samples in this experiment
were from "Jefferson the Virginian" by Dumas Malone. These results, together
with a similar test in which 100 letters were known to the subject, are
summarized in Table I. The column corresponds to the number of preceding
letters known to the subject plus one; the row is the number of the guess.
The entry in column N at row S is the number of times the subject guessed
the right letter at the Sth guess when (N-1) letters were known. For example,



Table I 



the entry 19 in column 6, row 2, means that with five letters known the
correct letter was obtained on the second guess nineteen times out of the
hundred. The first two columns of this table were not obtained by the
experimental procedure outlined above but were calculated directly from the
known letter and digram frequencies. Thus with no known letters the most
probable symbol is the space (probability .182); the next guess, if this is
wrong, should be E (probability .107), etc. These probabilities are the
frequencies with which the right guess would occur at the first, second, etc.,
trials with best prediction. Similarly, a simple calculation from the digram
table gives the entries in column 1 when the subject uses the table to best



