708 BELL SYSTEM TECHNICAL JOURNAL 



depending only on the particular key K that was used. The value thus

 obtained serves to limit the possible keys to those which would give values of

 S in the neighborhood of that observed. A statistic which does not depend 

 on K or which varies as much with M as with K is not of value in limiting 

 K. Thus, in transposition ciphers, the frequency count of letters gives no 

 information about K — every K leaves this statistic the same. Hence one 

 can make no use of a frequency count in breaking transposition ciphers. 
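
The point about transposition can be checked with a small sketch. The block transposition below, the helper name `transpose`, and the sample message are illustrative assumptions, not taken from the text; the key is taken to be a permutation of positions within a fixed-length block.

```python
from collections import Counter
from itertools import permutations


def transpose(message, key):
    """Block transposition (hypothetical helper): permute the letters
    inside each block of len(key) characters according to key."""
    d = len(key)
    if len(message) % d != 0:
        raise ValueError("message length must be a multiple of the block length")
    out = []
    for start in range(0, len(message), d):
        block = message[start:start + d]
        out.extend(block[j] for j in key)
    return "".join(out)


message = "ATTACKATDAWN"  # arbitrary sample message, 12 letters
keys = list(permutations(range(3)))  # all 6 keys for block length 3

# The letter frequency count of the cryptogram, one per key.
counts_per_key = [Counter(transpose(message, key)) for key in keys]

# Every key leaves the frequency count equal to that of the message.
count_is_key_independent = all(c == Counter(message) for c in counts_per_key)
print(count_is_key_independent)
```

Since the count is the same for all six keys, observing it cannot narrow down which key was used, although the cryptograms themselves differ from key to key.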



More precisely, one can ascribe a "solving power" to a given statistic S.

 For each value of S there will be a conditional equivocation of the key 

 H_S(K), the equivocation when S has its particular value, and that is all

 that is known concerning the key. The weighted mean of these values 



$$\sum_S P(S) \, H_S(K)$$



gives the mean equivocation of the key when S is known, P(S) being the 

 a priori probability of the particular value S. The key size H(K), less this

 mean equivocation, measures the "solving power" of the statistic S. 
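
As a rough numerical illustration of this definition, the sketch below computes the key size minus the mean equivocation from a joint distribution over key and statistic. The function names and the toy distributions (four equally likely keys) are assumptions made for the example, not part of the text.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy, in bits, of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def solving_power(joint):
    """Key size H(K) minus the mean equivocation of the key given S.

    joint maps (key, statistic_value) -> joint probability P(K, S).
    """
    p_key = defaultdict(float)
    p_stat = defaultdict(float)
    for (k, s), p in joint.items():
        p_key[k] += p
        p_stat[s] += p

    key_size = entropy(p_key.values())

    mean_equivocation = 0.0
    for s, ps in p_stat.items():
        if ps == 0:
            continue
        # Conditional distribution of the key when S takes the value s.
        conditional = [p / ps for (k, s2), p in joint.items() if s2 == s]
        mean_equivocation += ps * entropy(conditional)

    return key_size - mean_equivocation


# Toy case 1: the statistic reveals the parity of the key.
parity = {(k, k % 2): 0.25 for k in range(4)}

# Toy case 2: the statistic is independent of the key.
independent = {(k, s): 0.125 for k in range(4) for s in range(2)}

print(solving_power(parity))
print(solving_power(independent))
```

In the first toy case the statistic splits the four keys into two equally likely pairs, so it is worth one bit of the two-bit key size; in the second it is worth nothing, matching the remark that a statistic which does not depend on K is of no value in limiting K.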



In a strongly ideal cipher all statistics of the cryptogram are independent 

 of the particular key used. This is the measure preserving property of 

 T_j T_k^{-1} on the E space or T_j^{-1} T_k on the M space mentioned above.



There are good and poor statistics, just as there are good and poor methods 

 of trial and error. Indeed the trial and error testing of an hypothesis is

 a type of statistic, and what was said above regarding the best types of

 trials holds generally. A good statistic for solving a system must have the 

 following properties: 



1. It must be simple to measure. 



2. It must depend more on the key than on the message if it is meant to 

 solve for the key. The variation with M should not mask its variation 

 with K. 



3. The values of the statistic that can be "resolved" in spite of the 

 "fuzziness" produced by variation in M should divide the key space 

 into a number of subsets of comparable probability, with the statistic 

 specifying the one in which the correct key lies. The statistic should 

 give us sizeable information about the key, not a tiny fraction of a bit. 



4. The information it gives must be simple and usable. Thus the subsets 

 in which the statistic locates the key must be of a simple nature in the 

 key space. 



Frequency count for simple substitution is an example of a very good 

 statistic. 
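
A minimal sketch of why the frequency count works so directly here: under simple substitution each cipher letter inherits the count of the plaintext letter it replaces, so the count varies with the key in a simple, usable way. The shifted-alphabet key and the sample plaintext below are arbitrary assumptions chosen for illustration; any permutation of the alphabet would behave the same way.

```python
import string
from collections import Counter

alphabet = string.ascii_uppercase

# Hypothetical key: a shifted alphabet stands in for an arbitrary permutation.
shifted = alphabet[7:] + alphabet[:7]
key = dict(zip(alphabet, shifted))


def substitute(message, key):
    """Simple substitution; characters outside the key pass through unchanged."""
    return "".join(key.get(ch, ch) for ch in message)


plaintext = "THE ENEMY KNOWS THE SYSTEM BEING USED"  # arbitrary sample text
ciphertext = substitute(plaintext, key)

plain_counts = Counter(ch for ch in plaintext if ch in alphabet)
cipher_counts = Counter(ch for ch in ciphertext if ch in alphabet)

# Each cipher letter carries the count of the plaintext letter it replaces.
counts_carry_over = all(
    cipher_counts[key[p]] == n for p, n in plain_counts.items()
)

# The most frequent cipher letter is the image of the most frequent plain letter.
most_common_plain = plain_counts.most_common(1)[0][0]
most_common_cipher = cipher_counts.most_common(1)[0][0]
print(counts_carry_over, most_common_cipher == key[most_common_plain])
```

With a longer cryptogram the plaintext counts are not known, but they can be estimated from the language statistics, which is what lets the observed count pick out a small, simply described set of candidate keys.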



Two methods (other than recourse to ideal systems) suggest themselves 

 for frustrating a statistical analysis. These we may call the methods of 

 diffusion and confusion. In the method of diffusion the statistical structure

 of M which leads to its redundancy is "dissipated" into long range sta- 



