COMMUNICATION THEORY OF SECRECY SYSTEMS 709



tistics— i.e., into statistical structure involving long combinations of letters 

 in the cryptogram. The effect here is that the enemy must intercept a

 tremendous amount of material to tie down this structure, since the structure

 is evident only in blocks of very small individual probability. Furthermore, 

 even when he has sufficient material, the analytical work required is much 

 greater since the redundancy has been diffused over a large number of 

 individual statistics. An example of diffusion of statistics is operating on a 

 message $M = m_1, m_2, m_3, \ldots$ with an "averaging" operation, e.g.

$$y_n = \sum_{i=1}^{s} m_{n+i} \pmod{26},$$

adding $s$ successive letters of the message to get a letter $y_n$. One can show

 that the redundancy of the y sequence is the same as that of the m sequence,

 but the structure has been dissipated. Thus the letter frequencies in y will 

 be more nearly equal than in m, the digram frequencies also more nearly 

 equal, etc. Indeed any reversible operation which produces one letter out for 

 each letter in and does not have an infinite "memory" has an output with 

 the same redundancy as the input. The statistics can never be eliminated 

 without compression, but they can be spread out. 
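
The averaging operation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: letters are mapped A = 0, ..., Z = 25, indices start at 0, and the output stops where fewer than s following letters remain. The function names `diffuse` and `max_letter_share` are hypothetical and do not come from the paper.

```python
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def diffuse(message, s):
    """Return y with y[n] = (m[n+1] + ... + m[n+s]) mod 26 for n = 0, 1, ...

    Assumption: non-letters are dropped and A = 0, ..., Z = 25.
    """
    m = [ALPHABET.index(c) for c in message.upper() if c in ALPHABET]
    # Each output letter is the sum of the s letters that follow position n.
    y = [sum(m[n + 1 : n + 1 + s]) % 26 for n in range(len(m) - s)]
    return "".join(ALPHABET[v] for v in y)


def max_letter_share(text):
    """Fraction of the text taken up by its most frequent letter (0.0 if empty)."""
    if not text:
        return 0.0
    counts = Counter(text)
    return max(counts.values()) / len(text)


if __name__ == "__main__":
    sample = "THE STRUCTURE OF THE MESSAGE IS SPREAD OVER MANY LETTERS OF THE CRYPTOGRAM"
    letters = "".join(c for c in sample if c in ALPHABET)
    print(max_letter_share(letters), max_letter_share(diffuse(sample, 5)))
```

On a long sample of English one would expect `max_letter_share` of the diffused sequence to be noticeably closer to 1/26 than that of the original, which is the flattening of letter frequencies described in the text; the short sample here is too small to show this reliably.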



The method of confusion is to make the relation between the simple 

 statistics of E and the simple description of K a very complex and involved

 one. In the case of simple substitution, it is easy to describe the limitation

 of K imposed by the letter frequencies of E. If the connection is very

 involved and confused the enemy may still be able to evaluate a statistic

 $S_1$, say, which limits the key to a region of the key space. This limitation,

 however, is to some complex region R in the space, perhaps "folded over"

 many times, and he has a difficult time making use of it. A second statistic

 $S_2$ limits K still further to $R_2$, hence it lies in the intersection region; but

 this does not help much because it is so difficult to determine just what the 

 intersection is. 
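
The remark about simple substitution can be made concrete with a small sketch. In a simple substitution the letter counts of E are just a relabeling of the letter counts of M, so a simple statistic of E (its most frequent letter) translates directly into a simple statement about K (the image of the most frequent plaintext letter). The helper names `make_key`, `substitute` and `count_profile`, and the use of a seeded shuffle to pick a key, are assumptions made for illustration only.

```python
import random
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def make_key(seed):
    """Build a substitution key as a dict from plaintext to cipher letters."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    letters = list(ALPHABET)
    rng.shuffle(letters)
    return dict(zip(ALPHABET, letters))


def substitute(plaintext, key):
    """Encipher letter by letter, dropping anything that is not A-Z."""
    return "".join(key[c] for c in plaintext.upper() if c in key)


def count_profile(text):
    """Letter counts sorted from largest to smallest, ignoring which letter is which."""
    return sorted(Counter(text).values(), reverse=True)


if __name__ == "__main__":
    key = make_key(1)
    message = "ATTACKATDAWN"
    cryptogram = substitute(message, key)
    # The sorted count profile is unchanged by the substitution.
    print(count_profile(message) == count_profile(cryptogram))
    # The most frequent cryptogram letter is the image of the most frequent message letter.
    print(Counter(cryptogram).most_common(1)[0][0] == key["A"])
```

This is the situation the method of confusion is meant to avoid: here each frequency measurement restricts the key in a way that is easy to state, one letter pair at a time.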



To be more precise let us suppose the key space has certain "natural

 coordinates" $k_1, k_2, \ldots, k_p$ which he wishes to determine. He measures, let

 us say, a set of statistics $s_1, s_2, \ldots, s_n$ and these are sufficient to determine

 the $k_i$. However, in the method of confusion, the equations connecting these

 sets of variables are involved and complex. We have, say,

$$
\begin{aligned}
f_1(k_1, k_2, \ldots, k_p) &= s_1 \\
f_2(k_1, k_2, \ldots, k_p) &= s_2 \\
&\;\;\vdots \\
f_n(k_1, k_2, \ldots, k_p) &= s_n,
\end{aligned}
$$



