Information Processing Theory



rigor of the statement of the Principle gives the impression that something has been said, something which sounds very reasonable and powerful.



The weakness of such explanations would seem not to deserve such extensive comment, and yet the problem has come up so frequently, particularly in psychology, that apparently it does not hurt to point out such fallacies. Not all cases are so obvious, unfortunately, and many theories which appear rigorous to the most competent of scientists are found at times to fail on this same count. I shall return to this point later.



The amazing regularity found in the word-frequency data, a regularity which seems to be so hard to come by in the field of human behavior, deserves more serious attempts at explanation. Fortunately, other workers have attacked the same problem; and, fortunately for our purposes today, one such approach illustrates a stochastic theory and another illustrates an application of information-theoretic concepts.



The stochastic model is due to Simon (10). The approach is to postulate probabilistic decision rules and from these to derive the statistical properties of a device which follows the rules. The challenge is to postulate rules which will yield the statistical properties of the observations, in this instance the frequency distribution of words. It should be pointed out that the weaker, i.e., the more general, the underlying postulates are, the better the theory. Thus, as in any theory, we wish to account for as much as possible with as little as necessary.



Simon's basic model rests on only two assumptions. From these he is able to derive a frequency distribution known as the Yule distribution, which has all of the properties required for fitting the word-frequency data. As a matter of fact, slight variations on the assumptions yield slight differences in the resulting distribution. These various forms of the theory can be plausibly associated with various real-world situations, and the theory thus accounts for several phenomena, such as the distribution of authors by number of professional papers published, the distribution of incomes, and the distribution of biological species by genera. Furthermore, the steady-state statistical properties are fairly insensitive to minor changes in the assumptions.
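
The idea of postulating rules and then examining the statistics of a device which follows them can be tried out numerically. The sketch below is only an illustration under stated assumptions, since the two assumptions are not spelled out in the passage above. It assumes the form in which Simon's model is usually presented: (1) the probability that the next word is one which has already appeared exactly i times is proportional to i times the number of such words, and (2) there is a constant probability alpha that the next word is a new one. The names simulate_simon, yule_pmf, compare, alpha, and rho are hypothetical and chosen for this example; the normalization f(i) = rho * B(i, rho + 1) with rho = 1 / (1 - alpha) is the standard form of the Yule distribution, not a formula quoted from the text.

```python
import random
from collections import Counter
from math import exp, lgamma


def simulate_simon(n_words, alpha, seed=0):
    """Generate n_words tokens under the two assumed rules.

    Returns a Counter mapping each word id to its number of occurrences.
    """
    rng = random.Random(seed)
    text = [0]      # the text so far, started with a single word (id 0)
    next_id = 1
    while len(text) < n_words:
        if rng.random() < alpha:
            # Assumed rule 2: with constant probability alpha, a new word.
            text.append(next_id)
            next_id += 1
        else:
            # Assumed rule 1: repeat an earlier token chosen uniformly, so a
            # word already used i times is picked with probability
            # proportional to i.
            text.append(rng.choice(text))
    return Counter(text)


def yule_pmf(i, rho):
    """Yule distribution f(i) = rho * B(i, rho + 1), computed via log-gamma."""
    log_beta = lgamma(i) + lgamma(rho + 1) - lgamma(i + rho + 1)
    return rho * exp(log_beta)


def compare(n_words=200_000, alpha=0.1, max_i=5, seed=0):
    """Return (i, simulated share, Yule share) for i = 1..max_i.

    The share is the fraction of distinct words occurring exactly i times.
    """
    counts = simulate_simon(n_words, alpha, seed)
    words_by_freq = Counter(counts.values())  # i -> number of words used i times
    n_distinct = len(counts)
    rho = 1.0 / (1.0 - alpha)
    return [(i, words_by_freq[i] / n_distinct, yule_pmf(i, rho))
            for i in range(1, max_i + 1)]


if __name__ == "__main__":
    for i, observed, predicted in compare():
        print(f"i={i}: simulated {observed:.3f}, Yule {predicted:.3f}")
```

Choosing an earlier token uniformly at random is one convenient way to realize the first assumed rule, because a word that occupies i positions in the text is then selected with probability proportional to i. Changing alpha, or the way the text is started, offers a simple way to look at how sensitive the resulting distribution is to the assumptions.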



