But that is only for one class-range of D, 51.25 to 51.50,

 and we must do the same for every range. This will give some 

 idea of the mathematical reasoning involved. It is not the 

 observed frequencies that we must deal with, but the relative 

 ones, and so each is divided by the total frequency; thus f/n is

 taken instead of f. The samples are of different size, and so

 this division is necessary. Also the negative divergences (less than

 the mean) must be added to the positive ones (greater than the

 mean), and so we have to square the differences between the relative

 frequencies, so that all of them will be positive. Finally, we calculate the sum:



$$\sum \frac{\left(\dfrac{f}{n} - \dfrac{f'}{n'}\right)^{2}}{f + f'} = s.$$



f and f' are the two series of frequencies and n and n' the

 two total frequencies. Having obtained s we calculate

 χ² = s × n × n'. We look up χ² and also the number of

 classes in our distributions in Pearson's Tables for Statisticians

 and so get a value, P. This gives us the probability

 that the two distributions that we compare are both 

 representative of the same population and that the differences 

 between them are only due to errors of random sampling. 

 When P = 0.1 the chances are 1 in 10 that differences as great

 or greater than those observed are due to random sampling;

 if P is 0.01 the chances are 1 in 100, and so on. Let, for

 instance, P = 0.1; then once in every ten times, or samples,

 random sampling would give differences as great as, or greater

 than, those we observe. 
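
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of the printed tables: the function names are hypothetical, the frequencies in the usage note are invented, and P is computed from the closed-form series for the chi-square upper tail rather than looked up. It is also assumed that entering the tables with the number of classes corresponds to taking degrees of freedom equal to that number less one.

```python
import math


def chi_square_two_samples(f, f_prime):
    """Chi-square for two frequency series over the same classes.

    s = sum over classes of (f/n - f'/n')^2 / (f + f'), then chi2 = s * n * n'.
    """
    n = sum(f)
    n_prime = sum(f_prime)
    s = 0.0
    for a, b in zip(f, f_prime):
        if a + b == 0:
            continue  # assumption: a class empty in both samples is skipped
        s += (a / n - b / n_prime) ** 2 / (a + b)
    return s * n * n_prime


def chi_square_p(chi2, df):
    """Upper-tail probability P for a positive integer number of degrees of freedom."""
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in x / (1*3*5*...)
    term, total = 1.0, 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= chi2 / (2 * k - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-half) * total)


def compare_distributions(f, f_prime):
    """Return (chi2, P), taking df = number of classes - 1 (an assumption)."""
    chi2 = chi_square_two_samples(f, f_prime)
    return chi2, chi_square_p(chi2, len(f) - 1)
```

For example, with the made-up series `f = [10, 20]` and `f_prime = [20, 10]`, `compare_distributions(f, f_prime)` gives a χ² of about 6.67 on one degree of freedom. Two series with identical relative frequencies give χ² = 0 and P = 1.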



Elderton, in Frequency-Curves and Correlation, p. 144, 

 paragraph 7, says: "The only other point to which reference

 is necessary is the actual value of P at which a good fit ends 

 and a bad one begins. It is impossible to fix such a value. We 

 have merely a measure of probability for the whole table, and if 

 the odds against the graduation are twenty or thirty to one 



