376 Prof. Karl Pearson on the Influence of 



Table II.

Percentage.    Frequency.    Sum of Frequency.

     0           132.8            132.8
     1           258.2            391.0
     2           260.6            651.6
     3           181.6            833.2
     4            98.2            931.4
     5            43.8            975.2
     6            16.8            992.0
     7             5.7            997.7
     8             1.7            999.4
     9             0.5            999.9
    10             0.1           1000.0

Total ...       1000.0


arranged as in Table I. We see at once that there is no
approach to a Gaussian curve. If we proceed by simple
interpolation we find the median at 1.919, the first quartile
at .954, and the second at 3.042; these give a probable error
in defect of .965 and in excess of 1.123, or a mean probable
error of 1.044, somewhat greater than the value deduced
from the standard deviation or 1.013, and considerably greater
than that given by the usual theory. Here, as in the last
illustration, we see a sensibly larger chance of the second
sample exceeding than of its falling short of the percentage
given by the long first sample. We might perhaps express
this best by saying that the expectancy is 50 p. c. of samples
with a percentage of the character between 0 and 1.92 and
50 p. c. with a percentage between 1.92 and 10.
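
The interpolation just described can be checked with a short sketch. It assumes (this is not stated in the text) that each percentage k in Table II stands for the group running from k - 0.5 to k + 0.5, and that the probable error "deduced from the standard deviation" is 0.6745 times the raw standard deviation of the table. The names `FREQ`, `quantile` and `moment_sd` are illustrative only.

```python
# Table II: frequency per thousand of each percentage 0..10.
FREQ = [132.8, 258.2, 260.6, 181.6, 98.2, 43.8, 16.8, 5.7, 1.7, 0.5, 0.1]


def quantile(freq, q):
    """Quantile by linear interpolation within a group.

    Assumption: group k covers the interval (k - 0.5, k + 0.5).
    """
    target = q * sum(freq)
    cum = 0.0
    for k, f in enumerate(freq):
        if cum + f >= target:
            return (k - 0.5) + (target - cum) / f
        cum += f
    raise ValueError("q must lie between 0 and 1")


def moment_sd(freq):
    """Standard deviation of the grouped table, no grouping correction."""
    total = sum(freq)
    mean = sum(k * f for k, f in enumerate(freq)) / total
    var = sum(f * (k - mean) ** 2 for k, f in enumerate(freq)) / total
    return var ** 0.5


lower_quartile = quantile(FREQ, 0.25)
median = quantile(FREQ, 0.50)
upper_quartile = quantile(FREQ, 0.75)

pe_defect = median - lower_quartile      # probable error in defect
pe_excess = upper_quartile - median      # probable error in excess
pe_mean = (pe_defect + pe_excess) / 2    # mean probable error
pe_from_sd = 0.6745 * moment_sd(FREQ)    # probable error from the s.d.

print(lower_quartile, median, upper_quartile)
print(pe_defect, pe_excess, pe_mean, pe_from_sd)
```

Under these assumptions the results agree with the figures quoted above to within about one unit in the third decimal place; the small differences presumably come from rounding in the printed table.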



To show the sort of erroneous inference that may be drawn,
I suppose a percentage of 6 to have been observed in a second
sample. Table II. shows that a percentage as large as this
actually occurs with a frequency of 24.8 in the thousand,
or its improbability is only measured by 39 to 1. On the
other hand, if we adopt a Gaussian distribution we should
only expect a deviation as large as 4 from the mean to occur
in excess 3.5 times in the 1000 trials; or the odds are 285
to 1 against its occurrence *. Such long odds would reasonably
lead us to suppose some modification in the population,
since the first sample was drawn. The error of such a
supposition really flows, however, from the assumption of a
normal frequency distribution.
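
The two sets of odds can be compared numerically. The sketch below is illustrative only: it takes the tail frequency directly from Table II and evaluates the Gaussian upper tail with the standard library's `math.erfc`, at the two multiples of the standard deviation (2.70 and 2.86) that the text and its footnote mention. The helper names are hypothetical.

```python
import math

# Table II: frequency per thousand of each percentage 0..10.
FREQ = [132.8, 258.2, 260.6, 181.6, 98.2, 43.8, 16.8, 5.7, 1.7, 0.5, 0.1]


def odds_against(p):
    """Odds against an event of probability p, expressed as 'x to 1'."""
    return (1.0 - p) / p


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Frequency per thousand of a percentage of 6 or more, from the table.
tail_per_thousand = sum(FREQ[6:])
odds_table = odds_against(tail_per_thousand / 1000.0)

# Gaussian tail at 2.70 and at 2.86 times the standard deviation.
p_270 = normal_upper_tail(2.70)
p_286 = normal_upper_tail(2.86)
odds_270 = odds_against(p_270)
odds_286 = odds_against(p_286)

print(tail_per_thousand, odds_table)
print(1000 * p_270, odds_270)
print(1000 * p_286, odds_286)
```

The table gives odds of roughly 39 to 1, while the Gaussian tails give odds of several hundreds to 1, which is the contrast the paragraph draws. The Gaussian odds come out a few units away from the printed 285 and 475, presumably because the printed values were worked from rounded tail areas.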



Looked at analytically we find : 



β₁ = .5836, β₂ = 3.6468;



or the curve is very sensibly skew and leptokurtic. 



We find κ₁ = .4572, > 0, or the curve is of Type I. We get



r = 27.0761, ε = 87.3453, and m₁ = 2.7435, m₂ = 22.3326;



* 475 to 1 against so large a deviation, if we adopt the old theory, for
which such a deviation is 2.86 and not 2.70 times the standard deviation.
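
The Type I constants quoted above can be reproduced from β₁ and β₂ alone. The sketch below assumes the usual moment relations for a Type I curve: κ₁ = 6 + 3β₁ - 2β₂, r = 6(β₂ - β₁ - 1)/κ₁, ε = r²/(4 + ¼β₁(r + 2)²/(r + 1)), with m₁ + 1 and m₂ + 1 the roots of z² - rz + ε = 0. The text does not state these relations on this page, so their form and the function name are assumptions, adopted here because they return the printed values.

```python
import math

# Values of beta_1 and beta_2 as printed in the text.
BETA1 = 0.5836
BETA2 = 3.6468


def type_one_constants(beta1, beta2):
    """Return (kappa1, r, eps, m1, m2) for a Type I curve.

    Assumed relations (not stated on this page of the source):
      kappa1 = 6 + 3*beta1 - 2*beta2
      r      = 6*(beta2 - beta1 - 1) / kappa1
      eps    = r^2 / (4 + beta1*(r + 2)^2 / (4*(r + 1)))
      m1 + 1, m2 + 1 are the roots of z^2 - r*z + eps = 0
    """
    kappa1 = 6.0 + 3.0 * beta1 - 2.0 * beta2
    r = 6.0 * (beta2 - beta1 - 1.0) / kappa1
    eps = r ** 2 / (4.0 + 0.25 * beta1 * (r + 2.0) ** 2 / (r + 1.0))
    root = math.sqrt(r ** 2 - 4.0 * eps)
    m1 = (r - root) / 2.0 - 1.0
    m2 = (r + root) / 2.0 - 1.0
    return kappa1, r, eps, m1, m2


kappa1, r, eps, m1, m2 = type_one_constants(BETA1, BETA2)
print(kappa1, r, eps, m1, m2)
```

With the printed β₁ and β₂ this gives values agreeing with .4572, 27.0761, 87.3453, 2.7435 and 22.3326 to about the third decimal place. Because κ₁ is small, r is sensitive to the last figure retained in β₁ and β₂.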



