ON THE PROBABILITY THAT TWO INDEPENDENT
DISTRIBUTIONS OF FREQUENCY ARE REALLY SAMPLES
OF THE SAME POPULATION, WITH SPECIAL
REFERENCE TO RECENT WORK ON THE IDENTITY OF
TRYPANOSOME STRAINS
By KARL PEARSON, F.R.S. 
(1) In Biometrika, Vol. viii. p. 250, I discussed fully the mathematical 
process requisite for measuring the probability that two independent distributions
of frequency are really samples of the same population. As far as I am aware this 
is the only complete theory of the subject which has been published. I believe it 
to be scientifically adequate, and it has already been applied to a large number of 
problems*. 
Before that paper was published, it had been usual to compare any constants of 
two frequency distributions together, and by a due consideration of their difference 
relative to the combination of their probable errors to determine the probability of 
the identity of those constants. This could be repeated for any number of
corresponding constants, and if theoretical curves of frequency had been fitted, their
divergence or correspondence measured by the divergence or correspondence of 
their complete series of constants. The method above referred to, however, as 
based on the general theory of sampling, calls for no hypothesis as to the general 
theory of frequency. It takes the observed distributions and measures the
probability that both are samples from a large population. The population may be
homogeneous or heterogeneous; provided the samples are truly random samples
we obtain a measure of the probability of their common origin. 
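
As an illustration of the sampling test described above, the following Python sketch computes χ² for two frequency distributions classed into the same groups, and then the probability P of a divergence at least as great arising between two random samples of one population. It uses a standard form of the two-sample statistic, χ² = S{NN'(f/N − f'/N')²/(f + f')}, where f and f' are the group frequencies and N and N' the sample totals. The function names, the example frequencies, and the choice of degrees of freedom (number of groups less one, the modern convention) are assumptions made for illustration and are not taken from the paper.

```python
import math


def chi_square_two_samples(f, g):
    """Chi-square between two frequency distributions over the same groups.

    f, g: sequences of observed counts, one entry per group (hypothetical inputs).
    """
    if len(f) != len(g):
        raise ValueError("both distributions must use the same groups")
    n, m = sum(f), sum(g)
    chi2 = 0.0
    for a, b in zip(f, g):
        if a + b == 0:
            continue  # a group empty in both samples contributes nothing
        chi2 += n * m * (a / n - b / m) ** 2 / (a + b)
    return chi2


def chi_square_tail(chi2, df):
    """Probability that a chi-square variate with integer df exceeds chi2."""
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df: finite exponential series.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (j + 0.5)
    return total


# Invented frequencies, for illustration only.
sample_a = [10, 20, 30, 25, 15]
sample_b = [12, 18, 33, 22, 15]

chi2 = chi_square_two_samples(sample_a, sample_b)
df = len(sample_a) - 1  # assumption: groups less one
p = chi_square_tail(chi2, df)
print(f"chi-square = {chi2:.4f}, df = {df}, P = {p:.4f}")
```

A value of P near unity indicates that the two distributions differ no more than random samples of one population commonly would; a very small P indicates that a common origin is improbable. Note that the sketch requires no fitted frequency curve, only the grouped counts themselves.
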
In the course of a long statistical experience I have learnt that it is wholly 
impossible to reach any safe conclusions as to the identity or non-identity of 
populations by any process of mere graphical comparison of frequency distributions. 
* In actual practice the χ² test of "goodness of fit" should always be made with not too fine
grouping at the terminals, especially when any group in the tails appears to be contributing largely to the
total of χ². This point was recognised ab initio (Phil. Mag. Vol. L. p. 164), and has recently been
re-emphasised by Edgeworth, Journal R. Statistical Society, Vol. LXXVII. p. 198.
