ON A NOVEL METHOD OF REGARDING ASSOCIATION



absolutely no valid theoretical objection to this method of reckoning the relationship

 of two characters. Practically it suffers from the transcendent difficulty of

 mentally appreciating the relative differences of indefinitely large improbabilities. 

 In order to surmount this difficulty I propose to think in a scale of correlations;

 I ask what r would have equal improbability if it arose from a random sampling 

 of Gaussian material at the same dividing lines. In a paper now in type I have 

 given tables from which the probable error of r can be readily found for a given 

 division. I use these tables to find $_0\sigma_r$. From an extended Elderton's Table for

 "Goodness of Fit," I find log P. I then determine on what appears to be a reasonable

 hypothesis a value of the correlation coefficient which would be equally

 improbable; and thus reduce my improbability of independence to a mentally

 apprehensible scale. Thus the coefficient of correlation is merely used as a standard 

 of improbability, and we pledge ourselves to no hypothesis as to frequency distribution,

 of which in many cases we know nothing. Still as we wish to approach

 fairly closely to the actual value of the correlation when the distribution is Gaussian, 

 we select by preference a standard scale of correlation improbabilities, which will not

 contradict Gaussian results, when the fourfold table is of that character. We should 

 not anticipate absolute agreement, for the reasons already stated, and it would be a 

 sufficient justification of our method, if the results obtained by it, when the material 

 is truly Gaussian, lie within a range limited by twice the probable error taken 

 on either side of the Gaussian value. 



We have next to consider how the frequency of correlation coefficients for an 

 actual value zero is to be distributed in large random samples. We cannot use a 

 normal curve of standard deviation $_0\sigma_r$; for it is quite obvious that the tails of this

 will extend beyond the limits -1 to +1, and although such a curve is quite legitimate

 for ordinary probabilities in the neighbourhood of r = 0, it is wholly inadequate when

 we have to ask for example what is the probability that r will equal 0.8, when

 its actual value is zero, for say a sample of 1000. The curve of distribution of 

 r must be symmetrical about r = 0, and vanish for r = ±1. The only one of my

 generalised frequency curves* which satisfies these conditions is Type II, i.e. 



$$y = y_0 (1 - x^2)^p \qquad \text{(xi)}.$$
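The objection to the normal curve can be illustrated numerically. What follows is a minimal sketch in Python and no part of the original argument; the function name and the trial standard deviations are assumptions chosen purely for illustration. It evaluates the share of a zero-centred normal curve that falls outside the admissible range -1 to +1.

```python
import math


def normal_mass_outside_unit_interval(sigma):
    """Two-sided probability that a normal variate with mean 0 and
    standard deviation `sigma` lies outside the interval [-1, +1].

    `sigma` is a hypothetical trial value, not one taken from the tables
    mentioned in the text.
    """
    return math.erfc(1.0 / (sigma * math.sqrt(2.0)))


# Illustrative trial values only.
for sigma in (0.5, 0.1):
    print(sigma, normal_mass_outside_unit_interval(sigma))
```

However small the standard deviation, the result is never exactly zero, so a normal curve always assigns some frequency to impossible values of r. A curve bounded at ±1 avoids this.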



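If N be the total frequency, $N = y_0 \int_{-1}^{+1} (1 - x^2)^p \, dx$, and

$$N\mu_2 = N\sigma^2 = y_0 \int_{-1}^{+1} (1 - x^2)^p x^2 \, dx = -\frac{y_0}{2(p+1)} \int_{-1}^{+1} x \, d(1 - x^2)^{p+1} = \frac{y_0}{2(p+1)} \int_{-1}^{+1} (1 - x^2)^{p+1} \, dx = \frac{N(1 - \sigma^2)}{2(p+1)}.$$

Hence $\sigma^2 = \dfrac{1}{2p+3}$, or $p = \tfrac{1}{2}\left(\dfrac{1}{\sigma^2} - 3\right)$.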
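The relation between the exponent and the standard deviation of a curve of the form y ∝ (1 − x²)^p on the range -1 to +1, namely σ² = 1/(2p + 3), can be checked by direct quadrature, and the same curve can then be used to turn a given improbability into an equally improbable r. What follows is a minimal sketch under stated assumptions, not the author's tabular procedure. All function names are hypothetical. The two-sided tail convention is an assumption, and the exponent is taken to be non-negative.

```python
def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def type2_kernel(x, p):
    """Unnormalised ordinate (1 - x^2)^p on [-1, 1]; assumes p >= 0."""
    return max(0.0, 1.0 - x * x) ** p


def type2_variance(p):
    """Second moment about zero divided by the total area."""
    area = simpson(lambda x: type2_kernel(x, p), -1.0, 1.0)
    second = simpson(lambda x: x * x * type2_kernel(x, p), -1.0, 1.0)
    return second / area


def exponent_from_sigma(sigma):
    """Invert sigma^2 = 1 / (2p + 3) for p."""
    return 0.5 * (1.0 / sigma ** 2 - 3.0)


def two_sided_tail(r0, p):
    """Assumed convention: chance that |r| exceeds r0 under the curve."""
    area = simpson(lambda x: type2_kernel(x, p), -1.0, 1.0)
    upper = simpson(lambda x: type2_kernel(x, p), r0, 1.0)
    return 2.0 * upper / area


def equivalent_r(prob, p, tol=1e-10):
    """Bisection for the r0 in [0, 1] whose two-sided tail equals `prob`."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if two_sided_tail(mid, p) > prob:
            lo = mid  # tail still too large: move outwards
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative check with p = 2: the variance should be close to 1/7.
print(type2_variance(2), 1.0 / 7.0)
```

For the very small probabilities contemplated in the text the quadrature would need to be carried out on a logarithmic scale or replaced by tables; the sketch is meant only to show the chain from σ to p to an equivalent r.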



* Phil. Trans. Vol. 186, A, p. 372. 






