On Theories of Association 
reduce this infinite improbability by diluting it with a finite amount of non-correlated
material; and this approach of $C_2$ to unity is all that Mr Yule's artificial
cock's comb surface illustrates. If A is an infinitely improbable event, then we
shall not lessen the improbability of the whole by combining the event A with B
which has a zero improbability*!
Mr Yule cannot determine the efficiency of contingency methods by simply
asserting that the value of the mean square contingency depends on the number of
cells; it naturally alters with our increased knowledge, but this change may mark
either an increase or a decrease according to the manner in which the material in
the few cells is redistributed in their component cells. We assert that with a
3 × 3-fold table you cannot get further than the contingency, and that further progress
can only be made by some other assumption as to the frequency distribution
of the variates. With that assumption we think we shall be able to demonstrate in
the course of this paper to the unprejudiced reader that the coefficient of contingency
properly handled is, perhaps, the most powerful instrument of modern statistical
theory. The assumption we make is that for correcting the results obtained by contingency,
so that coefficients found for 3 × 3-fold, 4 × 4-fold, 5 × 5-fold, 8 × 8-fold
tables may give practically identical results, it is sufficient to deduce the required
corrections by using a Gaussian hypothesis to determine certain means. The method
gives excellent results for the bulk of the distributions which occur in our wide 
experience of statistical work. If we can show that it gives good results for the 
extremely skew cases which Mr Yule has gone out of his way to cite, our point 
will be proved. Since the full development of the contingency method, fourfold 
tables have not been used by the Biometric School, except as controls, where 
contingency tables could be formed on the given data. But the statement that 
contingency was developed in order to overcome the difficulties of the fourfold 
table methods is directly disregarded by Mr Yule when he turns to our pigmenta- 
* A is the probability that in examining two absolutely independent variates, $n$ cells shall be
occupied and $n^2 - n$ cells empty when we make $n$ infinite. B is the probability that the $n^2$ cells shall
each have their theoretically independent contents. No combination of these two events can give less
than an indefinitely great improbability, i.e. $C_2 = 1$. But we anticipate that if Mr Yule had not raised
his cock's comb at such a conspicuous angle to the rest of his surface that its heterogeneity would 
be readily visible to the trained statistician, there would be no very serious error introduced by 
applying the mean square coefficient of contingency even to moderately heterogeneous material. We 
have not had leisure to investigate the matter closely, but if we superpose two Gaussian frequency-surfaces
with identical means, with the same standard deviations for both variables, and with correlations
$r_1$ and $r_2$, then the true correlation by product moment is
\[ \rho = p r_1 + q r_2, \]
where $pN$ and $qN$ are the total frequencies of the two components and $p + q = 1$.
On the other hand
\[ C_2 = \sqrt{\frac{\rho^2 (1 + r_1 r_2) - 2\rho\, r_1 r_2 (r_1 + r_2) + r_1^2 r_2^2 (1 + r_1 r_2)}{\rho^2 (1 + r_1 r_2) - 2\rho\, r_1 r_2 (r_1 + r_2) + r_1^2 r_2^2 (1 + r_1 r_2) + (1 - r_1^2)(1 - r_2^2)(1 - r_1 r_2)}}. \]
If $r_1 = 1$, $r_2 = 0$, as we know $C_2 = 1$, while $\rho = p$. But if $r_1 = 0.2$, $r_2 = 0.7$, with $p = 0.3$, $q = 0.7$, the mixture
proportions of Mr Yule's illustration, then $\rho = 0.55$ and $C_2 = 0.59$. If $r_1 = 0.5$, $r_2 = 0.7$, $p = 0.3$, $q = 0.7$, then $\rho = 0.64$
and $C_2 = 0.65$. Again if $r_1 = 0.7$, $r_2 = 0.3$, $p = 0.4$, $q = 0.6$, then $\rho = 0.46$ and $C_2 = 0.49$. Thus it does not appear
that small amounts of heterogeneity, not detectable on a study of the table, are likely to give very misleading
values when $C_2$ is taken as a practical measure of $\rho$.
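
The three numerical cases of the footnote may be checked with a short sketch in Python. The function names are hypothetical and appear nowhere in the paper. `mixture_c2` evaluates the closed form of the footnote, taking the symbol in that expression to be the mixture correlation rho = p r_1 + q r_2. `mixture_c2_direct` is an independent cross-check, not taken from the paper: it assumes that the mean square contingency of the mixture is 1 + phi^2 = p^2/(1 - r_1^2) + 2pq/(1 - r_1 r_2) + q^2/(1 - r_2^2), with C_2^2 = phi^2/(1 + phi^2).

```python
import math


def mixture_rho(p, r1, r2):
    """Product-moment correlation of the mixture: rho = p*r1 + q*r2, q = 1 - p."""
    return p * r1 + (1.0 - p) * r2


def mixture_c2(p, r1, r2):
    """C_2 from the footnote's closed form (requires |r1| < 1 and |r2| < 1)."""
    rho = mixture_rho(p, r1, r2)
    s = r1 * r2
    num = rho**2 * (1 + s) - 2 * rho * s * (r1 + r2) + s**2 * (1 + s)
    den = num + (1 - r1**2) * (1 - r2**2) * (1 - s)
    return math.sqrt(num / den)


def mixture_c2_direct(p, r1, r2):
    """Cross-check (an assumption, not from the paper): build 1 + phi^2 term by term."""
    q = 1.0 - p
    one_plus_phi2 = (
        p * p / (1 - r1 * r1)
        + 2 * p * q / (1 - r1 * r2)
        + q * q / (1 - r2 * r2)
    )
    # C_2^2 = phi^2 / (1 + phi^2) = 1 - 1 / (1 + phi^2)
    return math.sqrt(1 - 1 / one_plus_phi2)


# The three cases quoted in the footnote, as (p, r1, r2).
for p, r1, r2 in [(0.3, 0.2, 0.7), (0.3, 0.5, 0.7), (0.4, 0.7, 0.3)]:
    rho = mixture_rho(p, r1, r2)
    c2 = mixture_c2(p, r1, r2)
    print(f"p={p} r1={r1} r2={r2}  rho={rho:.2f}  C2={c2:.2f}")
# prints rho=0.55 C2=0.59, then rho=0.64 C2=0.65, then rho=0.46 C2=0.49
```

The two routes agree to rounding error, and both reproduce the values quoted in the footnote. The limiting case r_1 = 1, r_2 = 0 is left out because the last term of the denominator vanishes there and C_2 = 1 follows at once.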
