Karl Pearson and David Heron 
177 
was independent of treatment. If n be the total number of observations, then, if 
$\chi^2 = n\phi^2$ be calculated, the chance of independence is at once given by finding the 
P corresponding to $\chi^2$ from the tables for "goodness of fit" in the case of $n' = 4$. 
The column of P in the table of p. 173 shows us that the Homerton-Fulham data 
stand at the head of the series in this respect. This is of course in great part due 
to the larger numbers dealt with, but obviously in such a question as the effective- 
ness of treatment weight must be given to numbers. 
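
The calculation just described can be sketched as follows. This is a minimal illustration and not the authors' own working: the cell counts are invented, the function names are hypothetical, and it is assumed that entering the "goodness of fit" tables at four cells corresponds to a chi-square distribution with three degrees of freedom, for which P has a closed form.

```python
from math import erfc, exp, pi, sqrt


def chi_square_fourfold(a, b, c, d):
    """Chi-square of the fourfold table [[a, b], [c, d]] against independence."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def mean_square_contingency(a, b, c, d):
    """phi^2, defined so that chi^2 = n * phi^2."""
    n = a + b + c + d
    return chi_square_fourfold(a, b, c, d) / n


def p_three_df(chi2):
    """Upper-tail probability of chi-square with 3 degrees of freedom.

    Assumption: this is the quantity tabulated for the case of four cells.
    """
    return erfc(sqrt(chi2 / 2)) + sqrt(2 * chi2 / pi) * exp(-chi2 / 2)


# Invented counts, for illustration only (not data from the paper).
a, b, c, d = 10, 20, 30, 40
chi2 = chi_square_fourfold(a, b, c, d)
print(chi2, mean_square_contingency(a, b, c, d), p_three_df(chi2))
```

For a fixed value of phi^2 the value of chi^2 grows in proportion to n, so a larger series yields a smaller P, which is the sense in which weight is given to numbers.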
If we turn to the problem of $r_t$ and $C_2$ considered merely as coefficients of 
association, we must examine what Mr Yule has laid down as the desiderata of such 
a coefficient. The fourfold table being 

    a   b
    c   d

he has assumed that the coefficient must range from $-1$ to $+1$, and that if any 
one of the cells be empty the coefficient must be $+1$ or $-1$. He then guesses a formula 

$$Q = \frac{ad - bc}{ad + bc}$$

or another 

$$\frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}$$

out of the many thousands which can be invented*. 
But is either of the assumptions above really necessary? Why should the 
association be perfect when b = 0? Why should it be perfect even if both b and c 
are zero? Let us toss a shilling and a penny and record heads or tails of both. 
                      Shilling
    Penny       Head    Tail    Totals
    Head          1       0        1
    Tail          2       1        3
    Totals        3       1        4
We do it four times and the result is as above, on the whole not a very improbable 
result. But according to Mr Yule there is absolute association, and since the 
probable error according to him is zero, the result is absolutely reliable. Clearly 
* See Pearson, Phil. Trans. Vol. 195 A, p. 15. If $\phi(z)$ be a function which vanishes with z, 
then any form 
$$\{\phi(ad) - \phi(bc)\} / \{\phi(ad) + \phi(bc)\}$$
satisfies Yule's requirements. Or, we can take a form 
1 + k <j>(a>) 
if $\phi(\kappa)$ be finite for $\kappa = \infty$, where $\kappa = (bc)/(ad)$. Clearly for the range 0 to $\infty$ of values of $\kappa$, $Q_\kappa$ ranges 
from $+1$ to $-1$ and satisfies Mr Yule's conditions if $\phi(\infty)$ is $> \phi(\kappa)$, but by an arbitrary choice of $\phi$ we 
can get any form of Q-curve we please. No curve of real significance can be obtained, i.e. no reasonable 
value of an association coefficient, by the simple condition of fixing three of its values without 
other hypothesis! 
Biometrika ix 23 
