


KARL PEARSON 



among the four categories in the same proportions, we increase in the same ratio $\chi^2$
and also decrease $P$, but we do not modify $r_t$. Our scale therefore of appreciation
ought to allow for this factor in $P$, and this is done when we reckon $r_P$ by considering
the relation of $r$ to $\sigma_r$, which varies with the size of the population.
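
The effect of enlarging the population in fixed proportions can be checked numerically. The sketch below is only illustrative: the table counts are invented, the tail probability uses the modern one-degree-of-freedom convention (an assumption, not taken from the text), and the mean-square contingency $\phi$ stands in for any coefficient that depends on the cell proportions alone.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square of the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def tail_probability_1df(chi2):
    """Upper-tail probability of chi-square with one degree of freedom
    (assumed convention; P falls as chi-square grows under any convention)."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def phi_coefficient(a, b, c, d):
    """Product-moment coefficient of the fourfold table (depends on proportions only)."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

table = (30, 20, 20, 30)              # hypothetical counts a, b, c, d
scaled = tuple(4 * x for x in table)  # same proportions, four times the population

for label, t in (("original", table), ("scaled x4", scaled)):
    chi2 = chi_square_2x2(*t)
    print(label, "chi2 =", chi2, "P =", tail_probability_1df(chi2),
          "phi =", phi_coefficient(*t))
```

Chi-square grows in the ratio of the population and $P$ falls accordingly, while the proportion-based coefficient is unchanged; a scale of appreciation built on $P$ alone would therefore reflect sample size as much as strength of relationship.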



Analysis of Results of Illustrations. 



(iii) $C_2$ and $r_{hk}$ are of course free from this objection, but they are absolutely
incomparable with true coefficients of correlation; the former because the coefficient
of contingency must be based at least on a 4 × 4, and better a 4 × 5 or 5 × 5, table,
before it approaches $r$, and the latter because $r_{hk} = \phi$ is never equal to $r$, by its very
definition and nature.
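
One standard way to see part of the difficulty with small tables is that the coefficient of contingency, taken here as $\sqrt{\chi^2/(N+\chi^2)}$, cannot exceed $\sqrt{(k-1)/k}$ on a $k \times k$ table, however perfect the association. The following sketch evaluates that ceiling on purely diagonal tables; the helper names and the counts are hypothetical, and the bound is offered as a supporting illustration rather than as the argument of the text.

```python
import math

def contingency_coefficient(table):
    """Coefficient of contingency sqrt(chi2 / (N + chi2)) for a table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n + chi2))

def diagonal_table(k, count=10):
    """A k x k table with all frequency on the diagonal (perfect association)."""
    return [[count if i == j else 0 for j in range(k)] for i in range(k)]

for k in (2, 4, 5):
    c = contingency_coefficient(diagonal_table(k))
    print(k, "x", k, "ceiling:", round(c, 4), "expected:", round(math.sqrt((k - 1) / k), 4))
```

On a fourfold table the ceiling is about 0.707, so the coefficient cannot be read on the same scale as a correlation until the table is considerably finer.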



I pointed this out many years ago when first dealing with $r_{hk}$. Quite recently
Mr G. U. Yule has reintroduced $r_{hk}$ under the novel name of a "theoretical value"
for the correlation coefficient of a fourfold table. I am unable to see why it should
be a "theoretical value," as it seems, so far as I can follow Mr Yule's deduction, to
involve, when deduced by his method, a very arbitrary relationship between the
standard deviation and the position of the mean of each subrange in the case of both
frequencies. Like $C_2$, $r_{hk}$ may even displace the true order of relationship in the
series. I do not think that $r_{hk}$ can be used, as Mr Yule suggests, as a measure
of association; at any rate it is a measure wholly incomparable with true correlation,
and it is quite possible, out of the indefinitely large number of measures of
association, to select one practically as easy of determination and which does
approximate to the true correlation†



* This value is insignificant as compared to its probable error. 

† An Introduction to the Theory of Statistics, p. 212.

‡ Given

    a  b
    c  d

as our fourfold table, the correlation is not necessarily perfect in actual practice, if
either b alone or c alone be zero. This is quite clear if the distribution be Gaussian, and the dividing
lines of the classification be taken so as to meet on the elliptic contour of the frequency surface which
contains inside itself the whole volume of the population. Thus in practice it is quite possible to obtain
$Q_2 = 1$, where the correlation is small or even zero.
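
The point of this note can be illustrated with a small numerical sketch. It assumes that the coefficient in question is the coefficient of association $(ad - bc)/(ad + bc)$; the counts are invented, and the product-moment coefficient of the fourfold table is shown alongside only as a rough indication that the relationship need not be strong.

```python
import math

def association_q(a, b, c, d):
    """Coefficient of association (ad - bc) / (ad + bc) for the table [[a, b], [c, d]]."""
    return (a * d - b * c) / (a * d + b * c)

def phi_coefficient(a, b, c, d):
    """Product-moment coefficient of the same fourfold table."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical table with b alone equal to zero.
a, b, c, d = 10, 0, 40, 50

print("Q   =", association_q(a, b, c, d))    # equals 1 whenever b = 0 and ad > 0
print("phi =", phi_coefficient(a, b, c, d))  # remains well below 1
```

A single empty cell is enough to drive the coefficient of association to unity, whatever the remaining three cells contain.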



