24 SCIENCE [N. S. Vol. XXX. No. 757



how new values for the correlation coefficient are related to old values, in a multitude of formulae leading to divergent and possibly inconsistent results.



Dr. Boas's first value for r 



is a very old friend indeed and has been widely used in a multiplicity of practical cases. It is one of a general series of formulae noted by me¹ in 1896, and used in our work on the personal equation² in 1902 and on wasps³ in 1906. Since 1896 it has been frequently referred to, e. g., in the memoir "On Further Methods of Determining Correlation"⁴ and Biometrika, VI., p. 438, etc. It is quite reliable and often convenient.



Dr. Boas's second formula 



$$ r = \frac{[(x_1 + x_2 + \cdots + x_n)^2]}{n(n-1)\sigma^2} - \frac{1}{n-1} $$



suffers from the difficulty that, in the form in which he gives it, it involves the number n in the fraternity, taken as constant, whereas in practise we may often have five in one fraternity and ten in a second. Its chief value is when n is very large, as in long series of homotypic characters, or in species other than man when the number of offspring is very great. In such cases the second term 1/(n − 1) is usually of the order of our probable error and may be neglected, and n − 1 may be taken = n, within the same limits. Thus:



$$ r = \frac{(\text{S.D. of means of fraternities})^2}{(\text{S.D. of population})^2} $$
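The passage from the full formula to its large-n form can be checked numerically. The following is a minimal sketch, assuming that the bracketed term is an average over fraternities, that the x are deviations from the population mean, and that every fraternity has the same size n. The function names and the sample data are hypothetical and are not taken from the original.

```python
from statistics import fmean


def fraternal_r_exact(fraternities):
    """Second formula, for fraternities that all share one size n.

    Assumption: the x are deviations from the population mean, sigma^2 is
    the population variance, and the average runs over fraternities.
    """
    n = len(fraternities[0])
    if n < 2 or any(len(f) != n for f in fraternities):
        raise ValueError("every fraternity must have the same size n >= 2")
    values = [v for f in fraternities for v in f]
    mean = fmean(values)
    sigma2 = fmean((v - mean) ** 2 for v in values)
    # Average over fraternities of the squared sum of deviations.
    mean_sq_sum = fmean(sum(v - mean for v in f) ** 2 for f in fraternities)
    return mean_sq_sum / (n * (n - 1) * sigma2) - 1 / (n - 1)


def fraternal_r_large_n(fraternities):
    """Large-n form: (S.D. of fraternity means)^2 / (S.D. of population)^2."""
    values = [v for f in fraternities for v in f]
    mean = fmean(values)
    sigma2 = fmean((v - mean) ** 2 for v in values)
    var_means = fmean((fmean(f) - mean) ** 2 for f in fraternities)
    return var_means / sigma2


# Invented data: three fraternities of three members each.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(fraternal_r_exact(data))    # approximately 0.85
print(fraternal_r_large_n(data))  # approximately 0.90
```

With n as small as 3 the two values differ noticeably; the exact value equals n/(n − 1) times the large-n value, less 1/(n − 1), so the gap closes as n grows.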



Under this aspect it is easy to extend the formula to cases in which n is not the same for each fraternity. A like formula was used in 1898 for our studies on the inheritance of fecundity of thoroughbred horses.⁵ It has been since employed in various homotypic investigations. It must be very carefully distinguished from that for the correlation ratio

$$ \eta = \frac{\text{S.D. of means of arrays}}{\text{S.D. of population}} $$

where η = r for normal correlation.

The arrays in the latter formula contain many fraternities, and their means have far less variability than that of those of fraternities.

¹ Phil. Trans., Vol. 187 A, p. 279.

² Phil. Trans., Vol. 198 A, p. 243.

³ Biometrika, Vol. V., p. 409.

⁴ Dulau & Co., Drapers' Research Memoirs.

⁵ Phil. Trans., Vol. 192 A, p. 272.



Lastly I come to Dr. Boas's formula 



$$ r = \frac{p_{1,2} - p_1 p_2}{\sqrt{p_1(1 - p_1)\,p_2(1 - p_2)}} $$



If we have a fourfold table represented by

  a | b
  c | d



I find Dr. Boas's r is our old friend

$$ r_{hk} = \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}}, $$



i. e., is the correlation in the deviation of the mean of one variable from its mean value with the deviation of the mean of the second variable from its mean value. It is not a true correlation of the first variable with the second variable. I have discussed $r_{hk}$ at length in a memoir of 1900:⁶



It has the advantage of a symmetrical form and a concise physical meaning. It does not, however, become unity when either, but not both, b and c vanish, nor does it, unless we multiply it by π/2 and take its sine, equal the coefficient of correlation when a = d and b = c.
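The two properties named in this quotation can be illustrated on a fourfold table. The sketch below assumes that a, b form the first row and c, d the second, which is the arrangement the marginal totals (a + b), (c + d), (a + c), (b + d) imply, and it takes p1, p2 and p12 to be the first-row, first-column and first-cell proportions. The function names and the counts are hypothetical.

```python
from math import isclose, pi, sin, sqrt


def r_from_counts(a, b, c, d):
    """(ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)) for a fourfold table."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def r_from_proportions(a, b, c, d):
    """The same quantity written with proportions p1, p2 and p12."""
    n = a + b + c + d
    p1 = (a + b) / n   # assumed: proportion in the first row
    p2 = (a + c) / n   # assumed: proportion in the first column
    p12 = a / n        # assumed: proportion in both at once
    return (p12 - p1 * p2) / sqrt(p1 * (1 - p1) * p2 * (1 - p2))


# The proportion form and the count form agree.
assert isclose(r_from_counts(40, 10, 10, 40), r_from_proportions(40, 10, 10, 40))

# b vanishes but c does not: the coefficient stays below unity.
print(r_from_counts(40, 0, 20, 40))   # approximately 0.667

# a = d and b = c: the raw value, and the sine of pi/2 times it.
r = r_from_counts(40, 10, 10, 40)
print(r)                              # approximately 0.6
print(sin(pi / 2 * r))                # approximately 0.809
```

In the symmetric case the raw coefficient reduces to (a − b)/(a + b), and the sine transformation raises 0.6 to roughly 0.81, which shows how far the two can differ.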



Thus it differs in the simplest cases from 

 the true coefficient of correlation, and often 

 differs considerably. In the bulk of cases it 

 does not approach r nearly as closely as the 

 " Q^ " coefficient of association, and its use is 

 liable to be misleading, especially if compared 

 with values of the true coefficient found by 

 other processes. 



When there is a measurable quantity grouped in arrays under classes of a non-measurable quantity the right method, I venture to think, is to use the correlation ratio η as defined above. This will be equal to r if the correlation be normal, and if not it has a perfectly definite physical meaning of its own.⁷
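A minimal sketch of the correlation ratio for this situation follows, taking η as the S.D. of the array means divided by the S.D. of the population. Weighting each array mean by the number of observations in its array is an assumption of this sketch; the function name, class labels and measurements are invented for illustration.

```python
from math import sqrt
from statistics import fmean


def correlation_ratio(arrays):
    """eta = S.D. of array means / S.D. of population.

    `arrays` maps each class label of the non-measurable quantity to the
    list of measurements in that array. Array means are weighted by array
    size (an assumption of this sketch).
    """
    values = [v for group in arrays.values() for v in group]
    total = len(values)
    grand_mean = fmean(values)
    var_population = sum((v - grand_mean) ** 2 for v in values) / total
    var_means = sum(
        len(group) * (fmean(group) - grand_mean) ** 2
        for group in arrays.values()
    ) / total
    return sqrt(var_means / var_population)


# Invented measurements grouped under three hypothetical classes.
arrays = {"class A": [1, 2, 3], "class B": [4, 5, 6], "class C": [7, 8, 9]}
print(correlation_ratio(arrays))  # approximately 0.949
```

The value is 0 when every array has the same mean and 1 when there is no variation within any array, so it needs no numerical scale for the classes themselves.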


⁶ Phil. Trans., Vol. 195, pp. 12 and 15 bottom.

⁷ "On the General Theory of Skew Correlation, etc.," Drapers' Research Memoirs, p. 10.



