Miscellanea 
II. The Application of the Correlation Coefficient to Mendelian 
Distributions. 
By E. C. SNOW, M.A. 
In a paper* published in the Proceedings of the Royal Society of Edinburgh, Dr John 
Brownlee has employed various methods to determine theoretical values of the parental and 
fraternal correlations under special conditions, on the basis of the Mendelian formulae. We 
do not propose to deal with Dr Brownlee's conclusions, but only to draw attention to some of 
his methods. The importance of criticising them is not diminished by the fact that they may 
in some cases give correct results. It is, indeed, for this reason the more essential that they 
should be scrutinised, as their employment in other circumstances, where they may not give 
correct results, is rendered the more likely. 
The methods to be employed in determining correlation depend entirely on the nature of 
the frequency distributions dealt with. The following general types of distribution can be 
recognised: (a) continuous and quantitative, e.g. head length and many other anthropometric 
measurements; (b) continuous and yet for convenience of classification treated as qualitative, 
e.g. health and intelligence; (c) discontinuous and quantitative proceeding by equal steps, 
e.g. the number of veins on a leaf, the position of an individual within a family; (d) discontinuous 
and quantitative proceeding by unequal steps, e.g. various botanical distributions, 
and the frequency within any grade of various salaries in government departments; (e) discontinuous 
and qualitative, e.g. various types of occupations. 
The chief methods which have been discovered for the determination of correlation are: 
(i) The four-fold table method, which applies only when the table consists of two rows 
and two columns, and can only legitimately be employed when the distributions are perfectly 
continuous and at least approximately Gaussian. 
(ii) The method of contingency (giving $C_2 = \sqrt{\dfrac{\phi^2}{1+\phi^2}}$, where $\phi^2$ is the mean square 
contingency). This can be applied whatever the number of cells and the nature of the distributions, 
but $C_2$ is only equal to $r$ (the value of the correlation coefficient for the table) when the 
distribution is Gaussian and the number of cells is large. 
(iii) The ' product moment ' method. As all the cases with which we are concerned in the 
present paper are of tables of two rows and two columns or of three rows and three columns we 
need only discuss this method in relation to those cases. In both of them it is necessary that 
the observations should be supposed concentrated at points for each row and each column, and in 
the second of the cases it is necessary that the distances between these points for consecutive 
rows, and also for consecutive columns, should be equal, and also that the regression of each 
variable on the other should be linear. In the case of a two by two table, the value of r given 
by this method is 
$$r^2 = \frac{(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)},$$
and this, for the same case, is also the value of $\phi^2$†. But this is not the same as the value of $C_2$ 
which is usually taken as the measure of relationship when the method of contingency is 
* The Significance of the Correlation Coefficient when applied to Mendelian Distributions, Proc. Roy. 
Soc. Edin. Vol. xxx, Part vi. (No. 34). 
† See Drapers' Research Memoirs, Biometric Series i. p. 21. r is also the correlation between 
random deviations in the means of the two variates, when these deviations are expressed in terms of 
the standard deviations of the variates as units. See Pearson, Phil. Trans. Vol. 195, A, pp. 12 and 14. 
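
The relation between these quantities for a two by two table can be checked numerically. The following is a minimal sketch in Python and is not part of the original paper: the function name `two_by_two_measures` and the cell frequencies in the usage lines are hypothetical, chosen only for illustration, and the cells are assumed to be laid out as rows (a, b) and (c, d). It computes r² from the product moment formula, the mean square contingency directly from its definition, and the coefficient of contingency from the latter.

```python
from math import sqrt


def two_by_two_measures(a, b, c, d):
    """Return (r_squared, phi_squared, contingency) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)  # row totals
    cols = (a + c, b + d)  # column totals
    if 0 in rows or 0 in cols:
        raise ValueError("every row and column total must be positive")

    # Product moment value, observations concentrated at two points per variable.
    r_squared = (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])

    # Mean square contingency: sum of (observed - expected)^2 / expected, divided by n.
    cells = ((a, b), (c, d))
    phi_squared = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            phi_squared += (cells[i][j] - expected) ** 2 / expected
    phi_squared /= n

    # Coefficient of contingency derived from the mean square contingency.
    contingency = sqrt(phi_squared / (1 + phi_squared))
    return r_squared, phi_squared, contingency


if __name__ == "__main__":
    # Hypothetical cell frequencies, for illustration only.
    r2, phi2, c2 = two_by_two_measures(30, 10, 10, 30)
    print(r2, phi2, c2)
```

For the illustrative table with cells 30, 10, 10, 30, both r² and the mean square contingency come to 0.25, so that r = 0.5, while the coefficient of contingency is about 0.447. The two measures therefore differ even in this simple case.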
