520 Distribution of the Correlation Coefficients of Samples 
12. The fact that the mean value r̄ of the observed correlation coefficient is 
numerically less than ρ might have been interpreted as meaning that given 
a single observed value r, the true value of the correlation coefficient of the 
population from which the sample is drawn is likely to be greater than r. This 
reasoning is altogether fallacious. The mean r̄ is not an intrinsic feature of the 
frequency distribution. It depends upon the choice of the particular variable r 
in terms of which the frequency distribution is represented. When we use t as 
variable, the situation is reversed. Whereas in using r we cramp all the high 
values of the correlation into the small space in the neighbourhood of r = 1, 
producing a frequency curve which trails out in the negative direction and so 
tending to reduce the value of the mean, by using t, we spread out the region of 
high values, producing asymmetry in the opposite sense, and obtain a value t̄ 
which is greater than τ. The mean might, in fact, be brought to any chosen 
point, by stretching and compressing different parts of the scale in the required 
manner. For the interpretation of a single observation the relation between 
t̄ and τ is in no way superior to that between r̄ and ρ. The variable t has been 
chosen primarily in order to give stability of form to the frequency curves in 
different parts of the scale. It is in addition a variable to which the analysis 
naturally leads us, and which enables the mean and moments to be readily 
calculated, and so a comparison to be made with the standard Pearson curves, but 
it is not, with these advantages, in a unique position. In some respects the 
function, log tan ½(α + π/2), is its superior as independent variable. 
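The dependence of the mean on the choice of variable can be checked by a small simulation. The sketch below is illustrative only: it assumes a bivariate normal population, takes t = r/√(1 − r²) and τ = ρ/√(1 − ρ²) as the definitions of t and τ (they are not restated on this page), and uses hypothetical function names and arbitrary values of n, ρ and the number of replicates.

```python
import math
import random


def sample_r(n, rho, rng):
    """Correlation coefficient of one sample of n pairs drawn from a
    bivariate normal population with true correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def mean_r_and_t(n=10, rho=0.6, reps=20000, seed=0):
    """Monte Carlo estimates of the mean of r and of t = r / sqrt(1 - r^2)
    (the latter definition is an assumption of this sketch)."""
    rng = random.Random(seed)
    rs = [sample_r(n, rho, rng) for _ in range(reps)]
    ts = [r / math.sqrt(1.0 - r * r) for r in rs]
    return sum(rs) / reps, sum(ts) / reps


rho = 0.6
tau = rho / math.sqrt(1.0 - rho * rho)
r_bar, t_bar = mean_r_and_t(n=10, rho=rho)
print(f"rho = {rho:.4f}, mean r = {r_bar:.4f}")
print(f"tau = {tau:.4f}, mean t = {t_bar:.4f}")
```

Under these assumptions the same set of samples should give a mean of r below ρ and a mean of t above τ, which is the reversal described in the text: the direction of the discrepancy belongs to the scale, not to the population.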
I have given elsewhere* a criterion, independent of scaling, suitable for 
obtaining the relation between an observed correlation of a sample and the most 
probable value of the correlation of the whole population. Since the chance of 
any observation falling in the range dr is proportional to 
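$$(1-\rho^2)^{\frac{n-1}{2}}\,(1-r^2)^{\frac{n-4}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-2}\frac{\theta}{\sin\theta}$$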
for variations of ρ, we must find that value of ρ for which this quantity is a 
maximum, and thereby obtain the equation 
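$$\frac{\partial}{\partial\rho}\left\{(1-\rho^2)^{\frac{n-1}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-2}\frac{\theta}{\sin\theta}\right\}=0.$$
Since
$$\int_0^\infty \frac{dx}{(\cosh x+\cos\theta)^{n-1}}=\frac{1}{(n-2)!}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-2}\frac{\theta}{\sin\theta},$$
we have
$$\int_0^\infty \frac{\partial}{\partial\rho}\left\{\frac{(1-\rho^2)^{\frac{n-1}{2}}}{(\cosh x+\cos\theta)^{n-1}}\right\}dx=0,$$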
* R. A. Fisher, "On an absolute criterion for fitting frequency curves," Messenger of Mathematics, 
February, 1912. 
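The maximisation described above can be carried out numerically for a given r and n. The sketch below is a rough illustration rather than the procedure of the paper: it assumes cos θ = −ρr, so that the ρ-dependent part of the chance is (1 − ρ²)^((n−1)/2) ∫ dx / (cosh x − ρr)^(n−1), evaluates the integral by a truncated trapezoidal rule, and locates the maximum by a golden-section search. The function names, the truncation point and the step count are all hypothetical choices.

```python
import math


def log_likelihood(rho, r, n, x_max=20.0, steps=4000):
    """Logarithm of (1 - rho^2)^((n-1)/2) * integral_0^inf dx / (cosh x - rho*r)^(n-1),
    assuming cos(theta) = -rho*r. The integral is truncated at x_max and
    approximated by the trapezoidal rule."""
    m = n - 1
    c = rho * r
    h = x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        # exp(-m * log(.)) avoids overflow of a large power for big x
        f = math.exp(-m * math.log(math.cosh(x) - c))
        total += f if 0 < i < steps else 0.5 * f
    return 0.5 * m * math.log(1.0 - rho * rho) + math.log(total * h)


def most_probable_rho(r, n, tol=1e-8):
    """Golden-section search for the rho in (-1, 1) maximising log_likelihood."""
    lo, hi = -0.999, 0.999
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a = hi - g * (hi - lo)
    b = lo + g * (hi - lo)
    fa, fb = log_likelihood(a, r, n), log_likelihood(b, r, n)
    while hi - lo > tol:
        if fa < fb:
            # the maximum lies in [a, hi]
            lo, a, fa = a, b, fb
            b = lo + g * (hi - lo)
            fb = log_likelihood(b, r, n)
        else:
            # the maximum lies in [lo, b]
            hi, b, fb = b, a, fa
            a = hi - g * (hi - lo)
            fa = log_likelihood(a, r, n)
    return 0.5 * (lo + hi)


r_obs, n_obs = 0.6, 10
rho_hat = most_probable_rho(r_obs, n_obs)
print(f"observed r = {r_obs}, most probable rho = {rho_hat:.4f}")
```

For the illustrative values r = 0.6 and n = 10 the maximising ρ comes out somewhat below r, not above it, which is consistent with the warning at the start of this section against reading r̄ < ρ as evidence that the population value probably exceeds the observed one.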
