By Student 
When however there is correlation, I cannot suggest an equation which will 
accord with the facts, but as I have spent a good deal of time over the problem 
I will point out some of the necessities of the case. 
(1) With small samples the value certainly lies nearer to zero than the real
value of R, e.g.
    samples of 2: mean at $\frac{2}{\pi}\sin^{-1} R$,
    samples of 4 (real value 0.66): 0.561* ± 0.011,
    samples of 8 (real value 0.66): 0.614† ± 0.065.
But with samples of 30 (real value 0.66) the mean at 0.6609 ± 0.0007 shows that the mean
value approaches the real value comparatively rapidly.
(2) The standard deviation is larger than accords with the formula
$\frac{1 - r^2}{\sqrt{n - 1}}$, even if we give the mean value of r for samples
of the size taken, e.g. for samples of 2,
    S.D. = $\sqrt{1 - \left(\frac{2}{\pi}\sin^{-1} R\right)^2}$.
    For samples of 4, calculated‡ 0.3957 ± 0.0009; actual 0.4680,
    for samples of 8, calculated 0.2355 ± 0.0041; actual 0.2684.
But samples of 30, calculated 0.1046 ± 0.0018, actual 0.1001, again show that with
samples as large as 30 the ordinary formula is justified.
(3) When there was no correlation the range found by fitting a Pearson curve 
to the distribution was accurately 2 in the theoretical case of samples of two, and 
well within the probable error for empirical distributions of samples of 4 and 8. 
But when we have correlation this process does not give the range closely for the 
empirical distribution (samples of 4 give 2.137, samples of 8 2.699, samples of
30 infinity) and the range calculated from samples of 2, which is 
2 V4 + 3/^3 + 18/^2- -9/x, « 
(where $\mu_2 = 1 - \left(\frac{2}{\pi}\sin^{-1} R\right)^2$), is always less than 2 except in the case where $\mu_2$ is 1,
i.e. when there is no correlation.
Hence the distribution probably cannot be represented by any of Prof. Pearson's 
types of frequency curve unless R = 0.
(4) The distribution is skew with a tail towards zero. 
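
Points (1), (2) and (4) can be explored with a small simulation. The sketch below is only illustrative: it draws samples from a simulated normal population with R = 0.66, which is an assumption of this example and not the material used in the text, and it makes no attempt to reproduce the grouping that Sheppard's corrections address. The names `sample_r` and `summarise`, the number of trials and the seed are hypothetical choices.

```python
import math
import random


def sample_r(n, rho, rng):
    """Correlation coefficient of one sample of n pairs drawn from a
    standard bivariate normal population with correlation rho (assumed)."""
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        xs.append(u)
        # y is built so that corr(x, y) = rho in the population.
        ys.append(rho * u + math.sqrt(1.0 - rho * rho) * v)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def summarise(n, rho, trials=20000, seed=1908):
    """Mean, standard deviation and moment skewness of r over many samples."""
    rng = random.Random(seed)
    rs = [sample_r(n, rho, rng) for _ in range(trials)]
    mean = sum(rs) / trials
    sd = math.sqrt(sum((r - mean) ** 2 for r in rs) / trials)
    skew = sum((r - mean) ** 3 for r in rs) / trials / sd ** 3
    return mean, sd, skew


if __name__ == "__main__":
    R = 0.66
    print("samples of 2, theoretical mean:", 2 / math.pi * math.asin(R))
    for n in (2, 4, 8, 30):
        mean, sd, skew = summarise(n, R)
        # Ordinary formula evaluated at the simulated mean of r.
        ordinary = (1 - mean ** 2) / math.sqrt(n - 1)
        print(f"n={n:2d} mean={mean:.3f} sd={sd:.3f} "
              f"ordinary={ordinary:.3f} skew={skew:.2f}")
```

Under these assumptions the simulated mean should fall below R for the small samples, the simulated standard deviation should exceed the ordinary formula, and the skewness should come out negative for positive R, that is, with the tail towards zero. The simulated figures need not match those quoted in the text digit for digit.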
* The value must be slightly larger than this (perhaps even by 0.03) as Sheppard's corrections were
not used.
† Again higher, but not by more than 0.02.
‡ $\frac{1 - r^2}{\sqrt{n - 1}}$, where r is taken as the mean value for the size of the sample. If we took the real
value R, the difference would be even greater.
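
The arithmetic behind the calculated values in (2) can be checked directly. The sketch below evaluates the ordinary formula (1 - r^2)/sqrt(n - 1) twice for each sample size, once with the mean r and once with the real value R = 0.66, and sets both beside the actual standard deviations. The figures are taken as read from the text above, and the helper name `ordinary_sd` is a hypothetical one.

```python
import math


def ordinary_sd(r, n):
    """Ordinary formula (1 - r^2) / sqrt(n - 1) for the S.D. of r."""
    return (1.0 - r * r) / math.sqrt(n - 1)


R = 0.66  # real value quoted in the text

# (sample size, mean r quoted in the text, actual S.D. quoted in the text)
quoted = [(4, 0.561, 0.4680), (8, 0.614, 0.2684), (30, 0.6609, 0.1001)]

for n, mean_r, actual in quoted:
    with_mean = ordinary_sd(mean_r, n)
    with_real = ordinary_sd(R, n)
    print(f"n={n:2d}  formula(mean r)={with_mean:.4f}  "
          f"formula(R)={with_real:.4f}  actual={actual:.4f}")
```

With the mean r the formula agrees with the calculated values quoted in (2) to about three decimal places. With R in place of the mean r it lies further from the actual standard deviation for each of the three sample sizes, which is the point of the last sentence of the footnote.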
