Distribution of Correlation Coefficient in Small Samples
approach to the normal curve is very slow, and the "probable error of the correlation
coefficient," i.e. $0.67449\,(1 - r^2)/\sqrt{n}$ as usually recorded, has very little worth.
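For reference, the conventional figure criticised above is easily computed. The following is a minimal sketch, assuming the usual large-sample expression 0.67449 (1 - r^2)/sqrt(n); the function name `probable_error` is hypothetical and chosen only for illustration.

```python
import math

def probable_error(r, n):
    """Conventional 'probable error' of r: 0.67449 * (1 - r^2) / sqrt(n).

    Assumes the usual large-sample expression; it says nothing about
    the actual shape of the distribution of r in small samples.
    """
    return 0.67449 * (1.0 - r * r) / math.sqrt(n)

if __name__ == "__main__":
    # The same observed r carries a very different nominal error as n changes.
    for n in (4, 25, 400):
        print(n, probable_error(0.6, n))
```

The value returned depends only on r and n, which is one way of seeing why it cannot reflect the marked skewness of the small-sample curves.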
Models have been prepared to illustrate these points as follows : 
Model A gives for n = 2 to n = 25, the distribution of r for ρ = .6.
Model B gives for n = 2 to n = 25, the distribution of r for ρ = .8.
Model C gives for n = 3, the distribution of r for ρ = 0 to .9.
Model D gives for n = 4, the distribution of r for ρ = 0 to .9.
Model E gives for n = 25, the distribution of r for ρ = 0 to .9.
Model F gives for n = 50, the distribution of r for ρ = 0 to .9.
(Further models are in process of construction for low values of n.) 
Even the photographs of such models form a striking warning of the dangers 
which arise (i) from small samples, and (ii) from judging results from even repeated 
small samples ; the modal value of the frequency distribution for the correlation 
of these will be very sensibly higher than the correlation of the sampled population. 
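The second warning can be checked numerically. What follows is a minimal sketch and not the method by which the models were built: it assumes the integral form of the frequency curve, y_n = ((n-2)/π)(1-ρ²)^((n-1)/2)(1-r²)^((n-4)/2) ∫_0^∞ dz/(cosh z - ρr)^(n-1), evaluates it by Simpson's rule, and locates the mode on a grid of r. The function names, the truncation of the integral at z = 20, and the grid spacing of .01 are all illustrative assumptions.

```python
import math

def tail_integral(c, m, z_max=20.0, steps=1000):
    """Simpson's rule estimate of the integral of (cosh z - c)**(-m), z from 0 to z_max.

    steps must be even; the integrand is negligible beyond z_max for m >= 2.
    """
    h = z_max / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        # exp(-m * log(.)) avoids float overflow when raising a large cosh to a power
        f = math.exp(-m * math.log(math.cosh(z) - c))
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0

def density(r, rho, n):
    """Ordinate of the distribution of r in samples of n (n >= 3, |r| < 1)."""
    const = (n - 2) / math.pi * (1.0 - rho * rho) ** ((n - 1) / 2.0)
    return const * (1.0 - r * r) ** ((n - 4) / 2.0) * tail_integral(rho * r, n - 1)

def mode_of_r(rho, n):
    """Grid search for the modal r; intended for n >= 5, where the mode is interior."""
    grid = [k / 100.0 for k in range(-99, 100)]
    return max(grid, key=lambda r: density(r, rho, n))

if __name__ == "__main__":
    for n in (5, 10, 25):
        print(n, mode_of_r(0.6, n))
```

For ρ = .6 the located mode should lie above .6 and approach it as n increases, which is the behaviour the photographs are said to display.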
(8) On the Determination of the "most likely" Value of the Correlation in the
Sampled Population, i.e. ρ.
We now turn to another point. Suppose we have found the value of the
correlation in a small sample to be r, what is the most reasonable value ρ̂ to give
to the correlation ρ of the sampled population?
Now we know that 
$$y_n = \frac{n-2}{\pi}\,(1-\rho^2)^{\frac{n-1}{2}}\,(1-r^2)^{\frac{n-4}{2}} \int_0^\infty \frac{dz}{(\cosh z - \rho r)^{n-1}},$$
and if $\phi(\rho)\,d\rho$ were the law of distribution of ρ's, we ought to make
$$(1-\rho^2)^{\frac{n-1}{2}}\,\phi(\rho)\,(1-r^2)^{\frac{n-4}{2}} \int_0^\infty \frac{dz}{(\cosh z - \rho r)^{n-1}}$$
a maximum with ρ, or in other words deduce the value of ρ for a given r from
$$\frac{d}{d\rho}\left[\int_0^\infty \frac{(1-\rho^2)^{\frac{n-1}{2}}\,\phi(\rho)\,dz}{(\cosh z - \rho r)^{n-1}}\right] = 0.$$
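The dependence of the answer on the assumed law of distribution of ρ's can be made concrete. The sketch below is an illustration under stated assumptions, not a procedure taken from the text: it maximises (1-ρ²)^((n-1)/2) φ(ρ) ∫_0^∞ dz/(cosh z - ρr)^(n-1) over a grid of ρ, dropping the factor in (1-r²) since it does not involve ρ. The two laws φ used, a constant and a hypothetical bell-shaped weight centred at .3, and all function names are assumptions made only for the example.

```python
import math

def tail_integral(c, m, z_max=20.0, steps=1000):
    """Simpson's rule estimate of the integral of (cosh z - c)**(-m), z from 0 to z_max."""
    h = z_max / steps  # steps must be even
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        # exp(-m * log(.)) avoids float overflow for large z
        f = math.exp(-m * math.log(math.cosh(z) - c))
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0

def kernel(rho, r, n, prior):
    """Quantity to be made a maximum with rho; factors free of rho are dropped."""
    return ((1.0 - rho * rho) ** ((n - 1) / 2.0)
            * prior(rho)
            * tail_integral(rho * r, n - 1))

def most_likely_rho(r, n, prior):
    """Grid search over rho in steps of .01 for the maximising value."""
    grid = [k / 100.0 for k in range(-99, 100)]
    return max(grid, key=lambda rho: kernel(rho, r, n, prior))

def uniform(rho):
    """phi(rho) equal to a constant."""
    return 1.0

def near_point_three(rho):
    """Hypothetical phi(rho): bell-shaped weight centred at .3 with spread .1."""
    return math.exp(-0.5 * ((rho - 0.3) / 0.1) ** 2)

if __name__ == "__main__":
    print(most_likely_rho(0.6, 10, uniform))
    print(most_likely_rho(0.6, 10, near_point_three))
```

With the same r and n, the two choices of φ lead to different maximising values of ρ, so the "most reasonable" value is not determined by the sample alone.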
Fisher puts $\phi(\rho)$ equal to a constant and then differentiating out reaches the
equation
$$\int_0^\infty \frac{(r - \rho\cosh z)\,dz}{(\cosh z - \rho r)^{n}} = 0,$$
which should provide the value of ρ in terms of r. He solves this only to a first
approximation, obtaining,
$$r = \rho\left(1 + \frac{1-\rho^2}{2n}\right)$$
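The equation ∫_0^∞ (r - ρ cosh z) dz/(cosh z - ρr)^n = 0 can also be solved numerically rather than by approximation. The sketch below is one possible way of doing so, assuming Simpson's rule for the integral and bisection for the root; it then substitutes the root into a first-order relation of the form r ≈ ρ(1 + (1-ρ²)/(2n)), which is what an expansion of the integrand about z = 0 gives and is assumed here. The function names, the truncation at z = 20, and the bracket (-.999, .999) are illustrative choices.

```python
import math

def fisher_integral(rho, r, n, z_max=20.0, steps=1000):
    """Simpson estimate of the integral of (r - rho*cosh z)/(cosh z - rho*r)**n over z >= 0."""
    h = z_max / steps  # steps must be even; requires n >= 2 and |r| < 1
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        ch = math.cosh(z)
        # exp(-n * log(.)) avoids float overflow for large z
        f = (r - rho * ch) * math.exp(-n * math.log(ch - rho * r))
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0

def solve_rho(r, n, tol=1e-10):
    """Bisection for the rho at which the integral vanishes.

    The integral is positive at rho = -0.999 and negative at rho = 0.999
    whenever |r| < 0.999, so the bracket always contains a root.
    """
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fisher_integral(mid, r, n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_order_r(rho, n):
    """Assumed first-order relation: r = rho * (1 + (1 - rho^2) / (2n))."""
    return rho * (1.0 + (1.0 - rho * rho) / (2.0 * n))

if __name__ == "__main__":
    for n in (5, 25):
        rho = solve_rho(0.6, n)
        print(n, rho, first_order_r(rho, n))
```

For positive r the root lies below r, and the agreement between `first_order_r` at the root and the observed r gives a rough measure of how far the first approximation can be trusted at a given n.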
