Student 269 
TABLE II.

                            Mean ρ    S.D.    Mean T_x   Mean T_y   Mean (T_x + T_y)
Original series             .5798    .2887      1.92       1.90          3.82
x grouped coarsely          .5798    .2903      4.67       1.90          6.57
x and y grouped coarsely    .5696    .2874      4.67       4.37          9.04

Here an increase in the correction to be made for ties from 3.82 to 9.04 has
made a difference of .01 in the mean value of ρ, the probable error being about
.015, and a still less appreciable difference in the Standard Deviation. It is, I think,
a fair inference that the correction is applicable to the series in question, and the
reason for the observed low values of ρ in much tied samples is to be sought
elsewhere*. But it will be asked 'what if no correction be made for ties?' The answer
is that the mean value of ρ will rise as the ties become more numerous and the
S.D. will fall. Thus Table II would become Table III if no corrections were made.
TABLE III.

                            Mean ρ    S.D.    Mean (T_x + T_y)
Original series              .602    .2677         3.82
x grouped coarsely           .616    .2887         6.57
x and y grouped coarsely     .622    .2414         9.04

At first sight this may appear to be highly advantageous, since the mean value
approximates more nearly to the value which would be obtained from a large sample
and the S.D. is smaller. A little reflection will show, however, that the means of
the ρ's of all populations would be subject to the same rise, and that in fact the ρ
of one population is no more differentiated from the ρ of another population than
it is when corrected, while the mean value when corrected is constant over a fairly
wide range of ties. If the correction is not made, ρ can be cooked up to any required
value by increasing the ties.
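
As a rough illustration of this point, the following Python sketch computes the rank correlation with and without an adjustment for ties. It rests on assumptions not restated in this section: that T for a series is the sum of (t^3 - t)/12 over each group of t tied values, that tied values receive the mean of the ranks they span, and that the corrected coefficient is 1 - (Σd² + T_x + T_y) / (n(n² - 1)/6). The function names and the toy data are hypothetical and are not the series used in the tables.

```python
from collections import Counter


def midranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def tie_term(values):
    """Assumed tie term T: sum of (t^3 - t)/12 over groups of t tied values."""
    return sum((t**3 - t) / 12 for t in Counter(values).values())


def spearman(x, y, correct_ties=True):
    """Rank correlation from squared rank differences, optionally tie-corrected."""
    n = len(x)
    rx, ry = midranks(x), midranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    if correct_ties:
        sum_d2 += tie_term(x) + tie_term(y)  # assumed form of the correction
    return 1 - sum_d2 / (n * (n * n - 1) / 6)


def group(values, width):
    """Coarse grouping: replace each integer value by its class-interval index."""
    return [v // width for v in values]


# Arbitrary toy data (not from the paper).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 0, 3, 2, 5, 4, 7, 6]

for width in (1, 2, 4, 8):
    gx, gy = group(x, width), group(y, width)
    print(width,
          tie_term(gx) + tie_term(gy),
          spearman(gx, gy, correct_ties=False),
          spearman(gx, gy, correct_ties=True))

# Extreme case: width 8 puts every value in one class, so all ranks are tied.
gx, gy = group(x, 8), group(y, 8)
print(spearman(gx, gy, correct_ties=False))  # 1.0
print(spearman(gx, gy, correct_ties=True))   # 0.0
```

In the extreme case every observation is tied, all rank differences vanish, and the uncorrected coefficient is +1 whatever the underlying data, whereas under the assumed correction it is 0. The gap between the two coefficients is always (T_x + T_y) / (n(n² - 1)/6), so it grows with the number and size of the ties.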
The fact is that as soon as there is a single tie, uncorrected ρ can no longer
take all values between +1 and -1, and if one of the scales be reversed the
correlation instead of being $-\rho$ becomes
$-\rho + \dfrac{T_x + T_y}{n(n^2 - 1)/6}$. We are therefore forced to
use the correction, which after all gives us the distribution of ρ that we should
get from ideal material containing no ties.
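
The effect of reversing a scale can be checked numerically. The sketch below uses hypothetical helper names and arbitrary toy data, and assumes tied values receive the mean of the ranks they span. It correlates a series with itself and with its own reversal. Without ties the two uncorrected coefficients are +1 and -1; with a single tie the reversed coefficient falls short of -1.

```python
def midranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rho_uncorrected(x, y):
    """Uncorrected rank correlation: 1 - (sum of d^2) / (n(n^2 - 1)/6)."""
    n = len(x)
    rx, ry = midranks(x), midranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - sum_d2 / (n * (n * n - 1) / 6)


def reverse_scale(values):
    """Reverse the direction of a scale by negating every value."""
    return [-v for v in values]


untied = [1, 2, 3, 4, 5]
tied = [1, 1, 2, 3, 4]  # one tie between the first two observations

# No ties: a series against its own reversal gives exactly -1.
print(rho_uncorrected(untied, untied))                  # 1.0
print(rho_uncorrected(untied, reverse_scale(untied)))   # -1.0

# One tie: agreement is still +1, but the reversal no longer reaches -1.
print(rho_uncorrected(tied, tied))                             # 1.0
print(round(rho_uncorrected(tied, reverse_scale(tied)), 4))    # -0.9
```

With the tie present, the coefficient for the reversed scale is not the negative of the original one, which is the asymmetry the correction is meant to remove.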
* The low value of ρ for much tied samples is due to the fact that a much tied sample is as a rule
one in which the s.d. of the original variables is low.
Now as a matter of experience I find that of samples drawn from a normally distributed population 
those with s.d. above the average tend to give high and stable values of the correlation coefficient, 
while those with s.d. below the average tend to give low and variable values. 
The form of the correlation surface for variables 0-^. and is of considerable interest to those who 
have to deal with small samples and merits the attention of mathematicians. I hope to deal with the 
experience obtained from my samples at some later time. 
