PROBABLE ERROR OF A CORRELATION 
COEFFICIENT. 
By STUDENT. 
At the discussion of Mr R. H. Hooker's recent paper "The correlation of the 
weather and crops" (Journ. Royal Stat. Soc. 1907) Dr Shaw made an enquiry 
as to the significance of correlation coefficients derived from small numbers 
of cases. 
His question was answered by Messrs Yule and Hooker and Professor Edgeworth, 
all of whom considered that Mr Hooker was probably safe in taking .50 as his 
limit of significance for a sample of 21. They did not, however, answer Dr Shaw's 
question in any more general way. Now Mr Hooker is not the only statistician 
who is forced to work with very small samples, and until Dr Shaw's question has 
been properly answered the results of such investigations lack the criterion which 
would enable us to make full use of them. The present paper, which is an account 
of some sampling experiments, has two objects: (1) to throw some light by empirical 
methods on the problem itself, (2) to endeavour to interest mathematicians who 
have both time and ability to solve it. 
Before proceeding further, it may be as well to state the problem which occurs 
in practice, for it is often confused with other allied questions. 
A random sample has been obtained from an indefinitely large* population 
and r† calculated between two variable characters of the individuals composing the 
sample. We require the probability that R for the population from which the sample 
is drawn shall lie between any given limits. 
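
A sampling experiment of the kind this paper reports can be sketched in modern terms as follows. The sketch is illustrative only and is not the procedure used in the paper: it assumes a bivariate normal population, and the names `sample_r` and `fraction_exceeding`, the number of trials and the seed are all hypothetical. It draws repeated samples of 21 pairs from a population whose correlation R is fixed in advance, and counts how often the sample r reaches a chosen limit such as .50. Note that this describes how r behaves for a given R; it does not by itself give the probability that R lies between given limits.

```python
import math
import random


def sample_r(n, big_r, rng):
    """Draw n pairs from a bivariate normal population (an assumption of
    this sketch) with correlation big_r, and return the sample r."""
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        xs.append(u)
        # Mixing u and v in these proportions gives correlation big_r.
        ys.append(big_r * u + math.sqrt(1.0 - big_r ** 2) * v)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fraction_exceeding(n, big_r, limit, trials, seed=0):
    """Fraction of simulated samples whose |r| is at least `limit`."""
    rng = random.Random(seed)
    hits = sum(abs(sample_r(n, big_r, rng)) >= limit for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Samples of 21 from an uncorrelated population, limit .50.
    print(fraction_exceeding(n=21, big_r=0.0, limit=0.5, trials=2000))
```

Repeating the call with other values of `big_r` shows how the spread of r changes with R, which is the empirical side of the question raised above.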
It is clear that in order to solve this problem we must know two things: (1) the 
distribution of values of r derived from samples of a population which has a given 
* Note that the indefinitely large population need not actually exist. In Mr Hooker's case his 
sample was 21 years of farming under modern conditions in England, and included all the years about 
which information was obtainable. Probably it could not actually have been made much larger 
without loss of homogeneity, due to the mixing with farming under conditions not modern; but one 
can imagine the population indefinitely increased and the 21 years to be a sample from this. 
† Throughout the rest of this paper "r" is written for the correlation coefficient of a sample and R 
for the correlation coefficient of a population. 
