XXXII-XXXIV] Introduction



It will thus be clear to the reader that the so-called "probable error" of r, or
$.67449\,(1 - r^2)/\sqrt{n-1}$, contains no very reliable information as to the
accuracy with which r represents ρ when:



(i) the sample is "small," for any value of ρ;

(ii) the sample is considerable but ρ is high, say .75 or over.
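
For reference, the conventional "probable error" quoted above can be evaluated as in the following minimal sketch. The function name `probable_error_r` and its argument checks are illustrative assumptions, not anything given in the text; only the constant .67449 and the form (1 - r^2)/sqrt(n - 1) come from the passage.

```python
import math

# Constant quoted in the text; it converts a standard deviation
# into a "probable error".
PE_FACTOR = 0.67449


def probable_error_r(r, n):
    """Conventional probable error of r: 0.67449 * (1 - r^2) / sqrt(n - 1).

    Hypothetical helper for illustration. It evaluates the formula only and
    does not say how far the formula can be trusted.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return PE_FACTOR * (1.0 - r * r) / math.sqrt(n - 1)


# Example: r = 0.5 from a sample of 26 gives roughly 0.101.
example_value = probable_error_r(0.5, 26)
```

As the passage warns, a figure obtained this way is least trustworthy in the two cases just listed, because the distribution of r is then far from normal.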



For samples of two the frequency distribution of r is a double lump; for those
of three a U-shaped curve; for those of four a J-curve; for those of five onwards
at first markedly skew distributions, only approaching a normal distribution for
low values of ρ as the size of sample increases; but for high values of ρ we have
distribution curves which, even for considerable samples, diverge much from the
normal curve.



When n is large the "probable error" of r may be taken as
$.67449\,(1 - r^2)/\sqrt{n}$ just as legitimately as
$.67449\,(1 - r^2)/\sqrt{n-1}$, because in deducing this latter value terms in
$1/n^{3/2}$ have already been neglected.
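
The size of the discrepancy between the two forms can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the leading term 0.67449 (1 - r^2) / (2 n^(3/2)) comes from expanding 1/sqrt(n - 1) in powers of 1/n, not from the text.

```python
import math

PE_FACTOR = 0.67449  # constant quoted in the text


def pe_with_n_minus_1(r, n):
    """Probable error using sqrt(n - 1) in the denominator."""
    return PE_FACTOR * (1.0 - r * r) / math.sqrt(n - 1)


def pe_with_n(r, n):
    """Probable error using sqrt(n) in the denominator."""
    return PE_FACTOR * (1.0 - r * r) / math.sqrt(n)


def difference(r, n):
    """Gap between the two forms; positive for n > 1 and |r| < 1."""
    return pe_with_n_minus_1(r, n) - pe_with_n(r, n)


def leading_term(r, n):
    """First term of the gap when 1/sqrt(n - 1) is expanded in powers of 1/n."""
    return PE_FACTOR * (1.0 - r * r) / (2.0 * n ** 1.5)


# For r = 0.3 and n = 400 the gap agrees with its leading term to within 1%.
gap = difference(0.3, 400)
approx = leading_term(0.3, 400)
```

Since the gap is itself of order n^(-3/2), choosing between the two denominators is a refinement finer than the accuracy of either expression.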



The student is strongly recommended, when his sample is small and his correlation
coefficient is considerable, to consult the appropriate column of Table XXXII
before drawing very dogmatic inferences as to his correlation coefficient.



This table gives the ordinates at distances of .05 for r, and except for ρ = .8 or .9
and n = 18 and onwards enough ordinates are given to obtain a reasonable statistical
approximation to the areas, and thus to the chance of r falling within certain limits.
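
One simple way to pass from ordinates at spacing .05 to approximate areas is the trapezoidal rule, sketched below. This is an assumed choice of quadrature, not a method stated in the text, and all names are hypothetical. The ordinates are not copied from Table XXXII: as a stand-in they are computed from the known density of r for ρ = 0 and n = 6, namely 0.75 (1 - r^2), whose exact areas are available for comparison.

```python
STEP = 0.05  # spacing of the tabulated ordinates, as stated in the text


def null_density_n6(r):
    """Density of r when rho = 0 and n = 6 (stand-in for tabled ordinates)."""
    return max(0.0, 0.75 * (1.0 - r * r))


def tabulate(density, step=STEP):
    """Ordinates at r = -1, -1 + step, ..., +1."""
    count = round(2.0 / step)
    return [density(-1.0 + i * step) for i in range(count + 1)]


def trapezoid_area(ordinates, step, lo, hi):
    """Trapezoidal area between grid points lo and hi (lo < hi)."""
    i_lo = round((lo + 1.0) / step)
    i_hi = round((hi + 1.0) / step)
    if not 0 <= i_lo < i_hi < len(ordinates):
        raise ValueError("lo and hi must be grid points with -1 <= lo < hi <= 1")
    segment = ordinates[i_lo:i_hi + 1]
    # End ordinates carry half weight, interior ordinates full weight.
    return step * (sum(segment) - 0.5 * (segment[0] + segment[-1]))


table = tabulate(null_density_n6)
total_area = trapezoid_area(table, STEP, -1.0, 1.0)     # exact value is 1
central_area = trapezoid_area(table, STEP, -0.5, 0.5)   # exact value is 0.6875
```

For this smooth stand-in curve the rule is accurate to about three decimal places. Steeper curves, such as those for high ρ and large n, would need more ordinates, which is consistent with the exception noted above.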



These frequency curve ordinates were calculated with the intention of deducing 

 the areas of the curve from them and so forming a probability integral table for 

 n, ρ and r. It is hoped that this may be achieved eventually, but for an accurate 

 table so many more ordinates are requisite, that the great labour necessary has 

 hindered progress in this direction. 



The method of calculating the ordinates of the frequency curves for n = 25 and
under is fully described in the original memoir*. For n = 25 and upwards formula
(i) below, which is an approximation, gives excellent results, and the formula even
for n = 10 is good enough for practical purposes. Below that it is less reliable and
requires more terms. If y (n, r, ρ) be the required ordinate, i.e. the ordinate at r in
a set of samples of size n from a parent population of correlation coefficient ρ, then



( P .r) 



(n 





where log x (p> r) = - (n - 1) log xi - log x* 



and l 



* Biometrika, Vol. XI. pp. 329-332.






