Sec. 7.4 COEFFICIENTS OF LINEAR CORRELATION






samples of pairs of values of X and Y from Table 7.21 for which
$\rho = .749$. This population was considered to be approximately a
normal bivariate population. The 190 sampling r's thus obtained
are summarized in Table 7.41 along with the corresponding z's (see
discussion of z below Figure 7.41). The distribution of r for a $\rho$ as
large as this is definitely skewed, as can be seen to some extent in
Table 7.41 and in Figure 7.41.
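The skewness described above can be reproduced in a small simulation. The following is a minimal sketch, not the procedure used for Table 7.41: it assumes an exactly normal bivariate population generated by the computer rather than the population of Table 7.21, and it draws 5000 samples instead of 190 so that the skewness estimate is stable. The helper names `sample_r` and `skewness` and the seed are illustrative choices.

```python
import math
import random


def sample_r(rho, n, rng):
    """Draw n pairs from a standard bivariate normal population with
    correlation rho and return the sample correlation coefficient r."""
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        xs.append(u)
        # y = rho*u + sqrt(1 - rho^2)*v has unit variance and
        # correlation rho with x = u.
        ys.append(rho * u + math.sqrt(1.0 - rho * rho) * v)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5."""
    k = len(values)
    mean = sum(values) / k
    m2 = sum((v - mean) ** 2 for v in values) / k
    m3 = sum((v - mean) ** 3 for v in values) / k
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
rs = [sample_r(0.749, 12, rng) for _ in range(5000)]  # n = 12 per sample
print("mean of r:", round(sum(rs) / len(rs), 3))
print("skewness of r:", round(skewness(rs), 3))
```

A negative skewness here means a long tail toward small values of r, as expected when r is bounded above by 1 and the population correlation is large and positive.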



Figure 7.41. Sampling frequency distribution of the correlation coefficient, r,
and of the corresponding $z = (1/2) \log_e [(1 + r)/(1 - r)]$. n = 12.
[Horizontal axis: r or z, marked from -.20 to 1.80.]
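The z plotted in Figure 7.41 is z = (1/2) log_e[(1 + r)/(1 - r)], and it can be computed directly from r. The following is a minimal sketch; the function names `fisher_z` and `r_from_z` are illustrative, and the values of r in the loop are arbitrary.

```python
import math


def fisher_z(r):
    """Return z = (1/2) * log_e[(1 + r) / (1 - r)] for -1 < r < 1."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def r_from_z(z):
    """Invert the transformation: r = tanh(z)."""
    return math.tanh(z)


# r is confined to (-1, 1) but z is not, so values of r near 1
# are spread out over a longer stretch of the z scale.
for r in (0.20, 0.60, 0.80, 0.90, 0.95):
    print(f"r = {r:.2f}  ->  z = {fisher_z(r):.4f}")
```

The same quantity is the inverse hyperbolic tangent of r, so `math.atanh(r)` from the Python standard library should give the same value. This also shows why the horizontal axis of the figure runs beyond 1.00: r cannot exceed 1, but z can.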



It was found by R. A. Fisher that under these circumstances it is 

 helpful to use the following function of r: 



$$
z = \frac{1}{2}\log_e\left[\frac{1+r}{1-r}\right]
  = \frac{2.30259}{2}\log_{10}\left[\frac{1+r}{1-r}\right]
  = 1.1513\,\log_{10}\left[\frac{1+r}{1-r}\right]
\tag{7.44}
$$



because its sampling distribution is essentially normal in all important
features even when $\rho$ is definitely $\neq 0$. Moreover, its variance is
given, to a close approximation, by $\sigma_z^2 = 1/(n - 3)$. This is not an
estimate computed from the sample but a property of the sampling
distribution of z, depending only on n. It follows that, as a good
approximation, the quantity $y = (z - z_\rho)/\sigma_z$, where $z_\rho$ is the z
corresponding to $\rho$ in (7.44), is normally distributed. Hence, Table III
gives the probabilities needed in tests of hypotheses regarding $\rho$ or in
the calculation of



