220 LINEAR REGRESSION AND CORRELATION Ch. 7 



confidence intervals on $\rho$. For example, consider the first sample
drawn for Table 7.41. Here n = 12 and r = .668; therefore, by formula
7.44,



$$z = 1.1513 \log_{10}\left(\frac{1.668}{0.332}\right) = 1.1513 \log_{10} 5.024 = 0.807, \quad \text{and} \quad \sigma_z = 1/\sqrt{9} = 0.333.$$
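The arithmetic above can be checked with a short Python sketch. It assumes that formula 7.44 is the usual transformation z = (1/2) log_e[(1 + r)/(1 - r)], that 1.1513 is (1/2) log_e 10 rounded to four decimals, and that the standard deviation 1/√9 comes from 1/√(n - 3) with n = 12. The function names are illustrative and do not come from the text.

```python
import math

def fisher_z(r):
    """z = (1/2) * ln((1 + r) / (1 - r)), the assumed form of formula 7.44."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def fisher_z_log10(r):
    """Same transformation written with common logarithms, as in the text.

    The constant 1.1513 is (1/2) * ln(10) rounded to four decimals.
    """
    return 1.1513 * math.log10((1.0 + r) / (1.0 - r))

def sigma_z(n):
    """Standard deviation of z, assumed to be 1 / sqrt(n - 3)."""
    return 1.0 / math.sqrt(n - 3)

r, n = 0.668, 12          # first sample drawn for Table 7.41
z = fisher_z(r)
z_common = fisher_z_log10(r)
s = sigma_z(n)

print(f"z (natural logs) = {z:.3f}")
print(f"z (common logs)  = {z_common:.3f}")
print(f"sigma_z          = {s:.3f}")
```

The two forms of z agree to the three decimals used in the text, and the transformation is the same as the inverse hyperbolic tangent, `math.atanh(r)`.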



Then, since $y = (0.807 - z_\rho)/0.333$ is a member of a standard normal
population, the probability distribution of Table III can be used. If
a confidence coefficient .95 is chosen, the inequality

$$-1.96 < \frac{0.807 - z_\rho}{0.333} < +1.96$$

requires that

$$0.154 < z_\rho < 1.460$$



unless a 1 in 20 chance has occurred in this sample. The corresponding
95 per cent confidence interval on $\rho$ is obtained by using formula 7.44
and solving for $\rho$, which now replaces r. Thus,



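$$z_{\rho_1} = \text{lower limit} = 0.154 = \frac{1}{2}\log_e\left(\frac{1+\rho_1}{1-\rho_1}\right); \text{ or}$$

$$\frac{1+\rho_1}{1-\rho_1} = e^{0.308}.$$

But $\log_{10} e^{0.308} = 0.308 \log_{10} e = 0.308(0.4343) = 0.134$; and

$$\text{Anti-log } 0.134 = 1.36 = \frac{1+\rho_1}{1-\rho_1}.$$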



Hence $1.36(1 - \rho_1) = 1 + \rho_1$, or $2.36\rho_1 = 0.36$, so that $\rho_1 = .153$. Similarly $\rho_2$ = upper limit
of the 95 per cent confidence interval = .898; therefore, the 95 per
cent confidence interval on $\rho$ is

$$.153 < \rho < .898,$$



which is a very wide interval but does include the true $\rho$, known in
this case to be .749. If a relatively narrow confidence interval is
needed, it is apparent that a rather large sample must be taken.
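The whole procedure, and the effect of sample size on the width of the interval, can be sketched in Python as follows. This is a minimal illustration, not the text's own computation: it assumes the transformation of formula 7.44 is the inverse hyperbolic tangent, that the standard deviation of z is 1/√(n - 3), and that 1.96 is the normal deviate for a confidence coefficient of .95. The function name and the larger sample sizes (50 and 200, with r held at .668) are hypothetical and chosen only to show the trend.

```python
import math

def rho_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval on rho from a sample r of size n.

    Steps: transform r to z, step z_crit standard deviations to either
    side on the z scale, then transform both limits back to the rho scale.
    """
    z = math.atanh(r)                      # (1/2) * ln((1 + r) / (1 - r))
    half_width = z_crit / math.sqrt(n - 3)  # z_crit * sigma_z
    return math.tanh(z - half_width), math.tanh(z + half_width)

# The sample discussed in the text: n = 12, r = .668.
lower, upper = rho_confidence_interval(0.668, 12)
print(f"n =  12: {lower:.3f} < rho < {upper:.3f}")

# Hypothetical larger samples with the same r, to show the interval narrowing.
widths = {}
for n in (12, 50, 200):
    lo, hi = rho_confidence_interval(0.668, n)
    widths[n] = hi - lo
    print(f"n = {n:3d}: width = {hi - lo:.3f}")
```

Because the half-width on the z scale is proportional to 1/√(n - 3), roughly four times as many observations are needed to halve it.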



Figure 7.41 shows that the sample correlation coefficient varies over
a considerable range even when $\rho$ is as large as .749. As a matter of
fact, one sample r out of 190 was negative in spite of the relatively high
positive correlation. This figure also shows, to a useful degree, the
normalizing effect of the transformation $z = \frac{1}{2}\log_e\left(\frac{1+r}{1-r}\right)$. Given



