Miscellanea 
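Substitute also in (i):
$$\rho = \frac{1}{\rho}\,\mathop{S}\limits_{p=1}^{n}\left(b_p\, r_{p,n+1}\right) = -\frac{1}{\rho}\,\mathop{S}\limits_{p=1}^{n}\left(\frac{R_{p,n+1}\, r_{p,n+1}}{R_{n+1,n+1}}\right),$$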
but:
$$R = r_{1,n+1}R_{1,n+1} + r_{2,n+1}R_{2,n+1} + \ldots + r_{n+1,n+1}R_{n+1,n+1}.$$
Hence:
$$\rho^2 = 1 - \frac{R}{R_{n+1,n+1}} \qquad \text{(iv)},$$
$$1 - \rho^2 = 1 + \frac{1}{R_{n+1,n+1}}\,\mathop{S}\limits_{p=1}^{n}\left(R_{p,n+1}\, r_{p,n+1}\right) = \frac{R}{R_{n+1,n+1}}.$$
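Hence we have the following results:
$$u = -\frac{1}{\rho}\,\mathop{S}\limits_{p=1}^{n}\left(\frac{R_{p,n+1}}{R_{n+1,n+1}}\,\frac{x_p}{\sigma_p}\right),$$
and
$$\sigma_{n+1}\sqrt{1-\rho^2} = \sigma_{n+1}\sqrt{\frac{R}{R_{n+1,n+1}}} \qquad \text{(vi)},$$
and is the reduced average variability of $x_{n+1}$ for given values of $x_1, x_2, \ldots x_n$.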
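Results (iv) and (vi) can be checked numerically. The following is a minimal sketch, assuming NumPy and an invented data set (the sample size, coefficients and seed are illustrative only and are not taken from the text). It compares the determinant formula with the squared correlation between $x_{n+1}$ and its least-squares linear fit, and the reduced variability of (vi) with the standard deviation of the residuals.

```python
import numpy as np

def multiple_correlation_sq(corr):
    """rho^2 = 1 - R / R_{n+1,n+1}; the dependent variable is the last one."""
    R = np.linalg.det(corr)
    # Cofactor of the last diagonal element: the determinant of the
    # correlations among the first n variables only.
    R_last = np.linalg.det(corr[:-1, :-1])
    return 1.0 - R / R_last

# Hypothetical data: n = 3 predictors and one dependent variable.
rng = np.random.default_rng(0)
N, n = 500, 3
X = rng.normal(size=(N, n))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=N)

corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
rho_sq = multiple_correlation_sq(corr)

# Direct route: fit the best linear function and correlate it with y.
A = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ coef
rho_sq_direct = np.corrcoef(y, fitted)[0, 1] ** 2

# Equation (vi): sigma_{n+1} * sqrt(1 - rho^2) against the residual s.d.
# (both standard deviations use divisor N).
resid_sd = np.std(y - fitted)
resid_sd_formula = np.std(y) * np.sqrt(1.0 - rho_sq)

print(rho_sq, rho_sq_direct)
print(resid_sd, resid_sd_formula)
```

The two routes should agree to rounding error, since the determinant ratio and the least-squares fit are computed from the same sample moments.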
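The probable value $\tilde{x}_{n+1}$ of $x_{n+1}$ is given by
$$\tilde{x}_{n+1} - \bar{x}_{n+1} = \sigma_{n+1}\,\rho\,(u - \bar{u}) = -\mathop{S}\limits_{p=1}^{n}\left(\frac{R_{p,n+1}}{R_{n+1,n+1}}\,\frac{\sigma_{n+1}}{\sigma_p}\,(x_p - \bar{x}_p)\right) \qquad \text{(vii)},$$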
i.e. the ordinary multiple regression formula. It is the "best value," i.e. the mean value of
$x_{n+1}$, for given $x_1 \ldots x_n$, on the assumption that we correlate $x_{n+1}$ with that linear function
of the n variables, which gives the highest degree of relationship as measured by the correlation 
coefficient. The method is absolutely independent (i) of Gaussian theory, (ii) of the continuity 
or discreteness of the variables, but it does assume that linearity applies within the degree 
of useful approximation*. 
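The identity of the determinant form of the regression formula with an ordinary least-squares fit can be illustrated as follows. This is a sketch only, assuming NumPy; the data, the helper names `cofactor` and `cofactor_slopes`, and the new point `new_x` are hypothetical. The disturbance is drawn from a uniform distribution to reflect the remark that the method does not depend on Gaussian theory.

```python
import numpy as np

def cofactor(mat, i, j):
    """Signed minor of mat obtained by deleting row i and column j."""
    minor = np.delete(np.delete(mat, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

def cofactor_slopes(data):
    """Slopes -(R_{p,n+1} / R_{n+1,n+1}) * sigma_{n+1} / sigma_p, p = 1..n.

    The dependent variable is taken to be the last column of data.
    """
    corr = np.corrcoef(data, rowvar=False)
    sd = data.std(axis=0)
    last = data.shape[1] - 1
    R_last = cofactor(corr, last, last)
    return np.array([
        -cofactor(corr, p, last) / R_last * sd[last] / sd[p]
        for p in range(last)
    ])

# Hypothetical data with correlated predictors and non-Gaussian disturbance.
rng = np.random.default_rng(1)
N = 400
x1 = rng.normal(size=N)
x2 = 0.6 * x1 + rng.normal(size=N)
y = 1.5 * x1 - 0.7 * x2 + rng.uniform(-1.0, 1.0, size=N)
data = np.column_stack([x1, x2, y])

slopes = cofactor_slopes(data)

# Ordinary least squares with an intercept, for comparison.
A = np.column_stack([np.ones(N), x1, x2])
ols = np.linalg.lstsq(A, y, rcond=None)[0]
ols_slopes = ols[1:]

# Probable value of the dependent variable at a new point.
new_x = np.array([0.5, -1.0])
means = data.mean(axis=0)
probable = means[-1] + slopes @ (new_x - means[:-1])
ols_prediction = ols[0] + ols_slopes @ new_x

print(slopes, ols_slopes)
print(probable, ols_prediction)
```

Cofactors are computed by brute force here for clarity; for many variables, inverting the correlation matrix once would be the more economical route.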
Another point deserves re-emphasising here. Equation (iv) gives $\rho^2$, hence whether $\rho$ be
plus or minus, the errors of random sampling will always give a positive $\rho^2$. It follows therefore
that even if $\rho$ be zero, we should find on making a number of trials in each case a positive value
of $\rho^2$; let the mean value of this be $\bar{\rho}^2$, then unless the actual value of $\rho^2$ is significant not
as compared with zero, but with $\bar{\rho}^2$, no value ought to be laid on the actual value of $\rho^2$. The
* The general linearity ought to be tested in all such cases. Nothing can be learnt of association by 
assuming linearity in a case with a regression line (plane, etc.) like A, much in a case like B. To A 
we must apply multiple correlation-ratios, the theory of which is being developed at the present time 
and will shortly be published. 
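The point about random sampling, that $\rho^2$ computed from (iv) comes out positive even when the true $\rho$ is zero, can be illustrated by simulation. The sketch below assumes NumPy and draws all $n+1$ variables independently; the normal distribution, the sample size `N = 30`, `n = 4` and the number of trials are arbitrary choices made for illustration, not values from the text.

```python
import numpy as np

def multiple_correlation_sq(corr):
    """rho^2 = 1 - R / R_{n+1,n+1}; the dependent variable is the last one."""
    return 1.0 - np.linalg.det(corr) / np.linalg.det(corr[:-1, :-1])

def null_rho_sq(N, n, trials, seed=0):
    """Sample values of rho^2 when all n + 1 variables are independent."""
    rng = np.random.default_rng(seed)
    values = np.empty(trials)
    for t in range(trials):
        data = rng.normal(size=(N, n + 1))  # true rho is zero by construction
        values[t] = multiple_correlation_sq(np.corrcoef(data, rowvar=False))
    return values

values = null_rho_sq(N=30, n=4, trials=2000)
mean_rho_sq = values.mean()
min_rho_sq = values.min()

print(mean_rho_sq, min_rho_sq)
```

For independent normal samples the mean of $\rho^2$ is known to be $n/(N-1)$, here $4/29$, so an observed $\rho^2$ of that order carries no evidence of association; it is this mean, not zero, against which an observed value has to be judged.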
