Notes on the History of Correlation 
I should sum up Edgeworth's work of 1892 by saying that he left the problem 
of multiple correlation at least in a very incomplete state. He probably knew what 
he was seeking himself, but he did not give the requisite attention to the wording 
or printing of his memoir to make it clear to others, and accordingly in looking 
back at the matter now I am very doubtful whether in 1895 I ought to have called 
the problem of multiple correlation, "Edgeworth's Problem." He certainly did not 
put the answer to it in a form in which the statistician with a customary amount 
of mathematical training could determine the form of the surface for n variates, as 
soon as their S.D.'s and correlations had been calculated. I think I am justified in 
saying this for I have not to my recollection come across any treatment of multiple 
correlation which starts from Edgeworth's paper or uses his notation. 
It will be seen from what has gone before that in 1892 the next steps to be 
taken were clearly indicated. They were, I think, 
(a) The abolition of the median and quartile processes as too inexact for 
accurate statistics. 
(b) The replacement of the laborious processes of dividing by the quartiles 
and averaging the deduced values of r, by a direct and if possible 'best' method 
of finding r. 
(c) The determination of the probable errors of r as found by the 'best' and 
other methods. 
(d) The expression of the multiple correlation surface in an adequate and 
simple form. 
These problems were solved by Dr Sheppard or myself before the end of 1897. 
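
In modern terms, steps (b) and (c) can be illustrated by a minimal sketch. It assumes that the direct method is the product-moment formula and that the probable error takes the large-sample normal-theory form 0.6745 (1 - r^2) / sqrt(n). The function names and the small data sets in the usage note are hypothetical and are not taken from the memoirs discussed here.

```python
import math


def product_moment_r(xs, ys):
    """Product-moment correlation of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("a series with no variation has no correlation")
    return s_xy / math.sqrt(s_xx * s_yy)


def probable_error_r(r, n):
    """Large-sample probable error of r, assuming a normal surface.

    0.6745 is the probable-error multiplier of the standard deviation.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)
```

For instance, `product_moment_r([1, 2, 3], [1, 3, 2])` gives 0.5, and `probable_error_r(0.5, 4)` gives 0.6745 x 0.75 / 2. The probable-error formula is a large-sample approximation, so for a series as short as these it is illustrative only.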
Closely associated with these problems arose the question of generalising 
correlation. Why should the distribution be Gaussian, why should the regression 
curve be linear? 
As early as 1893 I dealt with quite a number of correlation tables for long 
series and was able to demonstrate 
(i) by applying Galton's process of drawing contours of equal frequency that 
most smooth and definite systems of contours can arise from long series, obviously 
mathematical families of curves, which are (a) ovaloid, not ellipsoid, and (b) which 
do not possess — like the normal surface contours — more than one axis of symmetry, 
(ii) that regression curves can be quite smooth mathematical curves differing 
widely from straight lines, 
(iii) that in cases wherein (i) and (ii) hold, homoscedasticity is not the rule. 
I obtained differential equations to such systems, but for more than 25 years 
while often returning to them, have failed to obtain their integration. 
This seems to me the desideratum of the theory of correlation at the present 
time: the discovery of an appropriate system of surfaces, which will give bi-variate 
skew frequency. We want to free ourselves from the limitations of the normal 
surface, as we have from the normal curve of errors. 
