Karl Pearson 
is the equation to the regression line; therefore
Whence it follows that 
$$r_{xC_x} = \frac{\sigma_{C_x}}{\sigma_x} \qquad \text{(i)},$$
or the correlation of a variate with its class-mark is the ratio of the standard 
deviation of the means of class-marks to the standard deviation of the variate. 
It follows from this that in classifying a variable into broad categories the test 
of their efficiency, as far as number and arrangement are concerned, lies in the 
standard deviation of their means not differing widely from the standard deviation 
of the variate itself. 
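The relation between the correlation and the two standard deviations can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes that the class-mark of each class is the mean of the observations falling in it, and the function name `class_mark_correlation`, the simulated sample and the class boundaries are all hypothetical choices made for illustration.

```python
import math
import random


def class_mark_correlation(xs, edges):
    """Return (r, sd_ratio) for a variate and its class-marks.

    xs    : observed values of the variate
    edges : sorted interior class boundaries (a hypothetical choice)
    """
    # Assign each observation to a class by counting the boundaries at or below it.
    labels = [sum(x >= e for e in edges) for x in xs]

    # Class-mark = mean of the observations falling in the class (assumption).
    sums, counts = {}, {}
    for x, s in zip(xs, labels):
        sums[s] = sums.get(s, 0.0) + x
        counts[s] = counts.get(s, 0) + 1
    marks = [sums[s] / counts[s] for s in labels]

    n = len(xs)
    mean = sum(xs) / n  # this is also the mean of the class-marks
    var_x = sum((x - mean) ** 2 for x in xs) / n
    var_c = sum((c - mean) ** 2 for c in marks) / n
    cov = sum((x - mean) * (c - mean) for x, c in zip(xs, marks)) / n

    r = cov / math.sqrt(var_x * var_c)       # product-moment correlation
    sd_ratio = math.sqrt(var_c / var_x)      # s.d. of class-marks / s.d. of variate
    return r, sd_ratio


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(5000)]
r, ratio = class_mark_correlation(sample, edges=[-1.0, 0.0, 1.0])
print(round(r, 6), round(ratio, 6))
```

The two printed values agree to rounding error, whatever the boundaries, provided at least two classes are occupied. Widening the classes lowers both together, which is the efficiency test described above.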
(2) Clearly $\sigma_{C_x}^2 = S(n_s \bar{x}_s^2)/N$ (ii),
where S is a sum involving all classes. 
If before classifying into broad categories we have made a quantitative
determination of these classes on a sample frequency, the values of the $\bar{x}_s$'s can be
determined. If this has not been done, or cannot be done, we are bound to
assume some form of frequency distribution. Suppose we take a normal distribution,
then we know that
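$$n_s \bar{x}_s = \frac{N}{\sqrt{2\pi}\,\sigma_x} \int_{x_s}^{x_{s+1}} x\, e^{-\frac{x^2}{2\sigma_x^2}}\, dx = N \sigma_x (z_s - z_{s+1}),$$
where
$$z = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x/\sigma_x)^2},$$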
and can be found from Sheppard's Tables of the ordinates of the normal curve
as soon as $n_s$, etc. are known, for the z's are the ordinates at start and finish of the
sth class, reduced by the factor $N/\sigma_x$.
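Hence
$$r_{xC_x}^2 = \frac{\sigma_{C_x}^2}{\sigma_x^2} = S\left\{\frac{(z_s - z_{s+1})^2}{n_s/N}\right\}.$$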
Thus $r_{xC_x}$ can be found at once from Sheppard's Tables, when the totals of the
broad classes are known. 
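The calculation from class totals can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's own procedure: Python's `statistics.NormalDist` stands in for Sheppard's Tables, the variate is taken to be normal, the classes are ordered and all non-empty, and the function name `r_from_class_totals` is hypothetical. Since $\sigma_x$ cancels in the ratio, the unit normal curve suffices.

```python
import math
from statistics import NormalDist


def r_from_class_totals(totals):
    """Correlation of a normal variate with its class-mark.

    totals : frequencies n_s of the ordered broad classes (all must be > 0)
    """
    n_total = sum(totals)
    std = NormalDist()  # unit normal curve, used in place of printed tables

    # Ordinates z at the class boundaries; the ordinate is zero at the
    # infinite lower and upper ends of the range.
    z = [0.0]
    cumulative = 0
    for n_s in totals[:-1]:
        cumulative += n_s
        boundary = std.inv_cdf(cumulative / n_total)
        z.append(std.pdf(boundary))
    z.append(0.0)

    # r^2 = S{ (z_s - z_{s+1})^2 / (n_s / N) }, summed over all classes.
    r_squared = sum(
        (z[s] - z[s + 1]) ** 2 / (totals[s] / n_total)
        for s in range(len(totals))
    )
    return math.sqrt(r_squared)


# Two equal classes (a division at the median) give sqrt(2/pi).
print(round(r_from_class_totals([50, 50]), 4))  # 0.7979
```

Finer classifications bring the value closer to unity; for example, four classes give a higher value than two.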
(3) Let us now suppose a second variate y and assume that the correlation 
of x and y for practical purposes is linear. Then clearly since a given x will have 
a constant class-mark, the correlation of y and $C_x$ for a constant x is zero; that is
to say that the partial correlation coefficient 
