124 Influence of "Broad Categories" on Correlation 
Let the division be into two categories containing $N \times \tfrac12(1+\alpha)$ and
$N \times \tfrac12(1-\alpha)$ individuals, and let no assumption be made as to the
nature of the frequency.
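Then, $\bar x_1$ and $\bar x_2$ being the means of the two categories measured from the
general mean,
$$\sigma_{C_x}^2 = \tfrac12(1+\alpha)\,\bar x_1^2 + \tfrac12(1-\alpha)\,\bar x_2^2,$$
but
$$\tfrac12(1+\alpha)\,\bar x_1 + \tfrac12(1-\alpha)\,\bar x_2 = 0,$$
and therefore
$$\sigma_{C_x}^2 = \frac{\tfrac12(1-\alpha)}{\tfrac12(1+\alpha)}\,\bar x_2^2.$$
Thus
$$r_{xC_x} = \frac{\sigma_{C_x}}{\sigma_x} = \frac{\bar x_2}{\sigma_x}\sqrt{\frac{\tfrac12(1-\alpha)}{\tfrac12(1+\alpha)}} \tag{xix}$$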
We note therefore that unless we suppose all the frequency in each category
concentrated at one point (a very rare occurrence) $r_{xC_x}$ will always be less than
unity, and therefore in correlating class indices a correction will always have to
be made. We will consider two cases: (i) when the frequency distribution is a
rectangle of length $l$. In this case $\bar x_2 = \tfrac12 l\,\{\tfrac12(1+\alpha)\}$ and
$\sigma_x = l/\sqrt{12}$; we have then
$$r_{xC_x} = \sqrt{3}\,\sqrt{\tfrac12(1+\alpha)\times\tfrac12(1-\alpha)}; \tag{xx}$$
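The values used in case (i) may be checked as follows. This is only a sketch: placing the rectangle on the interval $(0, l)$ is an assumption made here for convenience, and the second category is taken to be the upper one.

```latex
% Rectangle on (0, l), divided at x = l * (1/2)(1 + alpha); general mean at l/2.
\begin{align*}
\bar x_2 &= \tfrac12\bigl\{\, l\cdot\tfrac12(1+\alpha) + l \,\bigr\} - \tfrac12 l
          = \tfrac12 l\,\bigl\{\tfrac12(1+\alpha)\bigr\}, \\
\sigma_x^2 &= \frac{1}{l}\int_0^l \bigl(x - \tfrac12 l\bigr)^2\,dx = \frac{l^2}{12}, \\
r_{xC_x} &= \frac{\bar x_2}{\sigma_x}\sqrt{\frac{\tfrac12(1-\alpha)}{\tfrac12(1+\alpha)}}
          = \frac{\sqrt{12}}{4}\,(1+\alpha)\sqrt{\frac{1-\alpha}{1+\alpha}}
          = \sqrt{3}\,\sqrt{\tfrac12(1+\alpha)\times\tfrac12(1-\alpha)} .
\end{align*}
```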
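(ii) the distribution of frequency is supposed to be Gaussian. In this case
$$\bar x_2 = \frac{z\,\sigma_x}{\tfrac12(1-\alpha)},$$
where $z$ is the ordinate of the normal curve of unit standard deviation at the
point of division, and
$$r_{xC_x} = \frac{z}{\sqrt{\tfrac12(1-\alpha)\times\tfrac12(1+\alpha)}} \tag{xxi}$$
in other words, $r_{xC_x}$ is the reciprocal of the $\chi_\alpha$ already tabled in this
number of Biometrika, p. 27.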
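The Gaussian value of $\bar x_2$ may be checked by taking the mean of the tail beyond the point of division. In the sketch below $h$, the abscissa of the division in units of $\sigma_x$, is a symbol introduced here for illustration only; $z$ is taken to be the ordinate of the unit normal curve at $h$, and the second category is assumed to be the upper tail.

```latex
\begin{align*}
\tfrac12(1-\alpha) &= \int_h^\infty \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt, \qquad
z = \frac{1}{\sqrt{2\pi}}\,e^{-h^2/2}, \\
\bar x_2 &= \frac{\sigma_x}{\tfrac12(1-\alpha)} \int_h^\infty \frac{t}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt
          = \frac{z\,\sigma_x}{\tfrac12(1-\alpha)}, \\
r_{xC_x} &= \frac{\bar x_2}{\sigma_x}\sqrt{\frac{\tfrac12(1-\alpha)}{\tfrac12(1+\alpha)}}
          = \frac{z}{\sqrt{\tfrac12(1-\alpha)\times\tfrac12(1+\alpha)}} .
\end{align*}
```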
We have the following results:

Value of ½(1 + α)          Value of r_{xC_x}
as percentage          Gaussian      Rectangular
of total               Frequency     Frequency

     50 %                .798           .866
     60 %                .789           .849
     70 %                .759           .794
     80 %                .700           .693
     90 %                .585           .520
     95 %                .473           .377
     99 %                .268           .172

It will be seen at once how rapidly the corrective factor $1/r_{xC_x}$ rises as the
division of the categories becomes more and more unequal. But in both cases a
very sensible correction is needful, and its nature depends on the particular
frequency.
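
The tabled values and the corrective factor can be recomputed numerically. The following is a minimal Python sketch, not part of the original paper: the function names are hypothetical, `p` stands for $\tfrac12(1+\alpha)$ expressed as a fraction, and the Gaussian case assumes that $z$ is the ordinate of the unit normal curve at the point of division.

```python
from math import sqrt
from statistics import NormalDist


def r_rectangular(p):
    """Correlation of x with its class index for a rectangular frequency.

    p is the fraction of the total in the first category, i.e. (1 + alpha) / 2.
    """
    return sqrt(3.0) * sqrt(p * (1.0 - p))


def r_gaussian(p):
    """Correlation of x with its class index for a Gaussian frequency.

    Assumes z is the unit normal ordinate at the point of division.
    """
    unit_normal = NormalDist()      # mean 0, standard deviation 1
    h = unit_normal.inv_cdf(p)      # abscissa of the point of division
    z = unit_normal.pdf(h)          # ordinate at that point
    return z / sqrt(p * (1.0 - p))


def results(percentages=(50, 60, 70, 80, 90, 95, 99)):
    """Return rows (percentage, Gaussian r, rectangular r), rounded to 3 places."""
    rows = []
    for pct in percentages:
        p = pct / 100.0
        rows.append((pct, round(r_gaussian(p), 3), round(r_rectangular(p), 3)))
    return rows


if __name__ == "__main__":
    print("pct   Gaussian  Rectangular  1/r Gauss.  1/r Rect.")
    for pct, r_g, r_r in results():
        print(f"{pct:3d}   {r_g:8.3f}  {r_r:11.3f}  {1 / r_g:10.3f}  {1 / r_r:9.3f}")
```

The last two printed columns give the corrective factor $1/r_{xC_x}$ for each frequency, which makes its growth with increasingly unequal division directly visible.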
