178 Miscellanea 
see from (vi) that r is an absolute maximum. Clearly δr/r is always negative, even for inter-
changes between arrays at considerable distances. Thus we conclude that if there be one arrange-
ment of the material for which the regression line is linear, then any interchanges, however
extensive, will reduce the value of the correlation as calculated by the product moment method.
This conception of the linear regression line as giving the arrangement with the maximum 
degree of correlation appears of considerable philosophical interest. It amounts practically to 
much the same thing as saying that if we have a line classification, we shall get the maximum of
correlation by arranging the arrays so that the means of the arrays fall as closely as possible on 
a line. 
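This claim may be illustrated numerically. The following Python sketch (all data invented for illustration, not drawn from the note) builds five y-classes whose x-means fall exactly on a line, then interchanges the scale positions of two of the classes and shows that the product moment correlation falls:

```python
# A sketch with invented data: five y-classes scored 0..4; within class s,
# x takes the values 2s - 1, 2s, 2s + 1, so the x-means of the classes
# (namely 2s) lie exactly on a line: the maximising arrangement.
def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    vu = sum((a - mu) ** 2 for a in u) / n
    vv = sum((b - mv) ** 2 for b in v) / n
    return cov / (vu * vv) ** 0.5

ys = [s for s in range(5) for _ in range(3)]
xs = [2 * s + e for s in range(5) for e in (-1, 0, 1)]
r_linear = pearson(xs, ys)

# interchange the scale positions of the classes scored 1 and 3
swap = {1: 3, 3: 1}
r_swapped = pearson(xs, [swap.get(y, y) for y in ys])

print(round(r_linear, 4), round(r_swapped, 4))  # 0.9608 0.5765
```

The interchange leaves the marginal distribution of y unaltered (the scores are merely permuted), so the whole of the fall in r comes from the product moment term, as the note argues.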
Further, if the mean square of the interchanges, i.e. the expression

S(V_s δy_s²)/N,

be small as compared with the standard deviation squared, i.e. σ_y², then the change δr will not be
sensible. In other words small changes in the scale ordering, not confined to adjacent or even to 
two arrays, will not sensibly modify the correlation as found by the product moment method. 
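A numerical sketch of this point (invented data; the class scores and the displacements are arbitrary choices): the class scores of a linear arrangement are perturbed slightly, so that the mean square of the displacements, 0.005, is small beside σ_y² = 2, and r is scarcely altered:

```python
# Sketch with invented data: the same linear arrangement as before,
# with the class scores 0..4 displaced by (0, +0.1, -0.1, +0.05, -0.05).
# Mean square of the displacements = 0.005, against sigma_y^2 = 2.
def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    vu = sum((a - mu) ** 2 for a in u) / n
    vv = sum((b - mv) ** 2 for b in v) / n
    return cov / (vu * vv) ** 0.5

ys = [s for s in range(5) for _ in range(3)]
xs = [2 * s + e for s in range(5) for e in (-1, 0, 1)]
r_before = pearson(xs, ys)

score = {0: 0.0, 1: 1.1, 2: 1.9, 3: 3.05, 4: 3.95}
r_after = pearson(xs, [score[y] for y in ys])

print(round(r_before - r_after, 5))  # a change of order 0.001 only
```

The change is negative, as it must be (the linear arrangement is the maximum), but of the order of the mean square displacement divided by σ_y², and hence insensible.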
Lastly, considering the proof of (vi) we see that no portion of the investigation is dependent
on the whole of the one y-array being interchanged with the whole of another. We may consider
V_s δy_s and V_{s'} δy_{s'} as only portions of the total array — to be taken, however, proportionately from
all its constituents. Now let V_s δy_s and V_{s'} δy_{s'} denote the whole of the frequency of the two
arrays, and write the first array V_s δy_s + ½m − ½m and the second array V_{s'} δy_{s'} − ½m + ½m. Now
transfer the −½m of the first array to the position of the second and the +½m of the second to
the position of the first, i.e. take V_s δy_s = −½m and V_{s'} δy_{s'} = +½m; it follows that V_s δy_s + V_{s'} δy_{s'} = 0
and the two arrays are

V_s δy_s + m and V_{s'} δy_{s'} − m,

i.e. exactly the values they would have had if a portion of the second array drawn at random
from all its sub-groups had been inscribed in the same sub-groups of the first array. But in this
case we see, since V_s δy_s + V_{s'} δy_{s'} = 0, that (vi) will give us absolutely δr = 0, or there will be no
change in the correlation. This result seems of considerable value. Suppose the regression 
linear, and one character, x say, easily measured or known ; then if a number m of individuals 
which ought to fall into a given class of y, be shifted by oversight or error of judgment into a 
second erroneous class of y, this will not sensibly affect the correlation if, N being the total
frequency, the square of the ratio m/N is negligible as compared with its first power. Thus
suppose in correlating age with hair tint, the first character being accurately known, an observer 
were to place his series of contributory observations of hair tint in the wrong group, say in one 
of the brown reds instead of pure browns, this would not sensibly modify the resulting correlation. 
The fact that the error would not produce a modification is not in the first place due to the 
possible smallness of the misplaced group. The product moment is changed and the standard 
deviation is also modified, but the modification of the correlation depends in such a manner on the
changes of these two, that they act in opposite senses and cancel the modification, provided the 
original regression was strictly linear. 
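The misclassification result may likewise be sketched numerically (invented data throughout): m = 3 of N = 150 individuals belonging to one y-class, drawn proportionately from its x sub-groups, are relabelled into the adjacent class, and the correlation is scarcely changed:

```python
# Sketch with invented data: 150 individuals in five y-classes of 30;
# class s holds x = 2s - 1, 2s, 2s + 1, ten copies of each, so the
# x-means of the classes lie exactly on a line (linear regression).
def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    vu = sum((a - mu) ** 2 for a in u) / n
    vv = sum((b - mv) ** 2 for b in v) / n
    return cov / (vu * vv) ** 0.5

data = [(2 * s + e, s) for s in range(5) for e in (-1, 0, 1) for _ in range(10)]
xs = [x for x, _ in data]
ys = [y for _, y in data]
r_true = pearson(xs, ys)

# misplace m = 3 individuals of the class scored 3 (one from each of its
# x sub-groups, i.e. proportionately) into the adjacent class scored 2:
# only the y-label changes, the x values are untouched
ys_mis = list(ys)
for x_target in (5, 6, 7):
    i = next(i for i in range(len(ys_mis))
             if ys_mis[i] == 3 and xs[i] == x_target)
    ys_mis[i] = 2
r_mis = pearson(xs, ys_mis)

print(round(r_true, 4), round(r_mis, 4))  # 0.9608 0.9561
```

The product moment and the standard deviation of y are both altered, but in opposite senses, and with m/N = 0.02 the residual change in r is below 0.005, in line with the note's argument.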
While not desiring to encourage carelessness in observing or tabling or in the formation of 
scale orders without due consideration, still the results of this note seem to indicate that in 
many cases absolute unanimity of judgment in classifying or great stress on small details of scale 
grouping are not needful in order to reach sensibly identical values of the correlation. This 
view coincides with my actual and not unique experience, when having been in grave doubt as to 
where 30 or 40 individuals were to be placed, I put them first in one category and then in a 
second, only to find out that the correlation worked out with the group first in one and then in 
the other category was sensibly identical. The theorems developed in this note seem to explain 
this stability — when we use not contingency but product moment methods, and suppose the 
regression ultimately linear. 
