BULLETIN OF THE BUREAU OF FISHERIES 
This is not the place to give a full description of the Analysis of Variance, but I propose to
discuss shortly in nonmathematical language the assumptions on which it is founded, since I did
not myself find Fisher's own description (Fisher (4)) very easy to follow, nor do I consider that the
logical bases of the method were sufficiently emphasized in the work cited.
Supposing then for the moment that every single one of our vertebral counts is a sample from
a strictly normal population of vertebral counts, we can take at random from the whole set of counts
any given number of counts $n_1'$, and calculate for the set the mean number $\bar{x}_1$, and the variance,

$$ s_1^2 = \frac{S(x_i - \bar{x}_1)^2}{n_1}, \quad \text{where } n_1 = n_1' - 1. $$

This variance, which we will call $s_1^2$, is an estimate of the variance $\sigma^2$
of the whole population, founded on the number of degrees of freedom $n_1$. The term degrees of
freedom means the number of ways in which the frequencies of the counts may be varied at will,
provided that given fixed relations between the data are adhered to. In calculating $s_1^2$ we have
used $\bar{x}_1$ as an estimate of the true mean, $\bar{x}_1$ being calculated from the data themselves. As $\bar{x}_1$, or
the sum of all the $n_1'$ $x$'s, is fixed, only $n_1' - 1$ of them may be varied at will, the last being fixed
by the sum of the $n_1' - 1$, whatever its value.
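The estimate just described may be sketched in modern terms as follows; the vertebral counts used here are invented purely for illustration:

```python
# A minimal sketch of the variance estimate s_1^2 described above.
# The counts below are hypothetical illustrative data, not from the paper.
counts = [56, 57, 56, 58, 57, 56, 57, 58, 56, 57]

n_prime = len(counts)                  # n'_1, the number of counts drawn
mean = sum(counts) / n_prime           # x-bar_1, estimated from the data themselves
n = n_prime - 1                        # n_1 = n'_1 - 1 degrees of freedom
s2 = sum((x - mean) ** 2 for x in counts) / n   # s_1^2, estimate of sigma^2

print(n, s2)
```

Dividing by $n_1' - 1$ rather than $n_1'$ reflects the single degree of freedom absorbed in estimating the mean from the data.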
We can take any other random sample of $n_r'$ counts and obtain another estimate $s_r^2$ of the
variance $\sigma^2$, based on $n_r$ degrees of freedom. We can also obtain $s^2$ from our total set of $N'$ counts,
and this is an estimate of $\sigma^2$ based on $N' - 1$ degrees of freedom. All these values of $s^2$ may be shown
to be efficient estimates of $\sigma^2$, provided that our total set of counts is normally distributed, and our
choice of subsamples is strictly random and not influenced in any way by the counts themselves.
Besides these groups of counts chosen singly, we may separate all the counts into any number,
$n_s'$, of sets, containing, say, $n_1', n_2', \ldots, n_r'$ counts, and find their means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_r$. Then,
if $\bar{x}$ be the grand mean of all the counts,

$$ \frac{n_1'(\bar{x}_1 - \bar{x})^2 + n_2'(\bar{x}_2 - \bar{x})^2 + \cdots + n_r'(\bar{x}_r - \bar{x})^2}{n_s' - 1}
\quad \text{or} \quad \frac{S\, n_p'(\bar{x}_p - \bar{x})^2}{n_s' - 1} $$

is an estimate of $\sigma^2$, based on $n_s$ degrees of freedom. We have here assumed that the individual
counts in each set are concentrated at their means, and used the weighted means instead of the individual
counts. This is perfectly allowable if the sets are random samples from a normal population.
Thus we may analyse the total sum of squares in many ways on the assumption of normal 
distribution, the various sums of squares being divided (within errors of random sampling)
proportionately to the degrees of freedom involved.
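The between-set estimate of $\sigma^2$ may likewise be sketched in code; the partition of the counts into three sets below is hypothetical and for illustration only:

```python
# Sketch of the between-set estimate of sigma^2: the weighted sum of
# squared deviations of set means from the grand mean, divided by the
# set degrees of freedom. The data and grouping are invented.
sets = [[56, 57, 56, 58], [57, 56, 57], [58, 56, 57]]

all_counts = [x for s in sets for x in s]
grand_mean = sum(all_counts) / len(all_counts)     # x-bar over all N' counts
set_means = [sum(s) / len(s) for s in sets]        # x-bar_1 ... x-bar_r

n_s = len(sets) - 1                                # n_s = n'_s - 1 degrees of freedom
between = sum(len(s) * (m - grand_mean) ** 2
              for s, m in zip(sets, set_means)) / n_s

print(n_s, between)
```

If the grouping is strictly random, this between-set quantity and the within-set variance both estimate the same $\sigma^2$; a marked disagreement between them is what the analysis is designed to detect.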
Fisher has shown that if

$$ z = \log_e \frac{s_m}{s_n} \quad \text{or} \quad \tfrac{1}{2}\left(\log_e s_m^2 - \log_e s_n^2\right), $$

where $s_m^2$ and $s_n^2$ are two estimates of $\sigma^2$ based on $m$ and $n$ degrees of freedom respectively, then
$z$ is distributed in a known manner. If the value of $z$, found from two estimates of $\sigma^2$, calculated
from data, lies too far away from the centre of this distribution, it may be reasonably concluded
that $s_m^2$ and $s_n^2$ are not estimates of the same variance $\sigma^2$, indicating that the samples from which
they were calculated were not randomly chosen but show an effect of the method of choice. This
is the fundamental principle of the Analysis of Variance, but some of its applications involve very
complex parceling of the data into groups, and groups within groups. In all these cases, the method
is used to test suspected effects of particular ways of parceling the data, causing differences between
some groups or means, etc., and others, allowance being made automatically for variance within
groups, i. e., for differences between the parcels which could arise by chance. The $z$-test offers
a sound criterion for judgment as to whether the suspected effects are real or not, without personal
opinions having the slightest influence on our judgment. The distribution of $z$ is a function of
$n_m$ and $n_n$, the degrees of freedom involved in the two estimates of $\sigma^2$ to be compared. Fisher
has tabulated (4) the 5% points and 1% points in the $z$-distribution for various values of $n_m$ and
$n_n$. If our value of $z$ lies outside the 5% points, the two variances $s_m^2$ and $s_n^2$ are considered signifi-
cantly different; if $z$ is outside the 1% point, the difference is considered doubly significant, as a
value of $z$ beyond our value would only occur by chance, if $s_m^2$ and $s_n^2$ are not really different, 1
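The computation of Fisher's $z$ from two variance estimates may be sketched as follows; the variance estimates and degrees of freedom here are invented, and the comparison to the modern variance-ratio $F$ is added for orientation, not taken from the paper:

```python
import math

# Fisher's z from two hypothetical estimates of sigma^2.
# z = (1/2)(log s_m^2 - log s_n^2) = log(s_m / s_n).
s2_m, n_m = 4.2, 5      # first estimate, with n_m degrees of freedom
s2_n, n_n = 1.3, 9      # second estimate, with n_n degrees of freedom

z = 0.5 * (math.log(s2_m) - math.log(s2_n))
print(z)

# In later usage e^{2z} is the variance ratio F = s_m^2 / s_n^2, referred
# to the F-distribution with (n_m, n_n) degrees of freedom; Fisher's
# tabulated 5% and 1% points of z correspond to the same test.
F = math.exp(2 * z)
print(F)
```

A value of $z$ (equivalently $F$) beyond the tabulated 5% or 1% point would then be taken, exactly as in the text, as evidence that the two estimates do not measure the same variance.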
