M. Greenwood and J. D. C. White 
Recognising the force of this objection, it was decided to examine the effect of including
clumps in the count. To this end, another 2000 cells were counted on the slides already used
and clumped cells were included. Of the 2000 cells counted in this way 36, or 1.8 per cent.,
contained clumped bacteria. Now if we allow, as seems reasonable, that this latter enumeration 
was a fair sample of the population, then the frequencies in our 20,000 cells would be changed 
in the manner shown in the second column of Table I. We have distributed the clumped cells 
among the other frequencies above zero in accordance with the proportions of those frequencies 
in the original count, since the small number of clumped cells which were present in the 2000 
did not allow us to determine the true proportion of clumps, whether mainly of two, three, 
or more bacilli*. 
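
The proportional allotment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `redistribute_clumped` is hypothetical, the frequencies are invented for the example and are not those of Table I, and the sketch assumes that the 1.8 per cent. rate is simply applied to the total count, a detail the text does not spell out.

```python
def redistribute_clumped(freq, n_clumped):
    """Spread n_clumped cells over the classes above zero, in proportion
    to the frequencies those classes already have.

    freq maps bacilli-per-cell -> number of cells; class 0 is left unchanged.
    """
    nonzero_total = sum(n for k, n in freq.items() if k > 0)
    adjusted = {}
    for k, n in freq.items():
        # Each non-zero class receives a share proportional to its own count.
        share = n_clumped * n / nonzero_total if k > 0 else 0.0
        adjusted[k] = n + share
    return adjusted


# Hypothetical frequencies, for illustration only (NOT the values of Table I).
example = {0: 500, 1: 300, 2: 150, 3: 50}

# 36 clumped cells in 2000 is 1.8 per cent.; assumed here to apply to the total.
clumped_rate = 36 / 2000
n_clumped = round(clumped_rate * sum(example.values()))

adjusted = redistribute_clumped(example, n_clumped)
```

Because every class above zero is raised by the same factor, the relative shape of the distribution above zero is unaltered, which agrees with the remark that the correction is trifling.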
The alteration which is effected by this correction is so trifling, that our erroneous method of 
counting, if it be considered an erroneous method, can hardly have been the cause of the poor 
fit which resulted, and we have not thought ourselves justified in discarding the data originally 
collected. Some other source of the poor fit must be looked for. 
Other possible explanations are : 
(1) The existence of heterogeneity in the material dependent on the fact that cells from 
different parts of the films are not, or may not be, strictly comparable with each other. 
(2) An artificial heterogeneity dependent on the process of counting, other than that already 
discussed.
It has been noticed before that a large sample of material, even when adequately described 
by a frequency curve from the diagrammatic point of view, fails to satisfy the approved test. 
Indeed, many statisticians have been, as Pearson has remarked, far too easily satisfied with the 
test of mere inspection. 
Elderton writes : " I have found in applying the test, that when numbers dealt with are very 
large, the probability is often small, even though the curve appears to fit the statistics very 
closely. The explanation is that the statistics with which we deal in practice nearly always 
contain a certain amount of extraneous matter, and heterogeneity is concealed in a small 
experience by the roughness of the data. The increase in the number of cases observed removes
the roughness, but the heterogeneity remains. The meaning, from the curve-fitting point of 
view, is that the experience is really made up of more than one frequency curve, but a certain 
curve, approximating to the one calculated, predominates†."
It will have been noticed that the poverty of fit is mainly due to the cells containing 
one bacillus being in defect and those containing three in excess ; these two groups have added 
nearly 58 to the value of χ². Now the work of counting is excessively monotonous, and after
going through some hundreds, it seems impossible to escape an impression that a certain 
measurement, say 3 bacilli per cell, is modal. Hence a tendency will arise to place any doubtfuls 
within that particular class. If the count be limited to one or two thousands, the error so 
introduced may not appreciably affect the result, but it will do so if the data run to many 
thousands, since the same percentage deviation in a large as in a small experience has naturally
a much greater influence on the fit. 
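
The last point, that the same percentage deviation weighs more heavily in a large experience, may be illustrated by a short Python sketch. The proportions are hypothetical and chosen only for illustration; they are not taken from the counts of this paper. The helper names `chi_square` and `chi_square_for_sample` are likewise assumptions of the sketch.

```python
def chi_square(observed, expected):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_for_sample(n, expected_props, observed_props):
    """Chi-square when the same class proportions hold in a sample of size n."""
    expected = [n * p for p in expected_props]
    observed = [n * p for p in observed_props]
    return chi_square(observed, expected)


# Hypothetical proportions: two per cent. of the cells are shifted from the
# second class into the third, as a bias towards a supposed mode might do.
expected_props = [0.50, 0.30, 0.20]
observed_props = [0.50, 0.28, 0.22]

small = chi_square_for_sample(2_000, expected_props, observed_props)
large = chi_square_for_sample(20_000, expected_props, observed_props)
ratio = large / small
```

With the proportions held fixed the statistic is directly proportional to the number of cells, so a tenfold increase in the count gives a tenfold increase in χ², and a bias which passes unnoticed in two thousand cells may be decisive in twenty thousand.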
* The 36 cells which contained "clumps" were actually distributed as follows :
Number of   Number of   Number of   Number of   Number of   Number of
 Bacilli      Cells      Bacilli      Cells      Bacilli      Cells
    1           2           4           4           7           5
    2           6           5           8           8           1
    3           8           6           0           9           2
† Frequency Curves and Correlation, by W. Palin Elderton, p. 142.
Biometrika vii 
