MR. R. A. FISHER ON THE MATHEMATICAL 
population from which the sample is drawn, (2) how best to calculate from the sample 
estimates of these parameters, and (3) the exact form of the distribution, in different 
samples, of our derived statistics, then the theoretical aspect of the treatment of any 
particular body of data has been completely elucidated. 
As regards problems of specification, these are entirely a matter for the practical 
statistician, for those cases where the qualitative nature of the hypothetical population 
is known do not involve any problems of this type. In other cases we may know by 
experience what forms are likely to be suitable, and the adequacy of our choice may 
be tested a posteriori. We must confine ourselves to those forms which we know how 
to handle, or for which any tables which may be necessary have been constructed. 
More or less elaborate forms will be suitable according to the volume of the data. 
Evidently these are considerations the nature of which may change greatly during the 
work of a single generation. We may instance the development by Pearson of a very 
extensive system of skew curves, the elaboration of a method of calculating their 
parameters, and the preparation of the necessary tables, a body of work which has enormously 
extended the power of modern statistical practice, and which has been, by pertinacity 
and inspiration alike, practically the work of a single man. Nor is the introduction of 
the Pearsonian system of frequency curves the only contribution which their author has 
made to the solution of problems of specification: of even greater importance is the 
introduction of an objective criterion of goodness of fit. For empirical as the 
specification of the hypothetical population may be, this empiricism is cleared of its dangers if 
we can apply a rigorous and objective test of the adequacy with which the proposed 
population represents the whole of the available facts. Once a statistic, suitable for 
applying such a test, has been chosen, the exact form of its distribution in random 
samples must be investigated, in order that we may evaluate the probability that a 
worse fit should be obtained from a random sample of a population of the type 
considered. The possibility of developing complete and self-contained tests of goodness of 
fit deserves very careful consideration, since therein lies our justification for the free 
use which is made of empirical frequency formulae. Problems of distribution of great 
mathematical difficulty have to be faced in this direction. 
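
The probability "that a worse fit should be obtained from a random sample" can be illustrated numerically. The following is a minimal sketch only, not the method of the text: it assumes Pearson's statistic (the sum over classes of (observed - expected)^2 / expected) as the chosen criterion, and it estimates the probability by repeated random sampling rather than from the exact distribution. The function names, the die-throwing frequencies, and the number of trials are hypothetical, introduced purely for illustration.

```python
import random


def pearson_chi_square(observed, expected):
    """Pearson's criterion: sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def worse_fit_probability(observed, probs, trials=5000, seed=0):
    """Estimate, by simulation, the chance that a random sample from the
    hypothetical population fits at least as badly as the observed one."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    n = sum(observed)                  # sample size
    expected = [n * p for p in probs]  # expected class frequencies
    observed_stat = pearson_chi_square(observed, expected)
    classes = range(len(probs))
    worse = 0
    for _ in range(trials):
        # Draw one random sample of size n and tally it into classes.
        counts = [0] * len(probs)
        for k in rng.choices(classes, weights=probs, k=n):
            counts[k] += 1
        if pearson_chi_square(counts, expected) >= observed_stat:
            worse += 1
    return observed_stat, worse / trials


# Hypothetical data: 120 throws of a die supposed to be fair.
observed = [18, 23, 16, 21, 18, 24]
stat, p = worse_fit_probability(observed, [1 / 6] * 6)
print(f"chi-square = {stat:.3f}, estimated P(worse fit) = {p:.3f}")
```

A large estimated probability indicates that samples from the proposed population commonly fit worse than the one observed, so that the specification is not discredited; a very small one indicates the contrary. The simulation merely stands in for the exact distribution, the derivation of which is the problem of distribution referred to above.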
Although problems of estimation and of distribution may be studied separately, they 
are intimately related in the development of statistical methods. Logically, problems of 
distribution should have prior consideration, for the study of the random distribution of 
different suggested statistics, derived from samples of a given size, must guide us in the 
choice of which statistic it is most profitable to calculate. The fact is, however, that 
very little progress has been made in the study of the distribution of statistics derived 
from samples. In 1900 Pearson (15) gave the exact form of the distribution of χ², the 
Pearsonian test of goodness of fit, and in 1915 the same author published (18) a similar 
result of more general scope, valid when the observations are regarded as subject to 
linear constraints. By an easy adaptation (17) the tables of probability derived from 
this formula may be made available for the more numerous cases in which linear con- 
