FOUNDATIONS OF THEORETICAL STATISTICS. 
straints are imposed upon the hypothetical population by the means which we employ 
in its reconstruction. The distribution of the mean of samples of n from a normal 
population has long been known, but in 1908 “Student” (4) broke new ground by 
calculating the distribution of the ratio which the deviation of the mean from its 
population value bears to the standard deviation calculated from the sample. At the same 
time he gave the exact form of the distribution in samples of the standard deviation. 
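Student’s ratio may be sketched, in modern notation, by a small simulation; everything below (the sample size, the seed, and the helper name `t_ratio`) is an illustrative choice of this sketch, not part of the original, but it exhibits the heavier-tailed distribution that “Student” derived exactly.

```python
import math
import random

random.seed(1)
mu, sigma, n = 0.0, 1.0, 5  # illustrative normal population and sample size

def t_ratio(sample, mu):
    """Ratio of the mean's deviation from its population value to the
    standard error estimated from the sample itself."""
    k = len(sample)
    xbar = sum(sample) / k
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (k - 1))
    return (xbar - mu) / (s / math.sqrt(k))

# Draw many samples of n from the normal population and form the ratio.
ratios = [t_ratio([random.gauss(mu, sigma) for _ in range(n)], mu)
          for _ in range(20000)]

# Because the standard deviation is itself estimated from the sample, the
# ratio scatters more widely than a standard normal deviate would.
var = sum(t * t for t in ratios) / len(ratios)
print(var > 1.0)
```

The excess of this variance over unity is what distinguishes “Student’s” exact small-sample distribution from the normal curve that a large-sample approximation would give.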
In 1915 Fisher (5) published the curve of distribution of the correlation coefficient for 
the standard method of calculation, and in 1921 (6) he published the corresponding 
series of curves for intraclass correlations. The brevity of this list is emphasised by the 
absence of investigation of other important statistics, such as the regression coefficients, 
multiple correlations, and the correlation ratio. A formula for the probable error of any 
statistic is, of course, a practical necessity, if that statistic is to be of service: and in 
the majority of cases such formulae have been found, chiefly by the labours of Pearson 
and his school, by a first approximation, which describes the distribution with sufficient 
accuracy if the sample is sufficiently large. Problems of distribution, other than the 
distribution of statistics, used to be not uncommon as examination problems in 
probability, and the physical importance of problems of this type may be exemplified by the 
chemical laws of mass action, by the statistical mechanics of Gibbs, developed by 
Jeans in its application to the theory of gases, by the electron theory of Lorentz, and 
by Planck’s development of the theory of quanta, although in all these 
applications the methods employed have been, from the statistical point of view, relatively 
simple. 
The discussions of theoretical statistics may be regarded as alternating between 
problems of estimation and problems of distribution. In the first place a method of 
calculating one of the population parameters is devised from common-sense 
considerations: we next require to know its probable error, and therefore an approximate solution 
of the distribution, in samples, of the statistic calculated. It may then become apparent 
that other statistics may be used as estimates of the same parameter. When the 
probable errors of these statistics are compared, it is usually found that, in large samples, 
one particular method of calculation gives a result less subject to random errors than 
those given by other methods of calculation. Attacking the problem more thoroughly, 
and calculating the surface of distribution of any two statistics, we may find that the 
whole of the relevant information contained in one is contained in the other: or, in 
other words, that when once we know the other, knowledge of the first gives us no 
further information as to the value of the parameter. Finally it may be possible to 
prove, as in the case of the Mean Square Error, derived from a sample of normal 
population (7), that a particular statistic summarises the whole of the information relevant 
to the corresponding parameter, which the sample contains. In such a case the problem 
of estimation is completely solved. 
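The comparison of probable errors described above may be sketched with a small simulation; the particulars (sample size, number of trials, the factor √(π/2) that scales the mean absolute deviation into an estimate of σ) are standard textbook choices, not calculations from the paper. For a normal population, the root-mean-square estimate of the standard deviation proves less subject to random error than the scaled mean-deviation estimate:

```python
import math
import random

random.seed(2)
sigma, n, trials = 1.0, 20, 5000  # illustrative choices

def estimates(sample):
    """Two common-sense estimates of sigma from the same sample."""
    m = sum(sample) / len(sample)
    # Root-mean-square deviation from the sample mean.
    rms = math.sqrt(sum((x - m) ** 2 for x in sample) / len(sample))
    # Mean absolute deviation, scaled by sqrt(pi/2) so that, for a normal
    # population, it too estimates sigma.
    mad = sum(abs(x - m) for x in sample) / len(sample) * math.sqrt(math.pi / 2)
    return rms, mad

rms_vals, mad_vals = zip(*(estimates([random.gauss(0.0, sigma) for _ in range(n)])
                           for _ in range(trials)))

def variance(v):
    mean = sum(v) / len(v)
    return sum((x - mean) ** 2 for x in v) / len(v)

# The estimate built on the mean square deviation scatters less over
# repeated sampling than the one built on the mean absolute deviation.
print(variance(rms_vals) < variance(mad_vals))
```

The smaller scatter of the root-mean-square estimate is no accident: for the normal population it is the statistic that, in the language of the text, summarises the whole of the relevant information the sample contains.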
