MR. R. A. FISHER ON THE MATHEMATICAL 
the data is usually far greater than the number of facts sought, much of the information 
supplied by any actual sample is irrelevant. It is the object of the statistical processes 
employed in the reduction of data to exclude this irrelevant information, and to isolate 
the whole of the relevant information contained in the data. 
When we speak of the probability of a certain object fulfilling a certain condition, we 
imagine all such objects to be divided into two classes, according as they do or do not 
fulfil the condition. This is the only characteristic in them of which we take cognisance. 
For this reason probability is the most elementary of statistical concepts. It is a
parameter which specifies a simple dichotomy in an infinite hypothetical population, and it
represents neither more nor less than the frequency ratio which we imagine such a
population to exhibit. For example, when we say that the probability of throwing a 
five with a die is one-sixth, we must not be taken to mean that of any six throws with 
that die one and one only will necessarily be a five; or that of any six million
throws, exactly one million will be fives; but that of a hypothetical population of an
infinite number of throws, with the die in its original condition, exactly one-sixth will 
be fives. Our statement will not then contain any false assumption about the actual 
die, as that it will not wear out with continued use, or any notion of approximation, as 
in estimating the probability from a finite sample, although this notion may be logically 
developed once the meaning of probability is apprehended. 
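The distinction drawn above, between the frequency ratio of the hypothetical infinite population and the ratio observed in any finite set of throws, may be illustrated by a small simulation. What follows is only a sketch under stated assumptions: a pseudo-random generator stands in for a die "in its original condition", and the function name `five_frequency` and the seed are hypothetical choices made for illustration, not anything given in the text.

```python
import random


def five_frequency(n_throws, seed=0):
    """Observed proportion of fives in n_throws simulated throws.

    Assumption: rng.randint(1, 6) stands in for an ideal, unworn die.
    """
    rng = random.Random(seed)
    fives = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == 5)
    return fives / n_throws


# The probability 1/6 belongs to the hypothetical infinite population;
# each finite sample below yields only an observed frequency ratio.
for n in (6, 600, 60_000):
    print(n, five_frequency(n))
```

In general the observed ratios differ from one-sixth and merely tend to lie nearer to it as the number of throws grows; none of them is the probability itself.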
The concept of a discontinuous frequency distribution is merely an extension of that of 
a simple dichotomy, for though the number of classes into which the population is 
divided may be infinite, yet the frequency in each class bears a finite ratio to that of the 
whole population. In frequency curves, however, a second infinity is introduced. No 
finite sample has a frequency curve: a finite sample may be represented by a histogram,
or by a frequency polygon, which to the eye more and more resembles a curve, as the 
size of the sample is increased. To reach a true curve, not only would an infinite number 
of individuals have to be placed in each class, but the number of classes (arrays) into 
which the population is divided must be made infinite. Consequently, it should be 
clear that the concept of a frequency curve includes that of a hypothetical infinite 
population, distributed according to a mathematical law, represented by the curve. 
This law is specified by assigning to each element of the abscissa the corresponding 
element of probability. Thus, in the case of the normal distribution, the probability 
of an observation falling in the range dx, is 
$$\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-m)^2}{2\sigma^2}}\, dx,$$
in which expression x is the value of the variate, while m, the mean, and σ, the standard
deviation, are the two parameters by which the hypothetical population is specified. 
If a sample of n be taken from such a population, the data comprise n independent facts. 
The statistical process of the reduction of these data is designed to extract from them 
all relevant information respecting the values of m and σ, and to reject all other
information as irrelevant. 
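As a rough illustration of such a reduction, the sketch below evaluates the element of probability of the normal law and condenses a sample of n observations into two numbers. It rests on assumptions not fixed by the passage: the estimates used are the sample mean and the root-mean-square deviation about it (one common choice, adopted here only for illustration), the population values m = 10 and σ = 2 are arbitrary, and the names `normal_element` and `reduce_sample` are hypothetical.

```python
import math
import random


def normal_element(x, m, sigma, dx):
    """Element of probability for the range dx at x under the normal law."""
    density = math.exp(-((x - m) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )
    return density * dx


def reduce_sample(sample):
    """Reduce n observations to two numbers: estimates of m and sigma.

    Assumption: sample mean and root-mean-square deviation (divisor n).
    """
    n = len(sample)
    m_hat = sum(sample) / n
    sigma_hat = math.sqrt(sum((x - m_hat) ** 2 for x in sample) / n)
    return m_hat, sigma_hat


# Draw a sample from a population with assumed m = 10, sigma = 2.
rng = random.Random(1)
sample = [rng.gauss(10.0, 2.0) for _ in range(5000)]
m_hat, sigma_hat = reduce_sample(sample)
print(m_hat, sigma_hat)
```

The five thousand observations are here replaced by two quantities; the order in which the observations arose, for instance, plays no part in either of them.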
