216 On the Probable Error of a Coefficient of Contingency 
These undetermined numbers are thus in general of the nature of weights and 
may be chosen in a variety of ways. The most important particular case is that 
of the population being grouped in a contingency table with, say, two variates. 
The $s$th division will be, say, the cell $(u, v)$, and if the corresponding number be taken to be […], 
$m_{u\cdot}$ and $m_{\cdot v}$ being as usual the marginal totals of the $u$th row and $v$th column of the 
population $M$, $\phi^2$ will be the mean square contingency*. Other cases will be 
discussed later. 
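
For orientation, the mean square contingency of a two-way table can be computed as in the sketch below. It assumes the standard definition, $\phi^2 = \sum_{u,v} n_{uv}^2 / (n_{u\cdot}\, n_{\cdot v}) - 1$, with the marginal totals taken from the table itself; the function name `mean_square_contingency` is illustrative and the paper's more general weighted form is not reproduced here.

```python
def mean_square_contingency(table):
    """Return phi^2 = sum(n_uv^2 / (n_u. * n_.v)) - 1 for a table of counts.

    Assumption: the marginal totals are those of the table passed in,
    which is the usual textbook definition, not the paper's general form.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = 0.0
    for u, row in enumerate(table):
        for v, n_uv in enumerate(row):
            if n_uv:  # empty cells contribute nothing to the sum
                total += n_uv * n_uv / (row_totals[u] * col_totals[v])
    return total - 1.0


independent = [[10, 20], [30, 60]]   # rows proportional: no contingency
diagonal = [[50, 0], [0, 50]]        # complete association in a 2 x 2 table
phi2_independent = mean_square_contingency(independent)
phi2_diagonal = mean_square_contingency(diagonal)
```

In the first table the rows are proportional, so the result vanishes up to rounding; in the second every individual lies on the diagonal and the result is 1.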
The object of the present paper is to investigate the variation of the quantity 
$\phi^2$ as determined from the samples of the population. We take the numbers A 
to be a property of the whole population and accordingly to have no variation as 
long as the size of the samples is constant. It is true that in most cases in practice 
there will be only one sample and that the values of the numbers A will have to 
be deduced from that sample and will therefore deviate from the values which 
would be used if the sampled population were known. But what we are seeking 
is the variability of the samples on the understanding that the distribution of 
the whole population is definite although in practice we know only the approximation 
to that distribution which is given by our sample. If we had wanted the 
variability of the calculated values of $\phi^2$ deduced from a large number of random 
samples, then we should have taken into account the variation of the A's as well 
as of the $n$'s. In this lies the difference between the discussions in the two earlier 
papers of 1906 and 1914. 
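
The distinction between holding the weights fixed and recalculating them from each sample can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the statistic to have the form $\sum_s w_s n_s^2 - 1$, takes the fixed weights to be the reciprocal products of the expected sample marginals, and draws samples with replacement. All function names are hypothetical.

```python
import random
from statistics import pvariance


def margin_weights(table):
    """Weights 1 / (row total * column total), one per cell (0 if a margin is empty)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return [[1.0 / (ru * cv) if ru and cv else 0.0 for cv in cols] for ru in rows]


def phi2_with_weights(table, weights):
    """Assumed form of the statistic: sum over cells of w * n^2, minus 1."""
    return sum(w * n * n
               for row, wrow in zip(table, weights)
               for n, w in zip(row, wrow)) - 1.0


def draw_sample(population, size, rng):
    """Draw `size` individuals with replacement (an assumption) from the population table."""
    cells = [(u, v) for u, row in enumerate(population) for v in range(len(row))]
    counts = [n for row in population for n in row]
    sample = [[0] * len(row) for row in population]
    for u, v in rng.choices(cells, weights=counts, k=size):
        sample[u][v] += 1
    return sample


def compare(population, size, trials, seed=0):
    """Return phi^2 values with fixed weights and with weights recalculated per sample."""
    rng = random.Random(seed)
    big_m = sum(map(sum, population))
    # Expected sample table: each population cell scaled by size / M.
    expected = [[size * m / big_m for m in row] for row in population]
    fixed = margin_weights(expected)
    fixed_vals, recalculated_vals = [], []
    for _ in range(trials):
        sample = draw_sample(population, size, rng)
        fixed_vals.append(phi2_with_weights(sample, fixed))
        recalculated_vals.append(phi2_with_weights(sample, margin_weights(sample)))
    return fixed_vals, recalculated_vals


def variance_summary(population, size, trials, seed=0):
    """Variances of the two sets of values, for side-by-side inspection."""
    fixed_vals, recalculated_vals = compare(population, size, trials, seed)
    return pvariance(fixed_vals), pvariance(recalculated_vals)
```

Calling `variance_summary([[40, 10], [10, 40]], size=100, trials=2000)` returns the two variances for a hypothetical population of 100 individuals; the figures depend on the seed and on the assumptions above and are not results from the paper.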
This investigation follows that of the second paper, but we shall here give the 
full expressions without approximation, i.e. without neglecting the square of $\delta n_s$ 
as was done in 1914. It will appear from the numerical examples worked out 
later that this squared term makes a fairly great difference and, even if this were 
not so, it is always preferable to have such formulae in full in order to decide the 
legitimacy of neglecting any terms. This is especially the case in statistical theory 
where neglect of the later terms of a Taylor expansion often leads to false results. 
(2) Mean Value of $\phi^2$. 
Let $\bar{\phi}^2$ be the mean value of $\phi^2$ and let $\bar{n}_s$ be the mean value of $n_s$, i.e. the 
value which would be given by taking a very large number of samples. Then we 
can write 
$$\frac{\bar{n}_s}{N} = \frac{m_s}{M}.$$
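
The statement that the mean cell count in samples of size $N$ is proportional to the population count can be checked by simulation. The sketch below is a minimal illustration, assuming sampling with replacement and a hypothetical population of three divisions; the function name `mean_cell_counts` is not from the paper.

```python
import random


def mean_cell_counts(population_counts, sample_size, trials, seed=0):
    """Average count per division over many random samples.

    Assumption: individuals are drawn with replacement, each division
    being chosen with probability proportional to its population count.
    """
    rng = random.Random(seed)
    k = len(population_counts)
    totals = [0] * k
    for _ in range(trials):
        for s in rng.choices(range(k), weights=population_counts, k=sample_size):
            totals[s] += 1
    return [t / trials for t in totals]


population_counts = [50, 30, 20]      # hypothetical m_s, so M = 100
sample_size = 40                      # N
means = mean_cell_counts(population_counts, sample_size, trials=2000)
# Values the means should approach: N * m_s / M for each division.
expected = [sample_size * m / sum(population_counts) for m in population_counts]
```

With a large number of trials the entries of `means` lie close to those of `expected`, the discrepancy shrinking as the number of samples grows.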
Also if we define $\delta\phi^2$ and $\delta n_s$ by the equations 
$$\phi^2 = \bar{\phi}^2 + \delta\phi^2, \qquad n_s = \bar{n}_s + \delta n_s,$$
we have 
* Drapers' Company Research Memoirs, Biometric Series, I. On the Theory of Contingency, etc., Cambridge 
University Press, 1904; Biometrika, Vol. v. p. 191, 1906; Biometrika, Vol. x. p. 570, 1914. 
