On the Poisson Law of Small Numbers
and standard-deviation $\sqrt{npq}$. These will, however, not be identical standard
deviations as $p$ is not truly unity. In ordinary practice, in testing for example the
30 in 1000 frequency, we should put the centre of our Gaussian at our 30 group,
and use a standard deviation $= \sqrt{30\,(1 - 30/1000)} = \sqrt{30 \times 0.97} = 5.39444$ to enter
the table of the probability integral. This is, of course, the Gaussian we obtain 
by the method of least squares, but to assume that it is " the best " is to argue in 
a circle, because we then take least squares as a test of what is best *. It is 
not the Gaussian which is directly reached by proceeding either to a limit of the 
Binomial or to the Exponential, for example, by applying Stirling's Theorem. It 
will be seen by examining Table II that the Gaussian curve develops out of the 
exponential by a mode at the point midway between the two equal terms, rather 
than by a mode at the mean, which coincides with the centre of the second of 
them. If we apply Stirling's Theorem to the term†
$$u_r = N\,\frac{n!}{(n-r)!\,r!}\,p^{\,n-r}\,q^{\,r}$$
of the binomial $N(p+q)^n$ it becomes
$$u_r = \frac{N}{\sqrt{2\pi}\,\sqrt{npq}}\; e^{-\frac{1}{2}\left\{r - nq + \frac{1}{2}(p-q)\right\}^2 / (npq)},$$
i.e. the ordinate of a Gaussian curve of Standard Deviation $\sqrt{npq}$ and mean at
$nq - \tfrac{1}{2}(p - q)$. These give for the Poisson-Exponential the Gaussian with
standard-deviation $\sqrt{m}$ and mean $m - \tfrac{1}{2}$. The above type of curve which gives frequencies
by coordinates and not by areas has been termed by Sheppard a ' spurious curve 
of frequency'; at the same time it is the method by which Laplace and Poisson 
first reached the normal curve, and the real point at issue is whether we shall get 
better approximations to the discontinuous frequencies of the binomials by using 
Gaussian ordinates than by using the areas of a Gaussian curve. At the same
time it has been shewn‡ that if a Gaussian curve gives a series of frequencies by
its areas, then if its standard-deviation be $\sigma$, a spurious Gaussian frequency curve
with standard deviation given by $\sigma_0^2 = \sigma^2 + \tfrac{1}{12}h^2$, $h$ being the sub-range, will closely
give the frequencies by its ordinates. It seems probable therefore that the
Gaussian curve with mean at $nq - \tfrac{1}{2}(p - q)$ and standard deviation $\sqrt{npq - \tfrac{1}{12}}$
will more closely represent the binomial for cell frequency variation by its areas,
* There is a further flaw in this treatment: the Gaussian is continuous, the Binomial and the
Poisson-Exponential are not. If $u_r$ be the $r$-th term of either of the latter series, we ought really to
make
[expression not recoverable from this copy]
a minimum by the conditions $du/dm = du/d\sigma = 0$. No complete solution of this problem has hitherto
been determined.
† The final form for $u_r$ may be obtained by neglecting the terms in [symbol illegible] in the formula given by
Pearson, Phil. Trans. Vol. 186, A, p. 347, footnote.
‡ Biometrika, Vol. III. p. 311.
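The point at issue on this page, whether the binomial terms are better approximated by Gaussian ordinates or by Gaussian areas, may be examined numerically for the 30 in 1000 case. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the cells are taken to be of unit width ($h = 1$, running from $r - \tfrac{1}{2}$ to $r + \tfrac{1}{2}$), and the "ordinary practice" curve is read as areas of a Gaussian centred at $nq$ with standard deviation $\sqrt{npq}$. It makes no claim to reproduce the computations of the original memoir.

```python
import math


def binomial_term(n, q, r):
    """r-th term of (p + q)^n, p = 1 - q: C(n, r) p^(n-r) q^r.

    Computed through log-gamma so that large n does not overflow.
    """
    p = 1.0 - q
    log_term = (math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)
                + (n - r) * math.log(p) + r * math.log(q))
    return math.exp(log_term)


def gaussian_ordinate(x, mean, sd):
    """Height of the Gaussian curve at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def gaussian_area(lo, hi, mean, sd):
    """Area of the Gaussian curve between lo and hi (probability integral)."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)


def compare(n=1000, q=0.03, h=1.0):
    """Largest absolute error of three Gaussian approximations to the
    binomial terms (total frequency N taken as 1)."""
    p = 1.0 - q
    sd = math.sqrt(n * p * q)                     # sqrt(npq)
    shifted_mean = n * q - 0.5 * (p - q)          # nq - (p - q)/2
    sd_area = math.sqrt(n * p * q - h * h / 12)   # sqrt(npq - h^2/12)
    worst = {"ordinary": 0.0, "ordinates": 0.0, "areas": 0.0}
    for r in range(n + 1):
        exact = binomial_term(n, q, r)
        approx = {
            # Assumed reading of "ordinary practice": areas, centre at nq.
            "ordinary": gaussian_area(r - h / 2, r + h / 2, n * q, sd),
            # Stirling form: ordinates, mean nq - (p - q)/2, s.d. sqrt(npq).
            "ordinates": gaussian_ordinate(r, shifted_mean, sd),
            # Suggested form: areas, same mean, s.d. sqrt(npq - h^2/12).
            "areas": gaussian_area(r - h / 2, r + h / 2, shifted_mean, sd_area),
        }
        for name, value in approx.items():
            worst[name] = max(worst[name], abs(value - exact))
    worst["sd"] = sd
    worst["shifted_mean"] = shifted_mean
    return worst


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name:>12}: {value:.6f}")
```

Run as a script, this prints the largest discrepancy of each curve from the binomial terms, together with the standard deviation $\sqrt{npq}$ and the shifted mean, so that the three approximations can be set side by side for any chosen $n$ and $q$.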
