xxxii    Tables for Statisticians and Biometricians    [XII



Short provisional tables of P were given in the article referred to and were replaced in the following year by the present standard tables of Palin Elderton.



In using the test for goodness of fit, due regard should be paid to the conditions under which it is deduced. It is assumed that the frequencies form a normal system of variates. This is legitimate only when in the binomial $(p + q)^n$, q is not very small as compared with p. If q be not very small as compared with p, even for n finite, the binomial approaches closely to the normal curve. Accordingly in using the test it is desirable to club together small frequencies at the tails of curves or margins of surfaces. The difficulty becomes very obvious when theory can go by fractions, but observations only by units.
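The clubbing together of small tail frequencies may be sketched as follows. This is an illustration only, not a procedure laid down in the text: the function names, the frequencies, and the threshold of 5 for the smallest admissible theoretical frequency are all assumptions (the last being a common later rule of thumb).

```python
def pool_tails(observed, expected, min_expected=5.0):
    """Club together cells at each tail until both end cells have a
    theoretical frequency of at least `min_expected` (assumed threshold)."""
    obs = list(observed)
    exp = list(expected)
    # Lower tail: merge the first cell into its neighbour while it is too small.
    while len(exp) > 2 and exp[0] < min_expected:
        obs[1] += obs[0]
        exp[1] += exp[0]
        del obs[0]
        del exp[0]
    # Upper tail: merge the last cell into its neighbour while it is too small.
    while len(exp) > 2 and exp[-1] < min_expected:
        obs[-2] += obs[-1]
        exp[-2] += exp[-1]
        del obs[-1]
        del exp[-1]
    return obs, exp


def chi_square(observed, expected):
    """Sum of (observed - theoretical)^2 / theoretical over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies, invented purely for illustration.
observed = [0, 2, 9, 21, 30, 22, 11, 4, 1]
expected = [0.5, 3.0, 10.0, 21.0, 31.0, 21.0, 10.0, 3.0, 0.5]

pooled_obs, pooled_exp = pool_tails(observed, expected)
print(pooled_obs, pooled_exp)
print(chi_square(pooled_obs, pooled_exp))
```

Without the pooling, the end cells, where theory gives half a unit and observation can give only 0 or 1, would contribute to the sum in a way the normal approximation does not justify. The number of groups entering the tables must be counted after pooling.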



The theory can be extended to cover much ground in all sorts of sampling*. 



Illustration. The following data for observed frequencies of cephalic index in Bavarian crania and for corresponding frequencies of a fitted Gaussian curve have already been considered on p. xx. Test the goodness of fit.



* A word of caution must be given about a recent extension by Slutsky (see Journal of Royal Statistical Society, Vol. LXXVII. pp. 78-84) who has applied it to test the goodness of fit of regression curves. In such cases the means and standard deviations of each array should, I think, be deduced from the theoretical surface, and the method would then agree with that illustrated on pp. xxiv-xxvi, i.e. on the probability of a given complex of variates differing from the run of values of a given population significantly. Slutsky, after assuming that the observed frequencies and standard deviations of the arrays may replace the theoretical values, deduces his P from Elderton's Tables instead of from the incomplete normal moment tables. He finds for the fit of a straight regression line, used to predict the probable price of rye at Samara from the price a month previously, $\chi^2 = 22.2$, giving $P = .02$, a bad fit. Had he, however, used the theoretical standard deviation of an array, i.e. $\sigma\sqrt{1 - r^2}$, instead of the very irregular observed standard deviations of individual arrays, he would have found $\chi^2 = 8.84$, leading to $P = .64$, an excellent fit, which would probably have been still further improved by the use of theoretical total frequencies for the arrays, based, say, on a Gaussian distribution.
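The two probabilities quoted in the footnote above can be checked with a short sketch that evaluates P from $\chi^2$ by the usual finite series, in place of a table look-up. The function name is hypothetical, and the number of groups n' = 12 is an assumption: the text does not state it, and it is adopted here only because it yields values consistent with both quoted results.

```python
import math


def chi2_tail_probability(chi2, n_groups):
    """Probability P that chi-square exceeds `chi2` for n' = `n_groups`
    groups, i.e. n' - 1 degrees of freedom."""
    df = n_groups - 1
    if df < 1:
        raise ValueError("need at least two groups")
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even degrees of freedom: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd degrees of freedom: normal tail term plus a finite series in chi.
    chi = math.sqrt(chi2)
    total = 0.0
    term = chi
    for j in range(1, df, 2):  # j = 1, 3, ..., df - 2
        if j > 1:
            term *= chi2 / j   # builds chi^j / (1 * 3 * ... * j)
        total += term
    return math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# n' = 12 is an assumed value, not given in the text.
for chi2 in (22.2, 8.84):
    print(chi2, round(chi2_tail_probability(chi2, 12), 2))
```

Under that assumption the two values round to .02 and .64, agreeing with the figures in the footnote. With a different number of groups they would not, so the sketch checks the arithmetic only and leaves the question of method untouched.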



