698 APPENDIX 



However, the normal curves give, in general, at least a valuable first 

 approximation, and we shall follow the usual method of employing statis- 

 tical constants derived from this curve ; for these constants are significant, 

 even if the distribution is not normal. 1 



Area under the probability curve has an important meaning. If we select 

 unit area as explained in Section III, the area represents the total population, 

 and the area between the two ordinates, the curve, and the x-axis represents the

 number of variates between these ordinates. If we look upon the curve as 

 representing probabilities instead of frequencies our horizontal scale is 

 unchanged but our vertical scale must be multiplied by the total population.

 Thus, if the population is 800, as in the case of Fig. 1, we should say

 that what there represented unity should be multiplied by 800 in order that 

 it shall represent unit of probability. Then the entire area under the curve 

 will be unity, and the area between two ordinates, the curve, and the x-axis is

 simply the probability that a variate selected at random would lie within 

 this interval.
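
The relation between area and probability may be illustrated by a short computation. What follows is only a sketch, assuming a normal curve: the population of 800 is taken from the example above, while the mean of 50, the standard deviation of 10, and the function names are hypothetical and chosen merely for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under the normal probability curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def probability_between(a, b, mean, sd):
    """Area bounded by the curve, the x-axis, and the ordinates at a and b."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)

population = 800        # total population, as in the example above
mean, sd = 50.0, 10.0   # hypothetical values, assumed for illustration only

# Probability that a variate selected at random lies between 40 and 60.
p = probability_between(40.0, 60.0, mean, sd)

# The same area read as a frequency: the expected number of variates.
expected_count = population * p

# The entire area under the probability curve is unity.
total_area = probability_between(-math.inf, math.inf, mean, sd)

print(round(p, 4), round(expected_count, 1), total_area)
```

Multiplying the probability by the total population recovers the frequency reading of the same area, which is the change of vertical scale described above.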



SECTION VII PROBABLE ERROR AND STANDARD DEVIATION



If we have estimated the population of a city at 100,000 and have 

 good reason to think that the chances are even that this is correct within 

 1000, we give much more information by stating that the population is 

 100,000 ± 1000 than by giving merely the figures 100,000 and leaving the

 reader entirely in doubt as to the accuracy of the determination. 



In describing a frequency distribution the average gives absolutely no 

 idea as to whether deviations are large or small, nothing in regard to the 

 spread of the distribution. It is the object of the "standard deviation" to 

 be descriptive of this variability, and it is the object of the so-called 

 "probable error" to indicate what confidence is to be placed in statistical

 results. The use which has been made of both "standard deviation" and

 "probable error" makes it unnecessary to dwell longer on this point, but

 it is our purpose here to show how the formulas used in the text are derived. 
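
That the average alone tells nothing of the spread may be seen from a small numerical sketch. The two sets of variates below are hypothetical, and the standard deviation is here assumed to be the root-mean-square deviation from the mean, which is what Python's `statistics.pstdev` computes.

```python
import statistics

# Two hypothetical distributions with the same average but different spread.
narrow = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

mean_narrow = statistics.fmean(narrow)
mean_wide = statistics.fmean(wide)

# Root-mean-square deviation from the mean (population standard deviation).
sd_narrow = statistics.pstdev(narrow)
sd_wide = statistics.pstdev(wide)

print(mean_narrow, mean_wide)   # the averages agree
print(sd_narrow, sd_wide)       # the standard deviations do not
```

The averages are identical, and only the standard deviation distinguishes the closely clustered set from the widely scattered one.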



Probable error of a single variate. The probable error of a single variate

 of a population is defined as that departure from the mean, on either side, 

 within which exactly one half the variates are found. 
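
For a finite set of observed variates this definition may be applied directly, at least approximately. The sketch below takes the median of the absolute departures from the mean as the probable error. The data and the function name are hypothetical, and in a small sample "exactly one half" can be satisfied only when the number of variates permits it.

```python
import statistics

def empirical_probable_error(variates):
    """Median absolute departure from the mean of the variates."""
    mean = statistics.fmean(variates)
    departures = [abs(v - mean) for v in variates]
    return statistics.median(departures)

# Hypothetical variates, chosen only for illustration; their mean is 7.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 12, 13]

pe = empirical_probable_error(sample)

# Count the variates lying within the probable error of the mean.
mean = statistics.fmean(sample)
within = sum(1 for v in sample if abs(v - mean) <= pe)

print(pe, within, len(sample))
```

Here one half of the ten variates fall within the computed departure on either side of the mean, as the definition requires.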



By the use of the probability curve (Fig. 6) the probable error may 

 easily be explained geometrically when we look upon the entire area under 

 the curve as representing the total population. In Fig. 6 we draw two 

 ordinates, ST and S'T', equally distant from the mean, and such that one

 half of the entire area under the curve lies between them, in other words, 

 is bounded by the curve, the x-axis, ST, and S'T'. Then OS represents the

 probable error of a single variate. If we should use a single variate selected 

 at random to represent the population it is an even chance that that single 

 variate would be less or more than OS from the best value. 
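
The geometric construction may be imitated numerically. The sketch below assumes a normal curve and searches by bisection for the distance OS at which the area between the two ordinates is one half. The function names, the tolerance, and the search bracket are assumptions made for illustration and are not taken from the text.

```python
import math

def area_within(d, sd):
    """Area under a normal curve within distance d on either side of the mean."""
    return math.erf(d / (sd * math.sqrt(2.0)))

def probable_error(sd, tol=1e-12):
    """Distance OS from the mean such that half the area lies within it."""
    lo, hi = 0.0, 10.0 * sd   # the answer lies well inside this bracket
    while hi - lo > tol * sd:
        mid = 0.5 * (lo + hi)
        if area_within(mid, sd) < 0.5:
            lo = mid          # too little area: move the ordinates outward
        else:
            hi = mid          # too much area: move the ordinates inward
    return 0.5 * (lo + hi)

# Ratio of the probable error to the standard deviation for a normal curve.
ratio = probable_error(1.0)
print(round(ratio, 4))
```

For a normal curve the ratio so found is about 0.6745, the familiar factor connecting the probable error with the standard deviation.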



1 See Yule, Proceedings of the Royal Society, LX, 477-489. 



