APPENDIX 






familiar with the calculus, the derivation of this curve will be treated in
a footnote,¹ but a complete understanding of this footnote is not necessary
for reading what follows.



While the equation of this curve has been derived in many different
ways by arguments based on a few assumptions as to the nature and causes
of deviations from the mean, it must be granted that these assumptions are
of such a nature that experiment is an important test as to whether a
frequency distribution is of this type. The reader should be on his guard
against the mistake of supposing that deviations from the mean, in all
classes of measurements, follow closely this law of frequency.² In fact,
Pearson has found that in many cases frequency distributions obtained in
biological study cannot be so well fitted by the normal curve as by what he
calls generalized probability curves, which take "skewness" and limit of
range into account. These curves lead us into mathematical complications
which cannot be well treated here, but it may be remarked that he obtains
these curves from the point binomial $(p + q)^n$, where $p + q = 1$ but
$p \neq q$, and from a hypergeometric series.
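As a small illustration of how skewness enters when $p \neq q$, the following Python sketch expands the point binomial $(p + q)^n$ and computes a skewness from its terms. The function names `point_binomial` and `skewness` are hypothetical, and the third standardized moment used here is one common modern measure, assumed for illustration only; it is not necessarily the measure of skewness that Pearson adopts.

```python
from math import comb


def point_binomial(n, p):
    """Terms of the expansion (p + q)^n with q = 1 - p.

    Term r is C(n, r) * p^r * q^(n - r); the terms sum to 1.
    """
    q = 1.0 - p
    return [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]


def skewness(terms):
    """Third standardized moment of the distribution whose r-th term is terms[r].

    Assumption: this moment-based measure stands in for "skewness" here.
    """
    mean = sum(r * t for r, t in enumerate(terms))
    variance = sum((r - mean) ** 2 * t for r, t in enumerate(terms))
    third = sum((r - mean) ** 3 * t for r, t in enumerate(terms))
    return third / variance**1.5


if __name__ == "__main__":
    print(skewness(point_binomial(20, 0.5)))  # symmetric case, p = q
    print(skewness(point_binomial(20, 0.1)))  # skew case, p != q
```

With $p = q$ the computed skewness vanishes apart from rounding, while with $p \neq q$ it does not, which is the feature the normal curve cannot reproduce.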



¹ While Gauss, Laplace, Quetelet, Herschel, and other great mathematicians have derived
the equation of the normal curve, and all agree in the result, they differ widely as to
hypotheses upon which they base the derivations.



We present here a derivation based upon the hypothesis (see Pearson, Philosophical
Transactions, CLXXXVI, A, pp. 343-381) that the normal curve represents a function
$y = \phi(x)$ which has a certain slope condition obtained from the point binomial polygon
$(\tfrac{1}{2} + \tfrac{1}{2})^m$ (see Fig. 4). This slope condition may be stated as follows:



$$\frac{\text{slope of side}}{\text{mean ordinate of side}} = -\frac{2 \times \text{mean abscissa of side}}{2\sigma^2},$$



the $y$-axis being the axis of symmetry and the $\sigma$ being the same for all sides.
In calculus form, this condition would be



$$\frac{dy}{y\,dx} = -\frac{2x}{2\sigma^2}.$$



Integrating, $y = k e^{-x^2/(2\sigma^2)}$.
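The integration step can be written out as follows. This is a sketch of the usual separation-of-variables argument; the constant of integration $C$ is a symbol introduced here for illustration and does not appear in the original, and $k$ is taken to be $e^{C}$.

```latex
\begin{align*}
\frac{1}{y}\frac{dy}{dx} &= -\frac{2x}{2\sigma^2} = -\frac{x}{\sigma^2}, \\
\int \frac{dy}{y} &= -\frac{1}{\sigma^2}\int x\,dx, \\
\log_e y &= -\frac{x^2}{2\sigma^2} + C, \\
y &= e^{C}\, e^{-x^2/(2\sigma^2)} = k\, e^{-x^2/(2\sigma^2)}, \qquad k = e^{C}.
\end{align*}
```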



The constant $k$ can be determined by finding the total area under the curve and equating
this to the total population $n$ which the area represents. This gives

$$k = \frac{n}{\sigma\sqrt{2\pi}},$$

and the final form of the equation of the normal curve is

$$y = \frac{n}{\sigma\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)}, \qquad (1)$$

in which $\sigma$ will later be shown to be what we shall call the "standard deviation" and
$e = 2.718\ldots$, the base of Napierian logarithms.



If equation (1) is to give probabilities instead of frequencies, we must replace $n$ by 1 in
equation (1).
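The two claims of this footnote can be checked numerically. The Python sketch below takes the final form of the curve to be $y = \frac{n}{\sigma\sqrt{2\pi}} e^{-x^2/(2\sigma^2)}$ and estimates the area under it, which should come out close to $n$ (or to 1 in the probability form). It also tests the slope condition on the polygon of $(\tfrac{1}{2} + \tfrac{1}{2})^m$. The function names are hypothetical, and the check assumes unit spacing between ordinates and $\sigma^2 = (m + 1)/4$, a value chosen here because it makes the condition hold for the polygon; it is not stated in the text above.

```python
from math import comb, exp, pi, sqrt


def normal_frequency(x, n, sigma):
    """Frequency form: y = n / (sigma * sqrt(2 pi)) * exp(-x^2 / (2 sigma^2))."""
    return n / (sigma * sqrt(2.0 * pi)) * exp(-x * x / (2.0 * sigma * sigma))


def area_under_curve(n, sigma, half_width=10.0, steps=20000):
    """Trapezoidal estimate of the area from -half_width*sigma to +half_width*sigma."""
    a = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.5 * (normal_frequency(a, n, sigma) + normal_frequency(-a, n, sigma))
    total += sum(normal_frequency(a + i * h, n, sigma) for i in range(1, steps))
    return total * h


def slope_condition_gap(m):
    """Largest difference, over all sides of the polygon of (1/2 + 1/2)^m, between
    slope / mean ordinate and -2 * mean abscissa / (2 sigma^2).

    Assumptions: unit spacing between ordinates, abscissae measured from the
    axis of symmetry, and sigma^2 = (m + 1) / 4.
    """
    two_sigma_sq = (m + 1) / 2.0
    ordinates = [comb(m, r) / 2.0**m for r in range(m + 1)]
    worst = 0.0
    for r in range(m):
        slope = ordinates[r + 1] - ordinates[r]  # rise over a unit run
        mean_ordinate = 0.5 * (ordinates[r + 1] + ordinates[r])
        mean_abscissa = (r + 0.5) - m / 2.0
        lhs = slope / mean_ordinate
        rhs = -2.0 * mean_abscissa / two_sigma_sq
        worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    print(area_under_curve(n=1000.0, sigma=2.5))  # close to n
    print(area_under_curve(n=1.0, sigma=2.5))     # close to 1 (probability form)
    print(slope_condition_gap(m=10))              # close to 0
```

The trapezoidal rule is used only because it needs nothing beyond the standard library; any other quadrature would serve equally well.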



² For fulfillment of the normal law in nature, see Edgeworth, Statistical Journal, Jubilee
Number, 1885, p. 188.



